Untitled

a guest
Jul 1st, 2018
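>>66575417
You're overcomplicating it. I tried to solve it in my head, and this is what I came up with. It should be good enough. I haven't tested this code, since I can't be bothered with the pregeneration, but it should work.
static[] is a big ass const char array (static is a reserved word in C, so real code has to call it something else, e.g. packed[]). It should look something like this:
"LATIN CAPITAL LETTER A\0\x95" "B\0\x95" "C\0" {...} "Z\0LEFT SQUARE BRACKET\0REVERSE SOLIDUS\0\x81IGHT SQUARE BRACKET\0CIRCUMFLEX ACCENT\0LOW LINE\0GRAVE ACCENT\0LATIN SMALL LETTER A\0\x93" "B\0\x93" "C\0" {...}
\x95 is 0x80 + 21, the length of the shared prefix "LATIN CAPITAL LETTER ", and \x93 is 0x80 + 19 for "LATIN SMALL LETTER ". The literal is split after those escapes because B and C are hex digits and would otherwise be swallowed by the \x escape.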
my notes:
[code]
every entry has a type: full or continue
full (first byte 0xxx xxxx): plain null-terminated string, copy it as is
continue (first byte 1xxx xxxx): n = first byte - bin(10000000); take the last string, overwrite from offset n with the bytes that follow, include null term
[/code]
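For the pregeneration step, here is a minimal sketch of how one entry could be packed according to the notes above. pack_name is a hypothetical helper, not something from the original post. It assumes names are non-empty ASCII, that the caller keeps track of the previous name, and that the caller appends one extra '\0' after the last entry to terminate the table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HIGH 0x80

/* Hypothetical helper: writes one packed entry for `name` to `out` and
   returns the number of bytes written. `prev` is the previous name, or
   NULL for the first entry. Assumes non-empty ASCII names. */
size_t pack_name(unsigned char* out, const char* prev, const char* name)
{
    size_t n = 0;
    assert(out != NULL && name != NULL && name[0] != '\0');
    if (prev != NULL)
    {
        /* length of the shared prefix, capped so HIGH + n fits in one byte */
        while (n < 0x7F && prev[n] != '\0' && prev[n] == name[n])
            n++;
    }
    size_t tail = strlen(name + n) + 1; /* remaining suffix plus null term */
    if (n > 0)
    {
        /* continue entry: marker byte, then only the part that differs */
        out[0] = (unsigned char)(HIGH + n);
        memcpy(out + 1, name + n, tail);
        return tail + 1;
    }
    /* full entry: the whole string, null term included */
    memcpy(out, name, tail);
    return tail;
}
```

Calling it once per name, in table order, and advancing out by the returned length each time should produce an array of the shape shown above.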
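code:
[code]
#include <stdlib.h>
#include <string.h>

#define HIGH 0x80
#define HARDCODED_VALUE (1024*1024)

// goes inside a function; "static" is a reserved word in C, so the packed table is "packed" here
// assumes packed_start points at the pregenerated array, which ends with an empty string
char* data_start = malloc(HARDCODED_VALUE);
char* data = data_start;
const char* prev = data_start; //last decoded string
const unsigned char* packed = (const unsigned char*)packed_start; //unsigned, or the HIGH test never fires
while (packed[0] != '\0') //empty string = end of table
{ if (packed[0] >= HIGH) //continue: keep the first n chars of the last string
  { unsigned char n = packed[0] - HIGH;
    memcpy(data, prev, n);
    strcpy(data+n, (const char*)packed+1);
  } else //full string
  { strcpy(data, (const char*)packed);
  }
  packed += strlen((const char*)packed)+1; //skip the null terminator too
  prev = data;
  data += strlen(data)+1;
}
*data++ = '\0'; //empty string marks the end of the decoded data
size_t len = data-data_start; //total decoded size, replace HARDCODED_VALUE with this
[/code]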
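To index, first transform the codepoint so it's contiguous (e.g. so it equals the line number in https://unicode.org/Public/UNIDATA/UnicodeData.txt, counting from 0), then walk that many strings forward through the decoded data:
[code]
#include <string.h>

// named index_name because glibc's <string.h> can already declare the legacy index()
const char* index_name(const char* data, unsigned int n)
{
while (n-- && *data) //stop early at the empty string that ends the data
data += strlen(data)+1; //+1 skips the null terminator
return data;
}
[/code]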
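The "make it contiguous" step isn't spelled out above, so here is one way it could be sketched: a binary search over a sorted array of the codepoints that have a name. contiguous_index and the assigned array are hypothetical and would have to be pregenerated alongside the string table. UnicodeData.txt also lists some large blocks as First/Last range pairs, which this sketch does not handle.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the position of `codepoint` in the sorted
   array `assigned` (length `count`), or -1 if it is not listed. */
long contiguous_index(const unsigned int* assigned, size_t count,
                      unsigned int codepoint)
{
    size_t lo = 0, hi = count;
    assert(assigned != NULL || count == 0);
    /* lower-bound binary search: find the first element >= codepoint */
    while (lo < hi)
    {
        size_t mid = lo + (hi - lo) / 2;
        if (assigned[mid] < codepoint)
            lo = mid + 1;
        else
            hi = mid;
    }
    if (lo < count && assigned[lo] == codepoint)
        return (long)lo;
    return -1;
}
```

A non-negative result can be passed straight to the lookup function above as n. The extra array costs memory, so whether it beats a handful of hardcoded range checks depends on how sparse the table is.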