Advertisement
Not a member of Pastebin yet?
Sign Up,
it unlocks many cool features!
- Rice+STF LZ77
- * Aim: Rice+STF LZ77 Codec
- * Optimize for low-cost implementation.
- ** Decoder should be relatively simple with low memory requirements.
- * Aim for higher compression (at lower speeds) than LZ4 or RP2.
- === AdRice ===
- Will use an LSB-first bitstream.
- Will use a prefix with zero or more zero bits terminated by a 1 bit.
- The Q prefix will range from 0-7, a prefix with 8 zero bits will encode an escape:
- * 8 bits: 0x00
- * 8 bits: STF Index
- In other cases, the Q prefix will be followed by a K bit suffix (P). The value will be decoded as (Q<<K)|P;
- K will be updated based on Q:
- * 0: K-- if K>0
- * 1: Leave as-is
- * 2..7: K++ (Implicitly clammped to 8).
- The AdRice contexts will have a starting K of 4.
- === STF ===
- This will be a positional entropy transform.
- It aims to order symbols (roughly) in the order of their priority.
- The native implementation is a 256 byte array, with an 8-bit position rover.
- An index (I) is used to fetch a value from the circular array (mod-256), and then the array is updated based on the index.
- The initial state for the array will encode values from 00 to FF in order, with an initial rover value of 00.
- Will use slightly different behaviors based on the index (I):
- * 0: Do Nothing
- * 1..31: Swap symbols at I with I-1
- * Else:
- ** Swap symbols at I with 255;
- ** Rotate window by 1 position (so 255 is now at 0).
- Nominally, STF indices will be encoded using Rice, in a form known as STFRice.
- This may be denoted with a context given as an argument, so "STFRice(Lit)" may mean a STF+Rice symbol encoded in the "Literals" context. Each STFRice context will have its own array, rover, and K value.
- === VLN ===
- VLN Values may also be encoded using AdRice:
- * 00..1F: 0.. 31, 0
- * 20..2F: 32.. 63, 1
- * 30..3F: 64.. 127, 2
- * 40..4F: 128.. 255, 3
- * 50..5F: 256.. 511, 4
- * 60..6F: 512.. 1023, 5
- * 70..7F: 1024.. 2047, 6
- * 80..8F: 2048.. 4095, 7
- * 90..9F: 4096.. 8191, 8
- * A0..AF: 8192..16383, 9
- * B0..BF: 16384..32767, 10
- * C0..CF: 32768..65535, 11
- === LZ Stream ===
- Will consist of a series of Tag bytes giving a literal byte count and match length.
- STFRice(Tag):
- * High 4 bits: Literal Bytes (0-14)
- * Low 4 bits: Match Length (3-17)
- If there are 15 or more literal bytes, then the tag will encode 15, and will be followed by a VLN encoding the actual length, VLN(Len).
- A similar encoding will be used for matches longer than 17 bytes, which will also be encoded as VLN(Len). Otherwise, a separate VLN is not used.
- The match length will be followed by the distance, encoded as VLN(Dist).
- After the distance, a series of literal bytes may be encoded as STFRice(Lit).
Advertisement
Add Comment
Please, Sign In to add comment
Advertisement