BGBTech Image Codec 1 (Preliminary Spec)

Goal: Fast decoding for video maps.
Should be able to directly encode/decode DXTn images.

Basic format derived from BTJ-NBCES.

(Most) Multibyte values will be big-endian.

Note that a valid decoder need not accept every possible combination of features, only the specific combination it expects (this format is intended more for task-specific usage than as an interchange format).

Will reuse concepts and markers for tag-layers and component layers.


== General ==

Images may have multiple layers.

The origin point for images will be in the lower-left corner of the image canvas (0,0), with +X as right and +Y as up.

All textures within a given tag-layer will currently be required to use the same colorspace.

For video, this may also mean that all subsequent frames need to have the same image format as laid out in the initial frame.

For DXTn Block Pack and video, the prior frame will provide the initial contents of the sliding window for decoding the next frame.


=== Markers ===

<0xFF:BYTE> <marker:BYTE> <size:WORD> <data:BYTE[size-2]>

<0xFF:BYTE> <marker:BYTE> <0x0000:WORD>
	<size:DWORD> <data:BYTE[size-6]>

<0xFF:BYTE> <marker:BYTE> <0x0001:WORD>
	<size:QWORD> <data:BYTE[size-10]>
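
As a rough illustration of the three size forms above, the following Python sketch reads one marker header. It assumes the size fields are big-endian (per the general note on multibyte values) and that the size counts the size field(s) plus the still-escaped data bytes. The name parse_marker is a hypothetical helper, not part of the format.

```python
import struct

def parse_marker(buf, pos=0):
    """Return (marker, data_start, data_len, next_pos) for the marker at pos."""
    if buf[pos] != 0xFF:
        raise ValueError("expected 0xFF at start of marker")
    marker = buf[pos + 1]
    (size,) = struct.unpack_from(">H", buf, pos + 2)
    hdr = 2                      # bytes of size field(s) counted in size
    if size == 0:                # extended form: DWORD size follows
        (size,) = struct.unpack_from(">I", buf, pos + 4)
        hdr = 6
    elif size == 1:              # extended form: QWORD size follows
        (size,) = struct.unpack_from(">Q", buf, pos + 4)
        hdr = 10
    data_start = pos + 2 + hdr
    data_len = size - hdr
    return marker, data_start, data_len, data_start + data_len
```

The returned data range still contains 0xFF escapes, so it would need unescaping before further parsing.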

Marker:
	Escape:			0x00-0x0F	Levels 0-15
	EscapeChain:	0x10		Levels 16+
	Reserved:		0x11-0xBF
	ReservedJPEG:	0xC0-0xDF	JPEG Markers, Not Used
	APP0-APP15:		0xE0-0xEF	Application Markers
	FMT0-FMT13:		0xF0-0xFD	Markers, Must Understand
	COM:			0xFE		Comment Marker, Ignored

Data within a marker will be 0xFF escaped.
This will mean any literal 0xFF bytes will be replaced with an escape marker (initially 0x00, but for nested-structures, escape-levels are used).
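
A minimal sketch of the escaping for the outermost level only, assuming a literal 0xFF is written as the two bytes 0xFF 0x00. Nested escape levels are not covered here, and the function names are hypothetical.

```python
def escape_ff(data):
    """Replace each literal 0xFF with 0xFF 0x00 (escape level 0)."""
    return data.replace(b"\xff", b"\xff\x00")

def unescape_ff(data):
    """Undo level-0 escaping: 0xFF 0x00 becomes a literal 0xFF."""
    return data.replace(b"\xff\x00", b"\xff")
```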

Data within an unknown APPn marker should be silently ignored.
Data within an unknown FMTn marker should result in the image being rejected.
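
The marker ranges and the unknown-marker rules could be expressed roughly as follows. This is only a sketch: the category names and both function names are made up for illustration, and the behaviour for unknown reserved markers is left as "unspecified" because the text above does not define it.

```python
def classify_marker(m):
    """Map a marker byte to a category name following the table above."""
    if 0x00 <= m <= 0x0F:
        return "escape"
    if m == 0x10:
        return "escape_chain"
    if 0x11 <= m <= 0xBF:
        return "reserved"
    if 0xC0 <= m <= 0xDF:
        return "reserved_jpeg"
    if 0xE0 <= m <= 0xEF:
        return "app"
    if 0xF0 <= m <= 0xFD:
        return "fmt"
    if m == 0xFE:
        return "comment"
    raise ValueError("0xFF is not a marker byte")

def unknown_marker_action(m):
    """What a decoder does with a marker it does not understand."""
    kind = classify_marker(m)
    if kind in ("app", "comment"):
        return "ignore"          # silently skipped
    if kind == "fmt":
        return "reject"          # must-understand marker
    return "unspecified"
```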


APP11/APP12/FMT12/FMT13 markers may optionally be split into multiple parts via continuation markers, each of which will be a marker of matching type but with 0 for the FOURCC or tag.


=== APP11 / FMT11 ===

<tag: ASCIIZ> <args:ASCIIZ[]>

Note that APP11 markers are presently limited to ASCII data.
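
For illustration, an APP11 payload could be split into its tag and arguments like this. The sketch assumes the payload has already been unescaped and that every string, including the last, is NUL-terminated. parse_app11 is a hypothetical name.

```python
def parse_app11(payload):
    """Split an unescaped APP11/FMT11 payload into (tag, [args])."""
    text = payload.decode("ascii")      # APP11 is limited to ASCII data
    if text.endswith("\x00"):
        text = text[:-1]                # drop the final terminator
    tag, *args = text.split("\x00")
    return tag, args
```

For example, a component-layer marker for the RGBA base layer would yield the tag "CompLayer" with the single argument "RGBA".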


=== APP12 / FMT12 ===

<tag: FOURCC> <data:BYTE[]>


=== FMT13 ===

<tag: ASCIIZ> <data:BYTE[]>


== BTIC1 Wrapper ==

FMT13: "BTIC1"
FMT13: "BTIC1Z"

Will contain all data for the image.

The second (Z suffix) form will Deflate-encode the image data and will use a Zlib header, with method=8 for Deflate and method=9 for Deflate64.
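
A hedged sketch of unpacking a "BTIC1Z" payload with Python's standard zlib module. It assumes the method is the low nibble of the first header byte, as in an ordinary Zlib stream. The standard zlib module does not implement Deflate64, so method 9 is only detected here, not decoded. inflate_btic1z is a hypothetical name.

```python
import zlib

def inflate_btic1z(payload):
    """Decompress the (already unescaped) body of a BTIC1Z marker."""
    method = payload[0] & 0x0F          # compression method from the header
    if method == 8:
        return zlib.decompress(payload)
    if method == 9:
        raise NotImplementedError("Deflate64 needs a separate decoder")
    raise ValueError("unknown compression method %d" % method)
```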


=== Layers ===

APP11: "CompLayer", LayerName
	Denotes the start of a given Component Layer.
	This marker is directly followed by the relevant image data.

	Layer Names:
		"RGB": RGB Base Layer
		"RGBA": RGBA Base Layer
		"Alpha": Optional disjoint alpha image (RGB)
		"XYZ": XYZ Normals
		"XYZD": XYZ Normals+Depth
		"Depth": Depth image (Bump Map)
		"SpRGB": Specular RGB
		"SpRGBE": Specular RGB+Exponent
		"SpExp": Specular Exponent
		"LuRGB": Luma RGB
		"LuRGBE": Luma RGB+Exponent
		"LuExp": Luma Exponent

APP11: "TagLayer", LayerName
	Gives a named tag layer.
	This may be followed by 1 or more component-layers.


=== BTIC1 Image ===

Stores a single image or a mipmapped image.

Note that for mipmap images, the layers will be packed end-to-end.

FMT12: "LHDR" (Layer Header)
	<layerID:DWORD>		LayerID of Image
	<xorg:DWORD>		X Origin of Image
	<yorg:DWORD>		Y Origin of Image
	<xsize:DWORD>		X Size of Image
	<ysize:DWORD>		Y Size of Image
	<xcenter:DWORD>		X Center of Image
	<ycenter:DWORD>		Y Center of Image
	<flags:DWORD>		Layer Flags

Note that while it may seem redundant to give the image size twice, the two sizes mean different things. The layer header size will represent the image's size relative to the canvas, whereas the image header will encode the physically-encoded size (potentially padded up to a power-of-2).

The origin will indicate the position of the layer image (relative to its center) within the canvas.

The center will indicate the center of a layer image (in pixels) relative to its lower-left corner.

LayerID gives a layer ID for each image. This is required to be unique for all layer-images within a compound image, and is required to match that of the same layer (same tag-layer and component) within the base-frame.
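
The LHDR layout and the origin/center relationship might be read as in the sketch below. It assumes big-endian unsigned DWORD fields (signedness is not stated above) and reads "relative to its center" as meaning the image's center point is placed at the origin, so the lower-left corner lands at origin minus center. All names are hypothetical.

```python
import struct
from collections import namedtuple

LayerHeader = namedtuple(
    "LayerHeader",
    "layer_id xorg yorg xsize ysize xcenter ycenter flags")

def parse_lhdr(data):
    """Unpack the eight DWORD fields of an LHDR payload."""
    return LayerHeader(*struct.unpack_from(">8I", data, 0))

def layer_lower_left(hdr):
    """Canvas position of the layer image's lower-left corner."""
    return hdr.xorg - hdr.xcenter, hdr.yorg - hdr.ycenter
```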


FMT12: "THDR" (Image Header)
	<width:DWORD>		Image Width
	<height:DWORD>		Image Height
	<imgtype:WORD>		Image Type
	<mip_start:BYTE>	MipMap Level Start
	<mip_end:BYTE>		MipMap Level End
	<filtmode:BYTE>		Filter Modes (Depends on ImageType)
	<clrtype:BYTE>		Colorspace Type (Depends on ImageType)
	<pixtype:BYTE>		Pixel Type (Depends on ImageType)
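
A small sketch of reading the THDR fields, assuming big-endian values packed without padding (15 bytes in total). The names ImageHeader and parse_thdr are hypothetical.

```python
import struct
from collections import namedtuple

ImageHeader = namedtuple(
    "ImageHeader",
    "width height imgtype mip_start mip_end filtmode clrtype pixtype")

def parse_thdr(data):
    """Unpack a THDR payload: two DWORDs, one WORD, five BYTEs."""
    return ImageHeader(*struct.unpack_from(">IIHBBBBB", data, 0))
```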

FMT12: "TDAT" (Image Data)
	<data:BYTE[]>		Image Data

Image Types:
	0	RGBA	(Raw RGBA)
	1	RGB		(Raw RGB)
	2	-
	3	BGRA
	4	BGR
	5	YUVA	(Raw YUVA)
	6	YUV		(Raw YUV)
	7	Y		(Raw Luma)
	8	YA		(Raw Luma+Alpha)
	...
	16	BC1 / DXT1 (Opaque)
	17	BC2 / DXT3
	18	BC3 / DXT5
	19	BC4
	20	BC5
	21	BC6
	22	BC7
	23	BC1F / DXT1F (Fast)
	24	BC3F / DXT5F (Fast)
	25	BC1A / DXT1A (DXT1 + Alpha)
	26	DXT5_UVAY

Filter Modes:
	0	None (RGB / YUV / DXTn)
	1	Scanline Filtering (RGB / YUV)
	2	Simple Block Filtering (RGB / YUV)
	3	Block Pack (DXTn)

Clrtype:
	0	RGB(A)	(Normal)
	1	YCbCr	(YUV / UVAY)
	2	RCT		(YUV)
	3	BLCT1	(YUV / UVAY)
	4	BLCT2	(YUV / UVAY)

RCT:
	Y=(R+2G+B)/4
		Y=G+(B+R-2*G)/4
	U=B-G
	V=R-G

	G=Y-(U+V)/4
	B=G+U
	R=G+V
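
The RCT equations can be sketched with integer arithmetic as below. The sketch assumes the division by 4 rounds toward negative infinity (an arithmetic shift), which is what makes the transform exactly reversible. The function names are hypothetical.

```python
def rct_forward(r, g, b):
    """RGB -> (Y, U, V) using Y = G + (B + R - 2G) / 4."""
    u = b - g
    v = r - g
    y = g + ((u + v) >> 2)      # same as (R + 2G + B) >> 2
    return y, u, v

def rct_inverse(y, u, v):
    """(Y, U, V) -> RGB, undoing rct_forward exactly."""
    g = y - ((u + v) >> 2)
    return g + v, g, g + u
```

Note that U and V need one more bit of range than the input samples.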

BLCT1:
	Y=(R+2G+B)/4
		Y=G+(B+R-2*G)/4
	U=(B-G)/2 + DC
	V=(R-G)/2 + DC

	G=Y-(U+V-2DC)/2
	B=G+2*(U-DC)
	R=G+2*(V-DC)

BLCT2:
	Y=(R+2G+B)/4
		Y=G+(B+R-2*G)/4
		Y=(G+V)/2
	U=(B-R)/2 + DC
	V=(R+B)/2

	R=V-(U-DC)
	G=2Y-V
	B=V+(U-DC)
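
BLCT1 and BLCT2 can be sketched the same way. Unlike RCT they halve the chroma terms, so with integer math the round trip is only approximate for some inputs. The sketch assumes floor division and DC=128 for 8-bit samples; the text above does not fix a value for DC. The function names are hypothetical.

```python
DC = 128  # assumed chroma bias for 8-bit samples

def blct1_forward(r, g, b):
    y = (r + 2 * g + b) // 4
    return y, (b - g) // 2 + DC, (r - g) // 2 + DC

def blct1_inverse(y, u, v):
    g = y - (u + v - 2 * DC) // 2
    return g + 2 * (v - DC), g, g + 2 * (u - DC)

def blct2_forward(r, g, b):
    y = (r + 2 * g + b) // 4
    return y, (b - r) // 2 + DC, (r + b) // 2

def blct2_inverse(y, u, v):
    return v - (u - DC), 2 * y - v, v + (u - DC)
```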


=== Scanline and Block Filtering ===

Scanline filtering and block-filtering will store the filter bytes prior to the image data. Scanline filtering will apply the filter for a single scanline, whereas block-filtering will apply it to an 8x8 block of pixels.

Pixel Type:
	0	Default		(Default / Undefined)
	1	Byte		(Raw Byte, RGBA/YUVA)
	2	Short		(Raw Signed 16-bit, RGBA/YUVA)
	3	UShort		(Raw Unsigned 16-bit, RGBA/YUVA)
	4	ByteVL		(Byte, VLI-Packed)
	5	ShortVL		(16-Bit Signed Short, VLI-Packed)
	6	UShortVL	(16-Bit Unsigned Short, VLI-Packed)
	7	Float16VL	(16-Bit Float, VLI-Packed)

This is specific to RGB(A) and YUV(A) modes, N/A for DXTn.

Note that float16 data will be treated as if it were unsigned-short data.


Pixels:
	C A
	B x

Filters (Scanline or Block):
	0	None	(P=0)
	1	Left	(P=B)
	2	Up		(P=A)
	3	Average	(P=(A+B)/2)
	4	Paeth	(...)
	5	Linear	(P=A+B-C)

Block Only (Possible):
	16	Hadamard
	17	DCT
	18	RDCT

DC Coefficients will use Paeth prediction for block filtering.

Paeth:
	P0=A+B-C
	Pick P from A, B, and C, whichever is closest to P0.
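
The predictors in the filter table can be sketched as a single function, using the neighbor layout above (A is up, B is left, C is up-left). The tie-break order for Paeth (left, then up, then up-left) is borrowed from PNG and is an assumption here, as is the rounding of Average. The name predict is hypothetical.

```python
def predict(mode, a, b, c):
    """Return the predicted value P for filter modes 0..5."""
    if mode == 0:                # None
        return 0
    if mode == 1:                # Left
        return b
    if mode == 2:                # Up
        return a
    if mode == 3:                # Average
        return (a + b) // 2
    if mode == 4:                # Paeth: neighbor closest to A+B-C
        p0 = a + b - c
        return min((b, a, c), key=lambda n: abs(p0 - n))
    if mode == 5:                # Linear
        return a + b - c
    raise ValueError("unsupported filter mode %d" % mode)
```

A decoder would add the stored residual to P to recover each sample.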


=== DXTn ===

Filter for DXTn / BCn texture compression.
DXTn packs bits starting from the LSB.

DXTn stores a 4x4 block of pixels using colors interpolated from 2 pixel values.

Pixel Block:
	A B C D
	E F G H
	I J K L
	M N O P

Color: 5:6:5 (LE WORD)
	Red: Bits 11-15
	Green: Bits 5-10
	Blue: Bits 0-4

DXT1 / DXT5 RGB

Color0: Color
Color1: Color
pixels: BYTE[4] (2 bpp)
	DCBA
	HGFE
	LKJI
	PONM

Pixel Bits:
	0=Color0
	1=Color1
	2=	0.66*Color0 + 0.33*Color1 (DXT5)
		0.5*Color0 + 0.5*Color1(DXT1)
	3=	0.33*Color0 + 0.66*Color1 (DXT5)
		Transparent / Black (DXT1)
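
As an illustration of the color block layout, the sketch below decodes one 8-byte block into a 4x4 grid of RGB tuples. It follows the table above, using a dxt1 flag to choose the meaning of indices 2 and 3. Real DXT1 decoders usually make that choice per block by comparing Color0 with Color1, which is not modeled here. Transparent pixels are returned as black, and the names are hypothetical.

```python
import struct

def unpack_565(c):
    """Expand a 5:6:5 color word to 8-bit (R, G, B)."""
    r, g, b = (c >> 11) & 0x1F, (c >> 5) & 0x3F, c & 0x1F
    return (r << 3) | (r >> 2), (g << 2) | (g >> 4), (b << 3) | (b >> 2)

def decode_color_block(block, dxt1=False):
    """Decode Color0, Color1 and 16 two-bit indices into rows of RGB."""
    c0, c1 = struct.unpack_from("<HH", block, 0)   # colors are LE words
    p0, p1 = unpack_565(c0), unpack_565(c1)
    if dxt1:
        p2 = tuple((x + y) // 2 for x, y in zip(p0, p1))
        p3 = (0, 0, 0)                             # transparent / black
    else:
        p2 = tuple((2 * x + y) // 3 for x, y in zip(p0, p1))
        p3 = tuple((x + 2 * y) // 3 for x, y in zip(p0, p1))
    palette = (p0, p1, p2, p3)
    rows = []
    for y in range(4):
        bits = block[4 + y]                        # one byte per row
        # Leftmost pixel sits in the lowest two bits (LSB-first packing).
        rows.append([palette[(bits >> (2 * x)) & 3] for x in range(4)])
    return rows
```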

DXT5

Alpha0: BYTE
Alpha1: BYTE
alphas: BYTE[6] (3bpp)
Color0: Color
Color1: Color
pixels: BYTE[4]


DXT5 Alpha / BC4:
	Alpha0: BYTE
	Alpha1: BYTE
	alphas: BYTE[6] (3bpp for each pixel)

if(Alpha0<=Alpha1)
{
	0=Alpha0, 1=Alpha1;
	2-5=interpolated alphas;
	6=0, 7=255.
}else
{
	0=Alpha0, 1=Alpha1;
	2-7=interpolated alphas;
}
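
A sketch of the alpha block decode. The interpolation weights are the usual BC3/BC4 ones (steps of 1/5 or 1/7 between Alpha0 and Alpha1), which this document does not spell out, so treat them as an assumption. Indices are read LSB-first from the 6 index bytes. The names are hypothetical.

```python
def alpha_palette(a0, a1):
    """Build the 8-entry alpha palette for one block."""
    if a0 <= a1:
        mid = [((6 - i) * a0 + (i - 1) * a1) // 5 for i in range(2, 6)]
        return [a0, a1] + mid + [0, 255]
    mid = [((8 - i) * a0 + (i - 1) * a1) // 7 for i in range(2, 8)]
    return [a0, a1] + mid

def decode_alpha_block(block):
    """Decode an 8-byte alpha block into 16 alpha values (A..P order)."""
    palette = alpha_palette(block[0], block[1])
    bits = int.from_bytes(block[2:8], "little")    # 48 bits, 3 per pixel
    return [palette[(bits >> (3 * i)) & 7] for i in range(16)]
```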


=== DXT5 UVAY ===

DXT5 UVAY: YUVA Embedded in DXT5 Textures.

Colorspace:
	Y, U, V, A

"A" may encode either Alpha or a UV Scale factor.

Values between 0 and 127 encode Alpha (With Scale=1.0), and 128-255 encode Scale (With Alpha=1.0).

Note that A=128 means Scale=1.0, and A=255 means Scale=1.0/127.

	0-127: Alpha=2.0*B, Scale=1.0.
	128-255: Alpha=1.0, Scale=1.0-(2.0*B-1.0).

The use of a Scale allows more accurate reproduction of colors.

Assertion: For alpha-blending, reduced color precision is acceptable.


Within the DXT5 image, components are encoded in the order (U, V, A, Y), or essentially: R'=U, G'=V, B'=A, A'=Y.


=== VLI Pack ===

Images will be packed in terms of variable-length quantities.

Each block tag will be encoded in the form (byte):
0-127				Literal Value (0..127, -64..63).
128-191	X			Literal Value (128..16383, -8192..8191).
192-223	XX			Literal Value (16384..2097151, -1048576..1048575).
224-238 I			LZ/RLE Run (2-16 items, Index)
239		LI			LZ/RLE Run (Length, Index)
240		XXX			24-Bit Value
241		XXXX		32-Bit Value
242-246				Literal Blocks (2-6 Values)
247		L			Literal Blocks (L Values)
248-255				Reserved

Values may be interpreted as signed or unsigned.
Normal values will be signed, whereas indices and lengths will be unsigned.

Sign will be folded into the LSB following the pattern:
0, -1, 1, -2, 2, ...
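
The literal-value forms and the sign folding might be decoded as sketched below. It assumes the low bits of the tag byte are the most significant bits of the value and that the trailing bytes are big-endian, which matches the stated ranges but is not stated outright. Run and literal-block tags are not handled, and the names are hypothetical.

```python
def read_uvli(buf, pos):
    """Read one unsigned value; return (value, next_pos)."""
    tag = buf[pos]
    if tag < 128:
        return tag, pos + 1
    if tag < 192:                                   # 14-bit value
        return ((tag & 0x3F) << 8) | buf[pos + 1], pos + 2
    if tag < 224:                                   # 21-bit value
        low = int.from_bytes(buf[pos + 1:pos + 3], "big")
        return ((tag & 0x1F) << 16) | low, pos + 3
    if tag == 240:                                  # 24-bit value
        return int.from_bytes(buf[pos + 1:pos + 4], "big"), pos + 4
    if tag == 241:                                  # 32-bit value
        return int.from_bytes(buf[pos + 1:pos + 5], "big"), pos + 5
    raise ValueError("tag %d is not a literal value" % tag)

def unfold_sign(u):
    """Map 0, 1, 2, 3, 4, ... to 0, -1, 1, -2, 2, ..."""
    return (u >> 1) ^ -(u & 1)
```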


=== Block Pack ===

DXTn packed images.

Images will be packed in terms of 8-byte blocks. Formats using 16-byte blocks (such as DXT5) will instead store blocks as 2 planes.


Each block tag will be encoded in the form (byte):
0 <block:QWORD>		Literal Block.
1-127				Single byte block index.
128-191	X			Two byte block index (16384 blocks).
192-223	XX			Three byte block index (2097152 blocks).
224-238 I			LZ/RLE Run (2-16 blocks, Index)
239		LI			LZ/RLE Run (Length, Index)
240		XXX			24-Bit Index
241		XXXX		32-Bit Index
242-246				Literal Blocks (2-6 Blocks)
247		L			Literal Blocks (L Blocks)
248-255				Reserved

The block index will indicate how many blocks backwards to look for a matching block (1 will repeat the prior block).

Length/Index values will use the same organization as above, only limited to encoding numeric values.

0-127				0-127.
128-191	X			128-16383.
192-223	XX			16384-2097151.
240		XXX			24-Bit Index (0-16777215)
241		XXXX		32-Bit Index (0-4294967295)

Note that DXT5 images will be split into 2 block-planes, with the first encoding the alpha component, followed by the plane encoding the RGB components.
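
Putting the tag table together, a single block-plane could be decoded roughly as follows. This is a sketch under several assumptions: multi-byte indices are big-endian, a run copies blocks one at a time so that an index of 1 repeats the prior block, and the optional window argument stands in for the prior frame's blocks. The names are hypothetical.

```python
def read_index(buf, pos):
    """Read one unsigned Length/Index value; return (value, next_pos)."""
    tag = buf[pos]
    if tag < 128:
        return tag, pos + 1
    if tag < 192:
        return ((tag & 0x3F) << 8) | buf[pos + 1], pos + 2
    if tag < 224:
        low = int.from_bytes(buf[pos + 1:pos + 3], "big")
        return ((tag & 0x1F) << 16) | low, pos + 3
    if tag in (240, 241):
        n = tag - 237                               # 3 or 4 value bytes
        return int.from_bytes(buf[pos + 1:pos + 1 + n], "big"), pos + 1 + n
    raise ValueError("tag %d is not a numeric value" % tag)

def decode_block_pack(buf, nblocks, window=()):
    """Decode nblocks 8-byte blocks from buf; window holds prior blocks."""
    out = list(window)
    base = len(out)
    pos = 0

    def copy(index, count):
        if not 1 <= index <= len(out):
            raise ValueError("block index outside the sliding window")
        for _ in range(count):
            out.append(out[-index])                 # may overlap, RLE-style

    def literals(count):
        nonlocal pos
        for _ in range(count):
            out.append(buf[pos:pos + 8])
            pos += 8

    while len(out) - base < nblocks:
        tag = buf[pos]
        if tag == 0:                                # single literal block
            pos += 1
            literals(1)
        elif tag < 224 or tag in (240, 241):        # single block by index
            index, pos = read_index(buf, pos)
            copy(index, 1)
        elif tag <= 238:                            # run of 2..16 blocks
            index, pos = read_index(buf, pos + 1)
            copy(index, tag - 224 + 2)
        elif tag == 239:                            # run: length, index
            length, pos = read_index(buf, pos + 1)
            index, pos = read_index(buf, pos)
            copy(index, length)
        elif tag <= 246:                            # 2..6 literal blocks
            pos += 1
            literals(tag - 240)
        elif tag == 247:                            # L literal blocks
            count, pos = read_index(buf, pos + 1)
            literals(count)
        else:
            raise ValueError("reserved tag %d" % tag)
    return out[base:]
```

For a DXT5 image this would be run twice, once for the alpha plane and once for the RGB plane, and the two 8-byte halves recombined into 16-byte blocks.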