- GIT index format
- ================
- = The git index file has the following format
- All binary numbers are in network byte order. Version 2 is described
- here unless stated otherwise.
- - A 12-byte header consisting of
- 4-byte signature:
- The signature is { 'D', 'I', 'R', 'C' } (stands for "dircache")
- 4-byte version number:
- The currently supported versions are 2 and 3.
- 32-bit number of index entries.
- - A number of sorted index entries (see below).
- - Extensions
- Extensions are identified by signature. Optional extensions can
- be ignored if GIT does not understand them.
- GIT currently supports cached tree and resolve undo extensions.
- 4-byte extension signature. If the first byte is 'A'..'Z' the
- extension is optional and can be ignored.
- 32-bit size of the extension
- Extension data
- - 160-bit SHA-1 over the content of the index file before this
- checksum.
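The outer framing described above (header, extension records, trailing checksum) can be sketched in Python as follows. This is a minimal illustration, not Git's own code: the names `parse_header`, `iter_extensions` and `checksum_ok` are hypothetical, and the caller is assumed to already know the offset at which the index entries end.

```python
import hashlib
import struct

CHECKSUM_LEN = 20  # 160-bit SHA-1 trailer


def parse_header(data: bytes):
    """Return (version, entry_count) from the 12-byte header."""
    # ">" selects network (big-endian) byte order with no padding.
    signature, version, entry_count = struct.unpack_from(">4sII", data, 0)
    if signature != b"DIRC":
        raise ValueError("not an index file: bad signature")
    if version not in (2, 3):
        raise ValueError(f"unsupported index version {version}")
    return version, entry_count


def iter_extensions(data: bytes, offset: int):
    """Yield (signature, is_optional, payload) for each extension.

    `offset` is assumed to point just past the last index entry.
    """
    end = len(data) - CHECKSUM_LEN
    while offset + 8 <= end:
        signature, size = struct.unpack_from(">4sI", data, offset)
        if offset + 8 + size > end:
            raise ValueError("truncated extension")
        payload = data[offset + 8:offset + 8 + size]
        # A first byte in 'A'..'Z' marks the extension as optional.
        is_optional = b"A" <= signature[:1] <= b"Z"
        yield signature, is_optional, payload
        offset += 8 + size


def checksum_ok(data: bytes) -> bool:
    """Check the trailing SHA-1 against everything that precedes it."""
    return hashlib.sha1(data[:-CHECKSUM_LEN]).digest() == data[-CHECKSUM_LEN:]


# Hypothetical sample: zero entries followed by one TREE extension
# carrying three placeholder bytes (not a real cached tree payload).
body = (struct.pack(">4sII", b"DIRC", 2, 0)
        + struct.pack(">4sI", b"TREE", 3) + b"abc")
sample = body + hashlib.sha1(body).digest()
```

A reader that meets an unknown signature can skip it when `is_optional` is true; a reasonable choice for an unknown non-optional extension is to refuse the file.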
- == Index entry
- Index entries are sorted in ascending order on the name field,
- interpreted as a string of unsigned bytes (i.e. memcmp() order, no
- localization, no special casing of directory separator '/'). Entries
- with the same name are sorted by their stage field.
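The ordering rule can be illustrated with a short Python sketch. Python compares `bytes` objects as sequences of unsigned bytes, which matches the memcmp()-style order described above; the helper name `entry_sort_key` and the sample paths are made up for the example.

```python
def entry_sort_key(name: bytes, stage: int):
    """Sort by raw name bytes first, then by merge stage."""
    return (name, stage)


# Hypothetical (name, stage) pairs. '-' is 0x2D, '.' is 0x2E and
# '/' is 0x2F, so "a/b" sorts after both "a-b" and "a.b": the
# directory separator gets no special treatment.
entries = [(b"a/b", 0), (b"a-b", 0), (b"a.b", 2), (b"a.b", 1), (b"a", 0)]
ordered = sorted(entries, key=lambda e: entry_sort_key(*e))
```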
- 32-bit ctime seconds, the last time a file's metadata changed
- this is stat(2) data
- 32-bit ctime nanosecond fractions
- this is stat(2) data
- 32-bit mtime seconds, the last time a file's data changed
- this is stat(2) data
- 32-bit mtime nanosecond fractions
- this is stat(2) data
- 32-bit dev
- this is stat(2) data
- 32-bit ino
- this is stat(2) data
- 32-bit mode, split into (high to low bits)
- 4-bit object type
- valid values in binary are 1000 (regular file), 1010 (symbolic link)
- and 1110 (gitlink)
- 3-bit unused
- 9-bit Unix permission. Only 0755 and 0644 are valid for regular files.
- Symbolic links and gitlinks have value 0 in this field.
- 32-bit uid
- this is stat(2) data
- 32-bit gid
- this is stat(2) data
- 32-bit file size
- This is the on-disk size from stat(2), truncated to 32-bit.
- 160-bit SHA-1 for the represented object
- A 16-bit 'flags' field split into (high to low bits)
- 1-bit assume-valid flag
- 1-bit extended flag (must be zero in version 2)
- 2-bit stage (during merge)
- 12-bit name length if the length is less than 0xFFF; otherwise 0xFFF
- is stored in this field.
- (Version 3) A 16-bit field, only applicable if the "extended flag"
- above is 1, split into (high to low bits).
- 1-bit reserved for future use
- 1-bit skip-worktree flag (used by sparse checkout)
- 1-bit intent-to-add flag (used by "git add -N")
- 13-bit unused, must be zero
- Entry path name (variable length) relative to top level directory
- (without leading slash). '/' is used as path separator. The special
- path components ".", ".." and ".git" (without quotes) are disallowed.
- Trailing slash is also disallowed.
- The exact encoding is undefined, but the '.' and '/' characters
- are encoded in 7-bit ASCII and the encoding cannot contain a NUL
- byte (iow, this is a UNIX pathname).
- 1-8 NUL bytes as necessary to pad the entry to a multiple of eight bytes
- while keeping the name NUL-terminated.
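The field list above can be turned into a small decoder. The following Python sketch assumes the layout exactly as described (62 fixed bytes, an optional 16-bit extended field, the name, then NUL padding); `IndexEntry`, `FIXED` and `parse_entry` are hypothetical names, and error handling is omitted.

```python
import struct
from dataclasses import dataclass

# Ten 32-bit fields, a 20-byte SHA-1 and the 16-bit flags: 62 bytes.
FIXED = struct.Struct(">10I20sH")


@dataclass
class IndexEntry:
    ctime: tuple          # (seconds, nanoseconds)
    mtime: tuple          # (seconds, nanoseconds)
    dev: int
    ino: int
    object_type: int      # 0b1000 file, 0b1010 symlink, 0b1110 gitlink
    permissions: int      # low 9 bits of mode
    uid: int
    gid: int
    size: int
    sha1: bytes
    assume_valid: bool
    stage: int
    skip_worktree: bool
    intent_to_add: bool
    name: bytes


def parse_entry(data: bytes, offset: int):
    """Decode one entry; return (entry, offset_of_next_entry)."""
    (cs, cns, ms, mns, dev, ino, mode, uid, gid, size,
     sha1, flags) = FIXED.unpack_from(data, offset)
    pos = offset + FIXED.size

    skip_worktree = intent_to_add = False
    if flags & 0x4000:                    # extended flag (version 3)
        (ext,) = struct.unpack_from(">H", data, pos)
        pos += 2
        skip_worktree = bool(ext & 0x4000)
        intent_to_add = bool(ext & 0x2000)

    name_len = flags & 0x0FFF
    if name_len == 0x0FFF:                # length did not fit in 12 bits
        name_len = data.index(b"\x00", pos) - pos
    name = data[pos:pos + name_len]

    # Round up to a multiple of 8 with at least one NUL after the name.
    entry_len = (pos - offset + name_len + 8) & ~7

    entry = IndexEntry(
        ctime=(cs, cns), mtime=(ms, mns), dev=dev, ino=ino,
        object_type=(mode >> 12) & 0xF, permissions=mode & 0o777,
        uid=uid, gid=gid, size=size, sha1=sha1,
        assume_valid=bool(flags & 0x8000), stage=(flags >> 12) & 0x3,
        skip_worktree=skip_worktree, intent_to_add=intent_to_add,
        name=name,
    )
    return entry, offset + entry_len


# Hypothetical sample: a stage-2 regular file named "a.txt".
# 62 fixed bytes + 5 name bytes + 5 NUL bytes = 72 bytes.
raw = (FIXED.pack(1, 2, 3, 4, 5, 6, 0o100644, 7, 8, 9,
                  b"\x11" * 20, (2 << 12) | 5)
       + b"a.txt" + b"\x00" * 5)
entry, next_offset = parse_entry(raw, 0)
```

Calling `parse_entry` repeatedly, feeding each returned offset back in, walks the whole entry table for as many entries as the header announces.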
- == Extensions
- === Cached tree
- Cached tree extension contains pre-computed hashes for trees that can
- be derived from the index. It helps speed up tree object generation
- from index for a new commit.
- When a path is updated in index, the path must be invalidated and
- removed from tree cache.
- The signature for this extension is { 'T', 'R', 'E', 'E' }.
- A series of entries fills the entire extension; each entry
- consists of:
- - NUL-terminated path component (relative to its parent directory);
- - ASCII decimal number of entries in the index that are covered by the
- tree this entry represents (entry_count);
- - A space (ASCII 32);
- - ASCII decimal number that represents the number of subtrees this
- tree has;
- - A newline (ASCII 10); and
- - 160-bit object name for the object that would result from writing
- this span of index as a tree.
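- An entry can be in an invalidated state, which is represented by having -1
- in the entry_count field. In that case no object name is stored and the
- next entry starts immediately after the newline.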
- The entries are written out in the top-down, depth-first order. The
- first entry represents the root level of the repository, followed by the
- first subtree---let's call this A---of the root level (with its name
- relative to the root level), followed by the first subtree of A (with
- its name relative to A), ...
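A sketch of reading the cached tree payload in Python follows. It returns the entries as a flat list in their on-disk (depth-first) order and does not rebuild the hierarchy. The name `parse_cache_tree` is hypothetical, and the code assumes that an invalidated entry (negative entry_count) carries no object name, which matches Git's behaviour.

```python
def parse_cache_tree(payload: bytes):
    """Return [(path, entry_count, subtree_count, object_name_or_None)]."""
    entries = []
    pos = 0
    while pos < len(payload):
        # NUL-terminated path component (empty for the root entry).
        nul = payload.index(b"\x00", pos)
        path = payload[pos:nul]

        # "<entry_count> <subtree_count>\n" in ASCII decimal.
        space = payload.index(b" ", nul + 1)
        newline = payload.index(b"\n", space + 1)
        entry_count = int(payload[nul + 1:space].decode("ascii"))
        subtree_count = int(payload[space + 1:newline].decode("ascii"))
        pos = newline + 1

        # Assumption: only valid entries are followed by a 20-byte name.
        object_name = None
        if entry_count >= 0:
            object_name = payload[pos:pos + 20]
            pos += 20

        entries.append((path, entry_count, subtree_count, object_name))
    return entries


# Hypothetical payload: a valid root covering 3 entries with one
# subtree, followed by an invalidated subtree named "sub".
tree_payload = b"\x00" + b"3 1\n" + b"\xaa" * 20 + b"sub\x00" + b"-1 0\n"
tree_entries = parse_cache_tree(tree_payload)
```

Because each entry states how many subtrees it has, a recursive reader can rebuild the tree by consuming that many child entries after each parent.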
- === Resolve undo
- A conflict is represented in the index as a set of higher stage entries.
- When a conflict is resolved (e.g. with "git add path"), these higher
- stage entries are removed and a stage-0 entry with the proper resolution
- is added.
- When these higher stage entries are removed, they are saved in the
- resolve undo extension, so that conflicts can be recreated (e.g. with
- "git checkout -m"), in case users want to redo a conflict resolution
- from scratch.
- The signature for this extension is { 'R', 'E', 'U', 'C' }.
- A series of entries fills the entire extension; each entry
- consists of:
- - NUL-terminated pathname the entry describes (relative to the root of
- the repository, i.e. full pathname);
- - Three NUL-terminated ASCII octal numbers, entry mode of entries in
- stage 1 to 3 (a missing stage is represented by "0" in this field);
- and
- - At most three 160-bit object names of the entry in stages from 1 to 3
- (nothing is written for a missing stage).
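The resolve undo records can be decoded along the same lines. This Python sketch follows the layout listed above; `parse_resolve_undo` is a hypothetical name and the sample object names are placeholder bytes.

```python
def parse_resolve_undo(payload: bytes):
    """Return [(path, {stage: (mode, object_name)})] for each record."""
    records = []
    pos = 0
    while pos < len(payload):
        # Full pathname, NUL-terminated.
        nul = payload.index(b"\x00", pos)
        path = payload[pos:nul]
        pos = nul + 1

        # Three NUL-terminated ASCII octal modes for stages 1 to 3.
        modes = []
        for _ in range(3):
            nul = payload.index(b"\x00", pos)
            modes.append(int(payload[pos:nul].decode("ascii"), 8))
            pos = nul + 1

        # One 20-byte object name per stage whose mode is non-zero.
        stages = {}
        for stage, mode in enumerate(modes, start=1):
            if mode != 0:
                stages[stage] = (mode, payload[pos:pos + 20])
                pos += 20

        records.append((path, stages))
    return records


# Hypothetical record: stages 1 and 3 present, stage 2 missing.
undo_payload = (b"f.txt\x00" + b"100644\x00" + b"0\x00" + b"100755\x00"
                + b"\x01" * 20 + b"\x03" * 20)
undo_records = parse_resolve_undo(undo_payload)
```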