Advertisement
Not a member of Pastebin yet?
Sign Up,
it unlocks many cool features!
- This program reads an export reference graph (i.e. a graph representing the
- runtime dependencies of a set of derivations) created by Nix and groups them
- in a way that is likely to match the grouping for other derivation sets with
- overlapping dependencies.
- This is used to determine which derivations to include in which layers of a
- container image.
- # Inputs
- * a graph of Nix runtime dependencies, generated via exportReferenceGraph
- * a file containing absolute popularity values of packages in the
- Nix package set (in the form of a direct reference count)
- * a maximum number of layers to allocate for the image (the "layer budget")
- # Algorithm
- It works by first creating a (directed) dependency tree:
- ```
- img (root node)
- │
- ├───> A ─────┐
- │ v
- ├───> B ───> E
- │ ^
- ├───> C ─────┘
- │ │
- │ v
- └───> D ───> F
- │
- └────> G
- ```
- Each node (i.e. package) is then visited to determine how important
- it is to separate this node into its own layer, specifically:
- 1. Is the node within a certain threshold percentile of absolute
- popularity within all of nixpkgs? (e.g. `glibc`, `openssl`)
- 2. Is the node's runtime closure above a threshold size? (e.g. 100MB)
- In either case, a bit is flipped for this node representing each
- condition and an edge to it is inserted directly from the image
- root, if it does not already exist.
- For the rest of the example we assume 'G' is above the threshold
- size and 'E' is popular.
- This tree is then transformed into a dominator tree:
- ```
- img
- │
- ├───> A
- ├───> B
- ├───> C
- ├───> E
- ├───> D ───> F
- └───> G
- ```
- Specifically this means that the paths to A, B, C, E, G, and D
- always pass through the root (i.e. are dominated by it), whilst F
- is dominated by D (all paths go through it).
- The top-level subtrees are considered as the initially selected
- layers.
- If the list of layers fits within the layer budget, it is returned.
- Otherwise layers are merged together in this order:
- * layers whose root meets neither condition above
- * layers whose root is popular
- * layers whose root is big
- * layers whose root meets both conditions
- # Threshold values
- Threshold values for the partitioning conditions mentioned above
- have not yet been determined, but we will make a good first guess
- based on gut feeling and proceed to measure their impact on cache
- hits/misses.
- # Example
- Using the logic described above as well as the example presented in
- the introduction, this program would create the following layer
- groupings (assuming no additional partitioning):
- Layer budget: 1
- Layers: { A, B, C, D, E, F, G }
- Layer budget: 2
- Layers: { G }, { A, B, C, D, E, F }
- Layer budget: 3
- Layers: { G }, { E }, { A, B, C, D, F }
- Layer budget: 4
- Layers: { G }, { E }, { D, F }, { A, B, C }
- ...
- Layer budget: 10
- Layers: { E }, { D, F }, { A }, { B }, { C }
Advertisement
Add Comment
Please, Sign In to add comment
Advertisement