New Image "Layer" #922
But
Thanks, my matrix ID is |
Is there a manifest definition for using these blobs? I'm curious if it's one manifest per blob, or if you are taking them as a group.
NixOS sets See https://github.com/NixOS/nixpkgs/blob/master/pkgs/tools/compression/gzip/default.nix#L41= NixOS/nixpkgs#86348
Nix itself already supports gzip, brotli, bzip2 and xz, so that shouldn't be a big issue.
Each store path entry has dependencies, which themselves have other dependencies. You can get this information either by parsing the drv files or by querying the CLI.
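To make the "dependencies of dependencies" point concrete, here is a minimal sketch of computing the full closure of a store path. The reference graph and the shortened store path names are made up for illustration; in practice the edges would come from the drv files or the CLI as described above.

```python
from collections import deque


def closure(roots, references):
    """Return every store path reachable from `roots` via `references`."""
    seen = set()
    queue = deque(roots)
    while queue:
        path = queue.popleft()
        if path in seen:
            continue
        seen.add(path)
        # Paths without an entry are treated as having no references.
        queue.extend(references.get(path, ()))
    return seen


# Hypothetical reference graph with shortened hashes (illustration only).
REFERENCES = {
    "/nix/store/aaaa-app": ["/nix/store/bbbb-libfoo", "/nix/store/cccc-glibc"],
    "/nix/store/bbbb-libfoo": ["/nix/store/cccc-glibc"],
    "/nix/store/cccc-glibc": [],
}

if __name__ == "__main__":
    for path in sorted(closure(["/nix/store/aaaa-app"], REFERENCES)):
        print(path)
```

Every path in the resulting set would need to be present for the root path to work, which is why the number of blobs per image can grow quickly.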
Where I was going with the question: pushing a blob to a registry on its own will result in the blob being deleted on the next garbage collection. There needs to be an OCI image manifest that points to the blob(s), and a tag pointing to that manifest. There's not even a media type on the blob push; that gets set in a descriptor in a manifest.
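As a rough sketch of what "a manifest that points to the blob(s)" means, the snippet below builds an OCI image manifest whose descriptors carry the media type, digest and size of each blob. The manifest and config media types are the standard OCI ones; the NAR media type and the blob contents are placeholders invented for this example, not anything defined by a spec.

```python
import hashlib
import json

# Hypothetical media type for a NAR blob; not defined by any spec.
NAR_MEDIA_TYPE = "application/vnd.example.nix.nar.v1"


def descriptor(media_type, blob):
    """Build an OCI content descriptor for an in-memory blob."""
    return {
        "mediaType": media_type,
        "digest": "sha256:" + hashlib.sha256(blob).hexdigest(),
        "size": len(blob),
    }


def build_manifest(config_blob, layer_blobs):
    """Build an image manifest referencing the config and every layer blob."""
    return {
        "schemaVersion": 2,
        "mediaType": "application/vnd.oci.image.manifest.v1+json",
        "config": descriptor("application/vnd.oci.image.config.v1+json", config_blob),
        "layers": [descriptor(NAR_MEDIA_TYPE, blob) for blob in layer_blobs],
    }


if __name__ == "__main__":
    manifest = build_manifest(b"{}", [b"placeholder nar one", b"placeholder nar two"])
    print(json.dumps(manifest, indent=2))
```

A tag then points at the digest of this manifest, and the registry keeps every blob the manifest references.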
Ah, got ya! In fact, we're prototyping a Nix cache implementation that is also planned to be OCI-distribution compliant (read only) over at https://github.com/input-output-hk/spongix. I'm unfortunately not very familiar with the OCI terminology yet. Hence, I was wondering if I need to start out defining In principle, it's pretty straightforward: each store path should be fetched in parallel, so I guess they would be different "layers". But this is exactly where I need guidance.
Okay, that makes sense. I was thinking you wanted to push these to any registry supporting distribution-spec. Making your own registry changes that completely. 👍
Yes, but this requires a specific version of GNU gzip (and maybe the specific host CPU and I'd like to see a "formal" specification that is sufficient for implementing a reproducible compressor from scratch, without peeking into the source code of GNU gzip. If that is too difficult, we should probably document the expected version of GNU gzip clearly.
Do any of them have a "formal" specification?
Why is reproducible compression of importance? Compression is only used for transit and storage and can be turned into the real (and reproducible!) data at any point. Compression is an implementation detail; a solution that compresses its data should behave the same as one that doesn't. The former just achieves its goal more efficiently.
Unfortunately, the CAS design of registries is based on the data in transit (typically compressed). We've looked at treating compression as a storage and transit detail and tracking the digest of the uncompressed data instead, but that's currently an unsolved problem.
How do other layer types handle compression? Do they already use a reproducible compressor we could adopt for NARs as well?
Other layer types don't handle reproducibility. You can have two different CAS entries for effectively the same content, but with different timestamps, compression artifacts, etc.
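A small illustration of the point about compression artifacts, using Python's standard gzip module: the same payload compressed with two different header timestamps yields two different blob digests, even though the uncompressed content is identical. The payload and the timestamps are arbitrary values chosen for the example.

```python
import gzip
import hashlib


def digest(blob: bytes) -> str:
    """Digest in the form a registry uses to address a blob."""
    return "sha256:" + hashlib.sha256(blob).hexdigest()


def compress_with_mtime(payload: bytes, mtime: int) -> bytes:
    """Gzip `payload`, writing `mtime` into the gzip header."""
    return gzip.compress(payload, mtime=mtime)


if __name__ == "__main__":
    payload = b"identical layer content\n" * 64
    blob_a = compress_with_mtime(payload, mtime=1)
    blob_b = compress_with_mtime(payload, mtime=2)

    # Two distinct CAS entries on the compressed bytes ...
    print(digest(blob_a))
    print(digest(blob_b))
    # ... for one and the same uncompressed content.
    print(digest(gzip.decompress(blob_a)) == digest(gzip.decompress(blob_b)))
```

Pinning the timestamp removes this particular difference, but it does not by itself guarantee identical output across compressor implementations or versions.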
I think there might be a misunderstanding here. While we Nix people do strive for 100% reproducibility, output paths ( CA-derivations are coming as an experimental feature but they won't be the MO for a long time to come or even all purposes. In Nix, you can therefore also have many different CAS entries for a given output path but you generally don't care which one you get; any one of them is fine. |
@sudo-bmitch @AkihiroSuda I want to get my hands dirty, next. Where should I start? 😄 -- maybe to clear up the clouds, we should have a quick call, even? |
Can I get a slack invitation to opencontainers.slack.com for [email protected] ? |
We've got a weekly meeting, and this week's agenda is open: https://opencontainers.org/community/overview/ Invite to the slack sent. |
Maybe we can have a new digest algo like
How do the output paths relate to the OP?
For reference, here's where distribution-spec is looking to solve it with content encoding headers, which feels a lot cleaner to me, but also risky for registries to support if there are existing clients that would pull large uncompressed blobs: opencontainers/distribution-spec#235
@AkihiroSuda the OP had somewhat unclear wording that suggested the Nix output path hash addresses content, which it does not. It addresses a desired result, but the exact content of that result is not defined. (Obviously you could calculate a hash, but that would only be the hash of one instance of an output path, of which there could be multiple.) NARs themselves are reproducible: given the same content, they will have the same hash. The content itself (as "identified" by the Nix output path hash) is not necessarily reproducible, however. I just wanted to make that clear.
For contextual awareness following up on various discussions, here is an instance where current optimizations reach their limit, by pigeon-holing a large dependency tree into the layer limit: nlewo/nix2container#27
The input-addressed (not content-addressed) nature essentially ties the manifest creation to a particular registry that actually holds the data. For the time being and for experimentation, I consider this to be absolutely fine, since it doesn't make any difference: the desired (and observable) functionality of a binary is still captured by the hash (@Atemu, please correct me if I'm missing something here). On the longer, though,
I think we are getting unnecessarily into the weeds here. OCI should not need to care how the "store paths" are chosen; that can be a black box. It can simply require that such paths do not conflict, while leaving how conflicts are avoided as a job for images themselves to deal with. I also wouldn't mind changing
A |
Has there been any progress on this? |
This has been implemented here in nix-snapshotter. Introducing a new media type would've required native integration in container runtimes like containerd, so instead we opted to use annotations to reference the necessary Nix packages. This means no changes to the image-spec are necessary to support native Nix images. See here for low-level details. Essentially, this replaces the API calls for fetching blobs with Nix protocols for fetching blobs from a Nix binary cache. Regarding the
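To illustrate the general idea of carrying store paths in annotations rather than in a new media type, here is a loose sketch. It only relies on the fact that OCI manifests allow a string-to-string "annotations" map; the annotation key prefix, the placement on the manifest, and the store path names are all assumptions made for this example and are not taken from nix-snapshotter.

```python
# Hypothetical annotation key prefix (NOT the key nix-snapshotter uses).
ANNOTATION_PREFIX = "org.example.nix.store-path."


def annotate(manifest, store_paths):
    """Return a copy of `manifest` with one annotation per store path."""
    annotations = dict(manifest.get("annotations", {}))
    for index, path in enumerate(sorted(store_paths)):
        annotations[f"{ANNOTATION_PREFIX}{index}"] = path
    annotated = dict(manifest)
    annotated["annotations"] = annotations
    return annotated


def store_paths_from(manifest):
    """Collect the store paths a runtime-side component would need to fetch."""
    annotations = manifest.get("annotations", {})
    return sorted(
        value for key, value in annotations.items()
        if key.startswith(ANNOTATION_PREFIX)
    )


if __name__ == "__main__":
    base = {"schemaVersion": 2, "layers": []}
    image = annotate(base, ["/nix/store/bbbb-libfoo", "/nix/store/aaaa-app"])
    print(store_paths_from(image))
```

A runtime that does not understand the annotations simply ignores them, which is what makes this approach possible without touching the image-spec.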
Given this good news, I would be inclined to close this as completed. Looking into the far distant future, we may ask different questions, such as: is there a way we can incentivize adoption so that more and more deployments would natively support Nix store paths? But that's more of a governance question. Let me know if I should reopen. -> nix-snapshotter.
@blaggacao I would reopen this because spec changes are still important.
Find an implementation natively supporting Nix store paths in this comment.
I'm seeking advice on a long-held consideration to amend the OCI image spec with means to add an infinite number of content==location-addressed blobs to the file system.
Let's call these "Nix Store Paths", which reside in a special place on the (Linux) filesystem under /nix/store/<hash>-<name>. Because the content hash is part of the location address, location and content addresses become interchangeable, and conflicts in the location-address space never occur.
Hence, no "merging" is required, no overlay filesystem, and above all, no layer limit, which sets the stage for arbitrary deduplication.
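A minimal sketch of why no merging is needed: if every blob unpacks to exactly one /nix/store/<hash>-<name> entry, then combining the contents of several images is a plain set union, and a path shared between images is simply deduplicated. The parsing below only assumes the <hash>-<name> shape given above; the shortened hashes and package names are made up.

```python
STORE_PREFIX = "/nix/store/"


def parse_store_path(path):
    """Split '/nix/store/<hash>-<name>' into (hash, name)."""
    if not path.startswith(STORE_PREFIX):
        raise ValueError(f"not a store path: {path!r}")
    entry = path[len(STORE_PREFIX):]
    if "/" in entry:
        raise ValueError(f"not a top-level store entry: {path!r}")
    # The hash contains no '-', so the first '-' separates hash from name.
    digest, sep, name = entry.partition("-")
    if not sep or not digest or not name:
        raise ValueError(f"malformed store path: {path!r}")
    return digest, name


def combine(*images):
    """Union the store paths of several images; shared paths deduplicate."""
    combined = set()
    for image in images:
        for path in image:
            parse_store_path(path)  # validate the shape only
            combined.add(path)
    return combined


if __name__ == "__main__":
    image_a = {"/nix/store/aaaa-app", "/nix/store/cccc-glibc"}
    image_b = {"/nix/store/bbbb-tool-1.0", "/nix/store/cccc-glibc"}
    print(sorted(combine(image_a, image_b)))
```

Since no two blobs ever write to the same location, the order in which they are fetched or unpacked does not matter.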
Wait a sec; how so? Doesn't linking expect well-known location-addresses?
The two proposed MIME types would "read" as follows:
An OCI runtime should be able to:
A NAR archive is a completely reproducible TAR variant; a Go implementation can be found here: https://github.com/nix-community/go-nix/tree/master/pkg%2Fnar
Can somebody give me some pointers on how I should approach this?
/cc @AkihiroSuda for the work on buildkit-nix (I was also trying to ping you on Matrix, not sure if that's the right venue): https://github.com/AkihiroSuda/buildkit-nix
/cc @nlewo for a prototypical implementation of the fundamental synergies at a different layer of the stack with his skopeo patches over at https://github.com/nlewo/nix2container