fix: drop embedded config data from manifest image #774

Merged (2 commits) on Feb 11, 2025

@plobsing (Contributor) commented Feb 11, 2025

The [OCI spec](https://github.com/opencontainers/image-spec/blob/main/descriptor.md#properties)
provides a `data` field in order to embed a copy of an image's config
directly into that image's manifest:

> data string
>
> This OPTIONAL property contains an embedded representation of the referenced content. Values MUST conform to the Base 64 encoding, as defined in RFC 4648. The decoded data MUST be identical to the referenced content and SHOULD be verified against the digest and size fields by content consumers. See Embedded Content for when this is appropriate.

This is [intended as an optimization](https://github.com/opencontainers/image-spec/blob/main/descriptor.md#embedded-content),
in order to reduce the number of network round-trips incurred to pull a
container from a remote repository.
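
The consumer-side check that the quoted text recommends can be sketched as follows. This is a minimal illustration, not code from this repository: `make_descriptor` and `verify_embedded_data` are hypothetical helper names, and only `sha256` digests are handled.

```python
import base64
import hashlib


def make_descriptor(content: bytes, media_type: str) -> dict:
    """Build a descriptor that embeds `content` in its `data` field (hypothetical helper)."""
    return {
        "mediaType": media_type,
        "digest": "sha256:" + hashlib.sha256(content).hexdigest(),
        "size": len(content),
        "data": base64.b64encode(content).decode("ascii"),
    }


def verify_embedded_data(descriptor: dict) -> bytes:
    """Decode the embedded `data` and check it against `size` and `digest`."""
    # validate=True rejects characters outside the standard Base 64 alphabet.
    content = base64.b64decode(descriptor["data"], validate=True)
    if len(content) != descriptor["size"]:
        raise ValueError("embedded data does not match descriptor size")
    algorithm, _, expected_hex = descriptor["digest"].partition(":")
    if algorithm != "sha256":
        raise ValueError(f"unsupported digest algorithm: {algorithm}")
    if hashlib.sha256(content).hexdigest() != expected_hex:
        raise ValueError("embedded data does not match descriptor digest")
    return content
```

A descriptor whose `digest` and `size` were updated for a new config while `data` was carried over from a base image would fail this check, which is the kind of mismatch the change here is meant to avoid.
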

However, as highlighted in the release announcement for the new
[`data` field](https://opencontainers.org/posts/blog/2024-03-13-image-and-distribution-1-1/#data-field),
whether an embedded config is beneficial in a given context can be
subtle to determine. Further, in the same announcement, the spec
clarified that registry implementations usually enforce a
[maximum size](https://opencontainers.org/posts/blog/2024-03-13-image-and-distribution-1-1/#manifest-maximum-size)
on manifests, an obscure limit that incautious use of the `data` field
can exceed.
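
To make the size concern concrete, here is a rough sketch of how embedding inflates a manifest. The 4 MiB figure is an assumption chosen for illustration only (actual limits vary by registry), and the function names are hypothetical.

```python
import base64
import json

# Assumption: an illustrative limit only; real registries may enforce other values.
ASSUMED_MANIFEST_LIMIT = 4 * 1024 * 1024


def manifest_size(manifest: dict) -> int:
    """Size in bytes of the manifest serialized as compact JSON."""
    return len(json.dumps(manifest, separators=(",", ":")).encode("utf-8"))


def embedding_overhead(config: bytes) -> int:
    """Bytes the Base 64 form of `config` adds to a manifest (JSON key not counted)."""
    return len(base64.b64encode(config))


def fits(manifest: dict, limit: int = ASSUMED_MANIFEST_LIMIT) -> bool:
    """Report whether the serialized manifest stays within `limit` bytes."""
    return manifest_size(manifest) <= limit
```

Base 64 encodes every 3 bytes as 4 characters, so an embedded config costs about a third more than its raw size, on top of everything else already in the manifest.
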

Given the trade-offs, dropping any embedded data (e.g. data that might have
been propagated from a base image) seems the preferable option. The
alternative, updating embedded data to match the config on every
change as proposed by bazel-contrib#757, is
less universally correct and only sometimes more performant.
Either option would fix bazel-contrib#756.
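
The "drop" option can be sketched as below. This is not the change made in this PR (the diff is not reproduced here); `drop_embedded_config` is a hypothetical function operating on a manifest already parsed into a dictionary.

```python
import copy


def drop_embedded_config(manifest: dict) -> dict:
    """Return a copy of `manifest` whose config descriptor has no `data` field."""
    result = copy.deepcopy(manifest)
    config = result.get("config")
    if isinstance(config, dict):
        # digest and size still identify the config; only the embedded copy is removed.
        config.pop("data", None)
    return result
```

Because `digest` and `size` are left untouched, a consumer can still fetch and verify the config as a separate blob, at the cost of one extra round-trip.
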
@plobsing requested a review from thesayyn February 11, 2025 04:42
@plobsing changed the title from "drop embedded config data from manifest image" to "fix: drop embedded config data from manifest image" Feb 11, 2025
@thesayyn merged commit 770c55e into bazel-contrib:main Feb 11, 2025
11 checks passed
Development

Successfully merging this pull request may close these issues.

oci_image inherits data field from config in base image
2 participants