Discussion of a new manifest #41
-
I think that's up to interpretation. Plenty of folks are clearly using the format successfully to store other stuff; beyond that, it's just a small misnomer in the content type that no user really sees anyway.
Sure, why not? If there are use cases and sufficiently motivated people to push for the change, I think that could be great. I assume it wouldn't be a gigantic change for registries to loosen validation for incoming requests, at least not as gigantic as supporting a whole new artifact type.
Versioning is important for negotiating and staging breaking changes, but if spec authors are careful not to add breaking changes (requiring a new field, changing the type of a field, inventing and requiring new types), then I believe a sufficiently diligent, deliberate evolution process could improve the spec over time without hard breaking changes, and without requiring registries and clients to adopt new versions. Luckily, the OCI innovation process has lots of experience being very diligent and deliberate, only it's so slow that it's hardly evolved at all. 😅
Again, I think this becomes just an unfortunate and mostly invisible misnomer. The thing we've previously called an "image" is just a descriptor + zero or more layer blob references in a meaningful order. People are already successfully building lots of useful software inside that definition, and I don't see many people eager to stop and build on a new spec instead, just to remove the term "image" from their codebase. (Meta point: Is this really the best, most visible forum to be having this discussion? We should make sure all interested parties are involved, and a new 1500-word discussion in the artifacts repo might not be the best place to advertise this conversation, to get the most diverse set of responses -- maybe a TOB issue, with relevant TOB members cc'ed?)
-
Great discussion. Heads up: this is going read-only when the repo is archived.
-
OCI Artifacts PR #29 and PR #37 both recommend a new manifest be used for distribution of new artifact types. The following captures the discussions that led to the recommendation for a new manifest type:
1. `oci.image.manifest` is specific to the container image format. While it made sense at the time, distribution has evolved to support a wide range of artifact types including Helm, Singularity, CNAB, OPA, WASM, VMs, and many others. How do the OCI specs separate the needs of one artifact type from those of all other artifact types?
2. `config` is REQUIRED: the image-spec declares `config` as REQUIRED, yet most artifact types don't need a config object. To work around this, OCI Artifacts promotes pushing a null config, which has surfaced as a challenge for some registries to support. Would the image-spec change `config` to OPTIONAL?
3. `layers` are REQUIRED and specified as ordinal. Most other artifact types store collections of files that are not ordinal. Some artifacts may not have any layers/blobs. Further, other reference types may persist information as annotations, enhancing an existing artifact without adding a layer/blob. Would the image-spec change `layers` to OPTIONAL?
4. Adding an `artifactType` property to the image-spec was proposed, but the feedback was to not add new properties to the versioned spec, as it risked breaking existing clients. Instead, the working group "hacked" in the concept by overloading `manifest.config.mediaType`. This may be a bit confusing, as most non-image artifacts do not have a config. See: Defining OCI Artifact Types for more info.
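To make the workaround concrete, here is a minimal sketch of a non-image artifact squeezed into an image manifest, with the artifact type read back from `config.mediaType`. The artifact config media type, the use of an empty JSON object as the placeholder config, and the helper names are assumptions for illustration, not something the specs prescribe.

```python
import hashlib

IMAGE_MANIFEST = "application/vnd.oci.image.manifest.v1+json"
# Hypothetical artifact config media type, used only for illustration.
EXAMPLE_ARTIFACT_CONFIG = "application/vnd.example.thing.config.v1+json"


def descriptor(media_type: str, content: bytes) -> dict:
    """Build a content descriptor (mediaType, digest, size) for a blob."""
    return {
        "mediaType": media_type,
        "digest": "sha256:" + hashlib.sha256(content).hexdigest(),
        "size": len(content),
    }


def artifact_as_image_manifest(files: list) -> dict:
    """Wrap arbitrary files in an image manifest with a placeholder config."""
    # Assumption: an empty JSON object stands in for the "null config".
    placeholder_config = b"{}"
    return {
        "schemaVersion": 2,
        "mediaType": IMAGE_MANIFEST,
        # config is REQUIRED by the image-spec, so one is supplied even
        # though this artifact has nothing meaningful to put in it.
        "config": descriptor(EXAMPLE_ARTIFACT_CONFIG, placeholder_config),
        # layers is an ordered list, even if the files have no real order.
        "layers": [descriptor("application/octet-stream", f) for f in files],
    }


def artifact_type(manifest: dict) -> str:
    """The artifact type is inferred from the overloaded config.mediaType."""
    return manifest["config"]["mediaType"]
```

The point of the sketch is that the config blob carries no information of its own; only its `mediaType` is doing any work.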
Note: Phase 1 is NOT intended to support OCI image types. Phase 2, with the evolution of PR #37 (WIP generic object spec), is when phase 1 reference types and other artifacts could migrate.
Having worked through the above reasons for creating a new manifest, consider the issue at hand:
Reference type support: how can a registry enable a wide range of new scenarios for adding information, including signatures, SBoMs, security scan results, metadata, and GPL source, that must extend (reference) existing types, while supporting lifecycle management so that it is known how and when to delete reference artifacts?
Considerations for adding to the existing manifests
Various discussions did focus on using the OCI image-spec, whether that required a new version or relied on the existing version.
Why not add to the existing image-spec?
Separation of concerns: see #1 above. Even if the image-spec could solve all the problems, it's still the container image-spec, which creates a requirement on the image-spec maintainers to account for all other artifact types and similar scenarios. By listing the requirements of each new artifact type, the distribution and artifact specs can then consider implementing a generic approach to cover them. For instance, the config object could become just one of the descriptors in the `[blobs]` collection, giving the container image artifact type the desired behavior without the schema having to support a REQUIRED or even OPTIONAL property. Container image clients could look for a single config `mediaType` and process it.
NOTE: Phase 1 isn't proposing the image-spec move to PR #29. This is an example of how a future generic artifact manifest (PR #37) could support artifact-specific scenarios.
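A rough sketch of the "config is just another blob" idea: an image client scans a generic `blobs` collection for the one descriptor whose `mediaType` it recognizes as a config. The manifest shape and the `find_config` helper are assumptions for illustration; PR #37 may define things differently.

```python
from typing import Optional

IMAGE_CONFIG = "application/vnd.oci.image.config.v1+json"
IMAGE_LAYER = "application/vnd.oci.image.layer.v1.tar+gzip"


def find_config(manifest: dict, config_media_type: str) -> Optional[dict]:
    """Return the single blob descriptor with the given config media type.

    Returns None when the artifact carries no config at all, and raises
    when more than one candidate is present, since the client expects
    exactly one config to process.
    """
    matches = [
        blob for blob in manifest.get("blobs", [])
        if blob["mediaType"] == config_media_type
    ]
    if len(matches) > 1:
        raise ValueError("expected at most one config blob")
    return matches[0] if matches else None


# Hypothetical generic manifest: no dedicated config property in the schema.
generic_manifest = {
    "blobs": [
        {"mediaType": IMAGE_LAYER, "digest": "sha256:layer0", "size": 10},
        {"mediaType": IMAGE_CONFIG, "digest": "sha256:config", "size": 5},
    ]
}
```

An artifact type that has no config simply omits that descriptor, and clients that don't care about configs never have to look for one.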
Why can't optional properties be added to the image-spec?
It can, but what does optional imply? There's a pretty big difference between adding an optional property like an annotation that registries largely ignore and something that impacts GC and lifecycle management. The purpose of a spec is to provide confidence in consistency to those that support a specific version of the spec. If 1.0 image-spec manifests pushed to a registry sometimes had `subjectManifest`, how would a user understand the expected behavior from a registry that accepted it only because the registry didn't limit additional schema elements? How would that compare with registries that support the reference type concept? How would users get the artifacts out of the registry without the `/referrers` API?
Garbage collection of untagged manifests
Registries implement GC differently; however, many garbage collect untagged manifests. Docker Hub will maintain any digest that was once associated with a tag. "Associated", in the image-spec, means the digest was tagged. ACR and ECR support automatically deleting untagged manifests, which many customers have on by default as it allows them to manage the size of their registries. Reference types are considered enhancements to an artifact and are not thought to have a life of their own; therefore they wouldn't be tagged.
In the image-spec case, a registry may accept an `oci.image.manifest` with the `subjectManifest`, but without GC implementations that understand the reference, the artifact would be automatically deleted at some later point. While clients could tag the reference types, this only adds to the ambiguity of content lifecycle management and of how users might delete the related artifacts when the `subjectManifest` is deleted.
NOTE: See Content Discovery for more info.
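The GC concern can be sketched with a toy collector. This is a simplified model, not how any particular registry works: it assumes `subjectManifest` holds a plain digest string, and the `collect_garbage` helper and sample digests are made up for illustration.

```python
def collect_garbage(manifests: dict, tags: dict, follow_references: bool) -> set:
    """Return the digests a tag-based GC pass would delete.

    manifests maps digest -> manifest body, tags maps tag -> digest.
    With follow_references=False, anything untagged is deleted. With
    follow_references=True, untagged manifests survive as long as their
    subjectManifest (directly or transitively) is being kept.
    """
    keep = set(tags.values())
    if follow_references:
        changed = True
        while changed:
            changed = False
            for digest, manifest in manifests.items():
                subject = manifest.get("subjectManifest")
                if digest not in keep and subject in keep:
                    keep.add(digest)
                    changed = True
    return set(manifests) - keep


manifests = {
    "sha256:img": {},
    "sha256:sig": {"subjectManifest": "sha256:img"},  # untagged signature
    "sha256:orphan": {},
}
tags = {"v1": "sha256:img"}
```

With these inputs, a collector that ignores references deletes both the orphan and the signature, while a reference-aware collector deletes only the orphan. Deleting the `v1` tag would make the signature collectable too, which is the lifecycle behavior reference types are after.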
Timed release for the fall of 2021
The Notary v2 effort has a time-bounded/quality-focused release and needs to ship a core set of capabilities. The work has identified a larger set of use cases that benefit from reference types, including SBoMs and scan results. Because the effort is time- and quality-bound, the working group knows they'll intentionally miss certain features and not have a chance to incorporate some things that will surface along the way. By defining a different manifest format, registry ingestion can isolate the handling of this specific versioned release. If the properties were merged into the image-spec, ingestion would forever be checking for the existence of these elements. As registries move to RC2 and v1 of the manifest, this codepath is isolated and eventually retired (recognizing it's hard to deprecate content, but all the more important to have this codepath isolated).
Why create a new manifest now, vs. wait for PR #37 to complete?
This comes back to how registries should process manifests with data that must be acted upon. To provide any reasonable performance, registries will want to reverse index the `subjectManifest`, likely with the `referenceType` property, to provide filtered results in the `/referrers` API.
The underlying schema of the phase 1 manifest will change as it evolves to PR #37. If the above parsing logic is isolated using the distinct type as a switch, registries can isolate the impacts in execution/implementation.
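As a sketch of both points, the toy registry below switches on the manifest media type at ingestion and only reverse indexes the distinct artifact type. The artifact manifest media type, the `Registry` class, and the flat digest-string form of `subjectManifest` are all assumptions for illustration.

```python
from collections import defaultdict

IMAGE_MANIFEST = "application/vnd.oci.image.manifest.v1+json"
# Hypothetical media type for the phase 1 artifact manifest.
ARTIFACT_MANIFEST = "application/vnd.example.artifact.manifest.v1+json"


class Registry:
    def __init__(self):
        self.manifests = {}
        # Reverse index: subject digest -> [(referenceType, referrer digest)]
        self._referrers = defaultdict(list)

    def put_manifest(self, digest: str, manifest: dict) -> None:
        media_type = manifest.get("mediaType")
        # The distinct type is the switch: only artifact manifests take the
        # reference-indexing path, so image ingestion stays untouched.
        if media_type == ARTIFACT_MANIFEST:
            self._index_reference(digest, manifest)
        elif media_type != IMAGE_MANIFEST:
            raise ValueError(f"unsupported manifest type: {media_type}")
        self.manifests[digest] = manifest

    def _index_reference(self, digest: str, manifest: dict) -> None:
        subject = manifest["subjectManifest"]
        self._referrers[subject].append((manifest.get("referenceType"), digest))

    def referrers(self, subject: str, reference_type=None) -> list:
        """Model of a filtered /referrers lookup, served from the index."""
        return [
            digest
            for ref_type, digest in self._referrers.get(subject, [])
            if reference_type is None or ref_type == reference_type
        ]
```

If the phase 1 schema changes, only `_index_reference` and its branch need to move; the image path never inspected those fields in the first place.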
Image toolchain isolations
Container image toolchains know to pull images by tags and sometimes by digests. However, they also know how to parse specific manifest types (`image.manifest`, `image.index`). By isolating this new artifact reference type, image clients are able to parse it with separate logic and thus avoid partial failure scenarios.
Should image toolchains know about signatures?
Notary v2 refers to this as the Trojan Horse Protection. Before the image toolchains of containerd, Kubernetes, and every other container runtime even pull the image, some sort of admission controller runs. k8s may use OPA/Gatekeeper, while non-k8s implementations may use a containerd plug-in. Within the admission controller, the signature is discovered, pulled, and verified. Only if it passes does the pipeline continue, handing the image pull URL to the container runtime. This separation of concerns enables signature verification clients to focus on their artifact type while the container runtime goes unchanged.
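The flow described here might look roughly like the following. The `admit` function and its two callbacks are hypothetical stand-ins for whatever the admission controller uses to discover and verify signatures; none of this is the Notary v2 API.

```python
from typing import Callable, Iterable, Optional


def admit(
    image_ref: str,
    discover_signatures: Callable[[str], Iterable[bytes]],
    verify: Callable[[str, bytes], bool],
) -> Optional[str]:
    """Return the pull reference only if some signature verifies.

    The container runtime is handed the reference only on success, so it
    never needs to know that signatures exist.
    """
    for signature in discover_signatures(image_ref):
        if verify(image_ref, signature):
            return image_ref
    return None  # no valid signature: the image is never pulled


# Toy callbacks: one signed image, and a check against a fixed byte string.
def fake_discover(ref: str) -> list:
    return [b"bad", b"good"] if ref.endswith("signed") else []


def fake_verify(ref: str, signature: bytes) -> bool:
    return signature == b"good"
```

In this sketch, `admit("registry.example.com/app:signed", fake_discover, fake_verify)` hands back the reference for the runtime to pull, while any other reference yields `None` and the pipeline stops before the runtime is involved.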
Verification implementations
To test if a new manifest model would even work, several end-to-end verification implementations were created. Notary v2 Prototype 1 tested a `references` API from CNCF distribution that returned a list of descriptors, while the current prototype returns a collection of referenced manifests to make the desired artifact pulls efficient.
The current implementations are:
Call for early adopters
While cloud providers and vendors could quickly move forward with a vendor-specific solution, a vendor-neutral approach, where artifacts and their references can move across vendor-provided registries, is a better starting point. Some registry products and projects may decide to wait for a more stable solution.
However, those that wish to implement secure supply chain scenarios within registries sooner would have a vendor-neutral design to leverage, get feedback and help secure the cloud-native supply chain of artifacts.