OCI Artifact Manifest - with weak reference support #27

SteveLasker · 2021-01-25T22:53:20Z

The OCI artifact manifest provides a means to define a wide range of artifacts, including a chain of dependencies of related artifacts. It provides a means to define multiple collections of types, including blobs, dependent artifacts and referenced artifacts, expanding on the work done around OCI Artifacts based on oci.image.manifest and addressing the challenges attempted with image index.

This is an initial PR for discussion.

Signed-off-by: Steve Lasker <[email protected]>

dmcgowan · 2021-01-28T00:37:08Z

This first draft is looking good. I have two pieces of high level feedback advocating to keep as much simplicity in the manifest as possible.

I see there are 3 different keys to represent lists of referenced objects. These could be categorized into two types of references, weak references and strong references. The strong references represent objects which must be kept around as long as the manifest exists. The weak references represent objects the manifest associates with, and the manifest may not need to be kept around if none of its weak references still exist. I think adding in the weak reference types (called "dependencies" now but seems name is up for debate) is a good addition to the data model and solves the mentioned use cases well. I don't see a need to have multiple different keys represent the strong references since dividing objects into one or the other could be considered metadata. Which brings me to my second point...
Keep the manifest definition as simple as possible and use annotations for metadata. Each type may use metadata differently and the metadata may not be relevant for the distribution of content. The artifact type and artifact name could be defined in annotations if necessary, but normally the config media type will need to be checked by clients for artifact compatibility. Having two fields which define the same property and must agree just opens up more cases to check and error out. For images, since this is a new manifest type, there is no need to treat layers (or "blobs") as special case references.

SteveLasker · 2021-01-28T20:32:03Z

Thanks, @dmcgowan,
KISS (Keep It Silly Simple) is definitely a key goal. The manifest should provide just enough information to do what needs to be done, enabling registries to work generically over different artifacts, while providing client tools the info they need to work with their specific artifacts.

These could be categorized into two types of references, weak references and strong references.

We've played with the collections a few ways, including a single collection that contained a direction and strictness (weak/strong, hard/soft, loose/tight?).

Again, please don't read too much into the names as we wanted to figure if the structure worked, and we'd figure out the proper names later, but here was an example of the single collection:

{
  "mediaType": "application/vnd.oci.artifact.manifest.v1+json",
  "artifactType": "application/vnd.cncf.notary.v2",
  "config": {
    "mediaType": "application/vnd.cncf.notary.config.v2",
    "digest": "sha256:b5b2b2c507a0944348e0303114d8d93aaaa081732b86451d9bce1f432a537bc7",
    "size": 102
  },
  "references": [
    {
      "mediaType": "application/vnd.cncf.notary.v2.json",
      "digest": "sha256:9834876dcfb05cb167a5c24953eba58c4ac89b1adf57f28f2f9d09af107ee8f0",
      "size": 32654,
      "direction": "child",
      "strictness": "weak|strong"
    }
  ]
}

The problem with a single collection:

Mixes content stored as blobs (layers/blobs/config) with content stored as a manifest (links to an image)
MediaType could be parsed to understand which mediaTypes would be stored as manifests vs. blobs, but that requires the registry to know about all the mediaTypes. Again, something we're trying to avoid as file systems can store any file. The file.extension is optional with an optional way to register how extensions are visualized.

By using different collections to represent:

blobs of content: to make up the thing, stored as blobs (downward references)
dependencies, dependent-upon, required/s: which are reverse pointers. This avoids a property required for tracking when/if it should be deleted
references, weak-references, loose-references: which are the things that are good to know, enabling validation, copying and visualization, but don't get tracked for deletion. Although, a client CLI could parse these and ask the registry to delete these if that's the experience wanted.
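One way to read the three collections is in terms of what a registry would track for deletion. The sketch below is a toy in-memory model, not registry code: `ToyRegistry`, the `Manifest` fields, and the cascade policy are assumptions made for illustration, and the collection names are the placeholder names from this thread.

```python
from dataclasses import dataclass, field


@dataclass
class Manifest:
    digest: str
    blobs: list = field(default_factory=list)         # content that makes up the artifact
    dependencies: list = field(default_factory=list)  # manifests this artifact depends upon
    references: list = field(default_factory=list)    # loose references, not tracked for deletion


class ToyRegistry:
    """Hypothetical model: blobs are ref-counted, dependents are deleted with their target."""

    def __init__(self):
        self.manifests = {}
        self.blob_refs = {}

    def put(self, manifest):
        self.manifests[manifest.digest] = manifest
        for blob in manifest.blobs:
            self.blob_refs[blob] = self.blob_refs.get(blob, 0) + 1

    def delete(self, digest):
        manifest = self.manifests.pop(digest, None)
        if manifest is None:
            return
        # Blobs are ref-counted down and dropped when nothing uses them.
        for blob in manifest.blobs:
            self.blob_refs[blob] -= 1
            if self.blob_refs[blob] == 0:
                del self.blob_refs[blob]
        # Anything that declared a dependency on the deleted manifest goes too.
        dependents = [d for d, m in self.manifests.items() if digest in m.dependencies]
        for dependent in dependents:
            self.delete(dependent)


registry = ToyRegistry()
registry.put(Manifest("sha256:image", blobs=["sha256:layer"]))
registry.put(Manifest("sha256:signature", blobs=["sha256:sig-blob"],
                      dependencies=["sha256:image"]))
registry.put(Manifest("sha256:chart", blobs=["sha256:chart-blob"],
                      references=["sha256:image"]))

registry.delete("sha256:image")
print(sorted(registry.manifests))  # ['sha256:chart']
```

Deleting the image takes the signature with it because the signature declared the dependency; the chart only held a loose reference, so it survives with a dangling pointer that a client could choose to act on.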

metadata

There's lots of interesting metadata scenarios. Some are known at the time the artifact is submitted. Others are later. The problem is how do we address metadata added after, as we need a way to add without changing the digest. I'm punting this to the OCI Metadata Service round of discussions.

artifactType/mediaType

Not all artifacts will have config, yet different artifacts may share the same config schema. In ORAS, we had to explicitly enable the scenario of defining a manifest.config.mediaType, without having a config blob. But, this was mostly a concession to avoid having to rev the image-index.manifest to identify what type of artifact the schema represented.

Since we're defining a new manifest, it seemed time to lift the artifactType property to the root. This enables the manifest.config.mediaType to be decoupled from the manifest.artifactType, allowing them to rev, or even be defined independently. I could see a few different artifact types sharing the same config schema, such as different ways to represent images with different compression formats, or even the new IBM z/OS types.
If/when we get a clean new artifact.manifest, I could see all non-container image artifacts moving from using the current oci-image manifest to this artifact.manifest as they'd have more freedom to define references, and clean up other aspects. Who knows, maybe OCI Image manifest v2 might switch as well...

there is no need to treat layers (or "blobs") as special case references. [from dependencies]

You're correct, these are both "hard" references from the artifact manifest perspective. The two main differences noted above are directionality for ref-counting, and whether the registry looks in the blob store or the manifest store for content.

sudo-bmitch · 2021-01-29T01:57:36Z

Will references in an artifact always be local to the current repository? I think Helm breaks that logic with image references, but Helm also breaks a lot of the artifact logic since the image names themselves could be templated to point to a different location by a values.yaml after pulling the chart, so it may be worth excluding Helm image references from any of the chained reference logic. I suspect having everything in the same repo is important for GC, but also useful for portability of manifests and their artifacts.

sudo-bmitch · 2021-01-29T01:57:51Z

What are the methods to look up an artifact? Is it only with a query using a manifest sha, or can tags point to artifacts? If so, do we need to namespace a tag lookup to the type of artifact we are looking for, so that artifacts with tags don't accidentally collide with images or other artifacts in the same repo? That question comes from looking at how TUF could possibly be implemented with artifacts, and they may want e.g. a "snapshot" TUF artifact in a repo that applies to multiple manifests, and that they can look up and update at any time.

nishakm

This is pretty comprehensive from the CNAB point of view. I think more information is needed at the container level. Can I make use of your examples to play around with scenarios? I've put some of the scenarios in comments on artifact-manifest.md.

nishakm · 2021-01-28T23:57:47Z

artifact-manifest/artifact-manifest.md

@@ -0,0 +1,462 @@
+# OCI Artifact Manifest
+
+The OCI artifact manifest provides a means to define a wide range of artifacts, including a chain of dependencies of related artifacts. It provides a means to define multiple collections of types, including blobs, dependent artifacts and referenced artifacts.


I think what we want is a list of related artifacts. The SBoM of the artifact would typically have the name of the artifact and its relation to other artifacts defined in the manifest. The SPDX spec has a comprehensive list of possible relationships to pick from https://spdx.github.io/spdx-spec/7-relationships-between-SPDX-elements/. Defining "dependencies" and "provides" relationships in the SBoM takes the burden away from defining it here.

My working assumption for how SBoMs would be represented in a registry is as a new artifact with a dependency on the artifact it describes. Essentially, it would work the same way as a Notary v2 signature.
The SBoM artifact would have its own blobs, optionally a config, and a dependency (manifest entry in the latest example). It would not have a tag, as the tag of an SBoM isn't really interesting: the SBoM is an "enhancement" to an existing tagged artifact in a repo.

nishakm · 2021-01-29T00:23:18Z

artifact-manifest/artifact-manifest.md

+
+To support artifact movement to various registry and namespace structures, the registry and path must not be embedded within the artifact definition. Client CLIs and configurations will provide default locations and mappings for where to find the referenced content.
+
+Artifacts that reference other artifacts must include an OCI Artifact Descriptor which includes the `manifest type`, `digest`, `size` and `repo:tag` of the artifact, however it will defer resolution of the reference to client tools that MAY reconstitute the references from multiple repositories and/or registries.


It sounds like we expect artifacts to be identified by repo:tag. So if we have a container image wordpress:v5 do we expect a related SBoM to be named wordpress-sbom:v5? Can we upload singleton blobs and then reference that in wordpress-bundle:v5?

Artifacts that are things customers want to directly reference would have a :tag. However, I'm trying to identify enhancements (SBoM, Signatures, GPL source, ...) as things that are dependent upon another artifact, but don't really get shown on their own.
Notice the difference with this image: https://github.com/opencontainers/artifacts/raw/8760ac5e42bc7a36802559481b34e2f4e8584492/artifact-manifest/media/repo-listing-flat.svg

and this image:

https://github.com/opencontainers/artifacts/raw/8760ac5e42bc7a36802559481b34e2f4e8584492/artifact-manifest/media/repo-listing-attributed.svg

The first shows the artifacts as an equal collection. The second shows a set of artifacts as attributes to the thing they're enhancing.

If I think about this scheme from the perspective of a user who does not want to see the enhancements until they have to, the query would be something like "do you have the signatures/SBoM/sources for mysql@sha256:digest?". By what mechanism would a client find out when "pull mysql:8" downloads an image manifest with no reference to the other enhancements? Would we need another endpoint for this?

I'll add some examples for the list APIs to get the content. But, yes, there will be a new listing API as we must support push, discover, pull to complete the experiences.
The premise is a client can ask for artifactType of application/vnd.openssf.sbom.v1+json to get just the SBoM documents for the mysql:8 image. You could also query by the digest of the MySQL image.
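The listing API is not defined in this thread, so the following only sketches the filtering it implies. `list_referrers`, the shape of the stored manifests, and the placeholder digests are all hypothetical; a real registry would expose this over HTTP rather than as an in-memory scan.

```python
def list_referrers(manifests, target_digest, artifact_type=None):
    """Return manifests that point at target_digest, optionally filtered by artifactType."""
    found = []
    for manifest in manifests:
        targets = {entry["digest"] for entry in manifest.get("manifests", [])}
        if target_digest not in targets:
            continue
        if artifact_type is not None and manifest.get("artifactType") != artifact_type:
            continue
        found.append(manifest)
    return found


MYSQL = "sha256:" + "1" * 64   # placeholder digest standing in for the mysql:8 manifest
OTHER = "sha256:" + "2" * 64   # placeholder digest for an unrelated image

stored = [
    {"artifactType": "application/vnd.openssf.sbom.v1+json", "manifests": [{"digest": MYSQL}]},
    {"artifactType": "application/vnd.cncf.notary.v2", "manifests": [{"digest": MYSQL}]},
    {"artifactType": "application/vnd.cncf.notary.v2", "manifests": [{"digest": OTHER}]},
]

sboms = list_referrers(stored, MYSQL, "application/vnd.openssf.sbom.v1+json")
everything = list_referrers(stored, MYSQL)
print(len(sboms), len(everything))  # 1 2
```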

nishakm · 2021-01-29T00:48:21Z

artifact-manifest/artifact-manifest.md

+
+![notary v2 signature](media/notaryv2-signature.svg)
+
+The Notary v2 signature would reference an artifact, such as the `wordpress:v5` image above. Notice the directionality of the references. One or more signatures may be added to a registry after the image was persisted. While an image knows of it's layers, and a Notary v2 signature knows of its config and blob, the Notary v2 signature declares a dependency to the artifact it's signing. The visualization indicates the references through solid lines as these reference types are said to be hard references. Just as the layers of an OCI Image are deleted (*ref-counted -1*), the blobs of a signature are deleted (*ref-counted -1*) when the signature is deleted. Likewise, when an artifact is deleted, the signature would be deleted (*ref-counted -1*) as the signatures have no value without the artifact they are signing.


In a situation where someone uses an image, say debian:bullseye, which comes with a list of signatures, and adds dependencies for their middleware, say Go, and then pushes the image as golang-debian:1.16, would the debian signatures still be linked? How can one verify where the non-golang dependencies came from? Do those signatures get added as well? My concern is that each layer in the container image is its own build and release pipeline, and they can get pretty complicated (the wordpress Dockerfile is actually a good example of how complicated).

I know some early versions of the tern SBoM focused on adding layers to an image. That has some challenges as you're changing the image-spec, and breaking the digest/tag of the thing you're adding.

In this design, you add the SBoM as a new artifact, and add a reference to the other manifest. See the notary.v2 signature of mysql:8 example.
This keeps the artifacts separate, but referenced, enabling the current image runtimes to continue, while adding SBoMs that are also signed.
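The digest argument can be made concrete with a small sketch. It assumes manifests are serialized deterministically before hashing (real registries digest the exact bytes that were pushed), and the image manifest is trimmed to the fields needed here; the `describe` helper and the placeholder layer digests are illustrative only.

```python
import copy
import hashlib
import json


def describe(doc):
    """Return (digest, size) for a JSON document serialized deterministically."""
    data = json.dumps(doc, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return "sha256:" + hashlib.sha256(data).hexdigest(), len(data)


image = {
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.manifest.v1+json",
    "layers": [
        {"mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
         "digest": "sha256:" + "a" * 64, "size": 1024},
    ],
}
image_digest, image_size = describe(image)

# Approach 1: append the SBoM as an extra layer. The manifest bytes change,
# so the digest that tags and signatures point at changes too.
patched = copy.deepcopy(image)
patched["layers"].append({"mediaType": "application/vnd.openssf.sbom.v1+json",
                          "digest": "sha256:" + "b" * 64, "size": 2048})
patched_digest, _ = describe(patched)

# Approach 2: push the SBoM as its own artifact that points at the image by digest.
sbom = {
    "mediaType": "application/vnd.oci.artifact.manifest.v1+json",
    "artifactType": "application/vnd.openssf.sbom.v1+json",
    "blobs": [{"mediaType": "application/vnd.openssf.sbom.v1+json",
               "digest": "sha256:" + "b" * 64, "size": 2048}],
    "manifests": [{"mediaType": image["mediaType"],
                   "digest": image_digest, "size": image_size}],
}

print(patched_digest != image_digest)                        # True
print(sbom["manifests"][0]["digest"] == describe(image)[0])  # True
```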

nishakm · 2021-01-29T02:22:00Z

artifact-manifest/artifact-manifest.md

+    "digest": "sha256:b5b2b2c507a0944348e0303114d8d93aaaa081732b86451d9bce1f432a537bc7",
+    "size": 102
+  },
+  "blobs": [


Suppose Notary signatures were used to sign the base OS of mysql:8, I would imagine the blob and dependency list would be appended to in building the final mysql:8 image. Do you think so? In which case, how would a client know which signature is for which image?

We haven't targeted layer signing yet.

nishakm · 2021-01-29T02:25:30Z

artifact-manifest/artifact-manifest.md

+      "size": 32654
+    }
+  ],
+  "dependencies": [


It seems to me the relationship between the artifacts needs more description at this level. I was hoping the SBoM would provide that but perhaps that's something that can be helpful to clients.

I need to add more descriptive text to the examples we've been iterating upon.

jonjohnsonjr · 2021-01-29T19:52:58Z

artifact-manifest/artifact-manifest.md

+
+```json
+{
+  "schemaVersion": 2,


Everything here and below will need to be "schemaVersion": 3, I think, as these are breaking changes?

This is actually a new schema as we're introducing a new oci.artifact.manifest that would rev independently from oci.image and oci.index.

Since the oci.artifact.manifest is versioned in the mediaType=application/oci.artifact.manifest.v1, I've removed the schemaVersion element.

No, that's not how schemaVersion works.

I've added "schemaVersion": 1 back in as the first schemaVersion for oci.artifact.manifest.

jonjohnsonjr · 2021-01-29T20:06:14Z

artifact-manifest/artifact-manifest.md

+  ],
+  "references": [
+    {
+      "artifact": "wordpress:5.7",


I don't like tags in here. Not sure about repositories...

Does the repo refer to a sibling repository? Child repository? Top-level?

Can this just be under a org.opencontainers.image.ref.name annotation?

I'll have an update later this week that will use a mix of annotations and descriptors to clean up the references.

The more I think about the repository thing, the more I dislike it. Repositories are a natural security boundary. You're introducing a multi-repository artifact, which presents a lot of problems in terms of authentication and management.

We removed catalog because it violated these boundaries.

Cross-repository mounting is the only thing similar to this that remains, and the authentication for that is really hard to get right. This worries me.

So, a helm chart, a CNAB and other artifacts are already cross repository artifact types. But, we are describing these as loose references that the client can validate independently. The blobs and manifests that have a depends-on annotation are hard dependencies and must be in the same repo, and must exist to complete the manifest put.
So, I think we're straddling this problem, supporting the artifact types, while not putting an undue burden on auth boundaries that don't already exist today.
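A rough sketch of the rule described above, under the annotation-based layout from this revision of the PR: blobs and `depends-on` manifests must already exist in the same repo for the PUT to complete, while loose references are not checked. The function name and the `repo` dictionary shape are hypothetical.

```python
RELATIONSHIP = "oci.distribution.relationship"


def missing_hard_dependencies(repo, manifest):
    """Return digests that must exist in the repo before the manifest PUT can complete."""
    missing = []
    for blob in manifest.get("blobs", []):
        if blob["digest"] not in repo["blobs"]:
            missing.append(blob["digest"])
    for entry in manifest.get("manifests", []):
        relationship = entry.get("annotations", {}).get(RELATIONSHIP)
        # Only depends-on entries are hard; anything else is left to the client.
        if relationship == "depends-on" and entry["digest"] not in repo["manifests"]:
            missing.append(entry["digest"])
    return missing


repo = {"blobs": {"sha256:sig-blob"}, "manifests": set()}
signature = {
    "blobs": [{"digest": "sha256:sig-blob"}],
    "manifests": [
        {"digest": "sha256:image", "annotations": {RELATIONSHIP: "depends-on"}},
        {"digest": "sha256:elsewhere", "annotations": {RELATIONSHIP: "references"}},
    ],
}

print(missing_hard_dependencies(repo, signature))  # ['sha256:image']
```

The loose reference to `sha256:elsewhere` never reaches the check, which is what keeps cross-repository lookups out of the registry's PUT path in this reading.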

What would the client do with a helm cross repository artifact reference?

I ask since the values for the helm template could override the default tag or even use a different registry server, and we can't (or at least shouldn't) template the loose reference in the registry, so I feel like any helm client should ignore this field. If we use the field for signing, does a different helm values invalidate the signed chart? If we use the field for mirroring, would mirroring tools potentially copy images that go unused?

I'm having a hard time seeing the value add for including a potentially templated field into a non-templated registry object that doesn't introduce the risk of breakage.

bergwolf · 2021-02-02T13:00:44Z

The Artifact Manifest idea looks great! It helps to attach artifacts to images without affecting existing container images and thus keeps backward compatibility.

What's more, it looks super easy to add new artifact types based on the artifact manifest. For example, a nydus artifact manifest would look like:

{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.artifact.manifest.v1+json",
  "artifactType": "application/vnd.cncf.nydus.v1",
  "config": {
    "mediaType": "application/vnd.oci.image.manifest.v1.config.json",
    "digest": "sha256:9e988712154fcc2ceda5602eb1d98c1f28299ba6fbf0be49d3717c35a2d76674",
    "size": 1102
  },
  "blobs": [
    {
      "mediaType": "application/vnd.cncf.nydus.bootstrap.v1.tar+gzip",
      "digest": "sha256:f6bb0822fe567c98959bb87aa316a565eb1ae059c46fa8bba65b573b4489b44d",
      "size": 32654
    },
    {
      "mediaType": "application/vnd.cncf.nydus.blob.v1",
      "digest": "sha256:f6bb0822fe567c98959bb87aa316a565eb1ae059c46fa8bba65b573b4489b44d",
      "size": 72832
    },
    {
      "mediaType": "application/vnd.cncf.nydus.blob.v1",
      "digest": "sha256:f6bb0822fe567c98959bb87aa316a565eb1ae059c46fa8bba65b573b4489b44d",
      "size": 928324
    }
  ],
  "references": [
    {
      "artifact": "mysql:8",
      "artifactType": "application/vnd.oci.image.manifest.v1.config.json",
      "mediaType": "application/vnd.oci.image.manifest.v1.config.json",
      "digest": "sha256:3c3a4604a545cdc127456d94e421cd355bca5b528f4a9c1905b15da2eb4a4c6b",
      "size": 16724
    }
  ]
}

For those who are not familiar with nydus, it is an image acceleration service that greatly reduces the time needed to pull a container image by reading image contents on demand when the container starts. It is currently widely used in both Alibaba and Ant Group. nydus is open source and maintained as part of the CNCF incubator project Dragonfly. That's why I'm suggesting the same application/vnd.cncf prefix as the other artifact types.

The nydus artifact manifest follows the same schema for other artifact types, while a new artifact type and two media types are added:

"artifactType": "application/vnd.cncf.nydus.v1"
"mediaType": "application/vnd.cncf.nydus.bootstrap.v1.tar+gzip"
"mediaType": "application/vnd.cncf.nydus.blob.v1"

It also has a references relationship with the original mysql:8 container image. This information would help a registry index and show the relationship between different images, and help a container runtime choose whether to launch containers with image acceleration.

Currently we use a nydus image annotation/os.feature hack to hide nydus image details from the registry. With the OCI artifact manifest, we can abandon that hack and have registries support nydus natively and smoothly.

@SteveLasker would it make sense to list nydus as one of the supported artifact types to show how the artifact manifest spec can help other artifact types?

SteveLasker · 2021-02-03T03:40:48Z

would it make sense to list nydus as one of the supported artifact types to show how the artifact manifest spec can help other artifact types?

Yes, it would be great to add the example, as it's a new example I hadn't yet thought of. But, if it fits, even better. Let me digest a bit more and align with the new-new manifests collection with an annotation for references.

shizhMSFT · 2021-02-03T05:51:16Z

artifact-manifest/artifact-manifest.md

+      "digest": "sha256:3c3a4604a545cdc127456d94e421cd355bca5b528f4a9c1905b15da2eb4a4c6b",
+      "size": 16724,
+      "annotations": {
+        "oci.distribution.relationship": "depends-on"


Hard dependencies that affect functions like garbage collection should not be in annotations, which is an optional section.

This is a good point. Having annotations for client or even filtering information is super interesting. Having a registry drive garbage collection and ref-counting from annotations seems less structured; that information should be surfaced as a first-class property.
I'll make another PR with some comparison examples.

See latest PR commit for A/B options:

bergwolf · 2021-02-03T09:02:10Z

@SteveLasker Thanks a lot! I tried to amend the above nydus artifact manifest example to fit the new-new manifests collection with an annotation for references. Please see if I understand the new schema correctly:

{
  "mediaType": "application/vnd.oci.artifact.manifest.v1+json",
  "artifactType": "application/vnd.cncf.nydus.v1",
  "config": {
    "mediaType": "application/vnd.oci.image.manifest.v1.config.json",
    "digest": "sha256:9e988712154fcc2ceda5602eb1d98c1f28299ba6fbf0be49d3717c35a2d76674",
    "size": 1102
  },
  "blobs": [
    {
      "mediaType": "application/vnd.cncf.nydus.bootstrap.v1.tar+gzip",
      "digest": "sha256:f6bb0822fe567c98959bb87aa316a565eb1ae059c46fa8bba65b573b4489b44d",
      "size": 32654
    },
    {
      "mediaType": "application/vnd.cncf.nydus.blob.v1",
      "digest": "sha256:f6bb0822fe567c98959bb87aa316a565eb1ae059c46fa8bba65b573b4489b44d",
      "size": 72832
    },
    {
      "mediaType": "application/vnd.cncf.nydus.blob.v1",
      "digest": "sha256:f6bb0822fe567c98959bb87aa316a565eb1ae059c46fa8bba65b573b4489b44d",
      "size": 928324
    }
  ],
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:8c3a4604a545cdc127456d94e421cd355bca5b528f4a9c1905b15da2eb4a4c31",
      "size": 1578,
      "annotations": {
        "oci.distribution.relationship": "references",
        "oci.distribution.artifact": "mysql:8",
        "oci.distribution.artifactType": "application/vnd.oci.image.v1"
      }
    }
  ]
}

To explain it in more detail:

the nydus artifact manifest is identified by the application/vnd.cncf.nydus.v1 artifact type;
each nydus image has a bootstrap layer (a metadata pack for the container rootfs file system) and one or more blob layers (compressed/chunked container rootfs data). We use two new media types (application/vnd.cncf.nydus.bootstrap.v1.tar+gzip and application/vnd.cncf.nydus.blob.v1) to describe them;
blobs of both nydus media types can be refcounted just like other media types;
the difference between the two new nydus media types is that a container runtime only needs to pull a small nydus bootstrap before starting a container, while the nydus blobs can be fetched from the registry in a deferred, on-demand manner;
it has a reference relationship with the original mysql:8 image, which can be persisted either within the same registry or in a different registry.

With such an artifact manifest,

At the registry side, it can list/show a container image together with its nydus accelerated image.
At container runtime side, when given a nydus artifact manifest as a container image, it can choose to either pull the nydus image to start container quickly, or just pull the original container image if nydus components are not available.
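The runtime-side choice could look roughly like the sketch below. It is not nydus or containerd code: `plan_pull` and its return shape are made up for illustration, the manifest follows the amended example above, and the digests are shortened placeholders.

```python
NYDUS_TYPE = "application/vnd.cncf.nydus.v1"
BOOTSTRAP = "application/vnd.cncf.nydus.bootstrap.v1.tar+gzip"
RELATIONSHIP = "oci.distribution.relationship"


def plan_pull(manifest, nydus_available):
    """Decide what a runtime fetches up front and what it can defer."""
    if manifest.get("artifactType") == NYDUS_TYPE and nydus_available:
        eager = [b["digest"] for b in manifest["blobs"] if b["mediaType"] == BOOTSTRAP]
        deferred = [b["digest"] for b in manifest["blobs"] if b["mediaType"] != BOOTSTRAP]
        return {"mode": "nydus", "pull_now": eager, "pull_on_demand": deferred}
    # Fall back to the original image the artifact references.
    originals = [
        m["digest"] for m in manifest.get("manifests", [])
        if m.get("annotations", {}).get(RELATIONSHIP) == "references"
    ]
    return {"mode": "original", "pull_now": originals, "pull_on_demand": []}


nydus_manifest = {
    "artifactType": NYDUS_TYPE,
    "blobs": [
        {"mediaType": BOOTSTRAP, "digest": "sha256:bootstrap"},
        {"mediaType": "application/vnd.cncf.nydus.blob.v1", "digest": "sha256:blob-1"},
        {"mediaType": "application/vnd.cncf.nydus.blob.v1", "digest": "sha256:blob-2"},
    ],
    "manifests": [
        {"digest": "sha256:mysql-image", "annotations": {RELATIONSHIP: "references"}},
    ],
}

print(plan_pull(nydus_manifest, nydus_available=True)["pull_now"])   # ['sha256:bootstrap']
print(plan_pull(nydus_manifest, nydus_available=False)["pull_now"])  # ['sha256:mysql-image']
```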

reasonerjt · 2021-02-03T13:25:22Z

artifact-manifest/artifact-manifest.md

+
+### Helm Charts & CNAB
+
+A Helm chart can represent the images it references within the chart. These references are loose references as they may be persisted in different registries, or may change as a values file is updated. However, the chart may also be persisted together as a collection of artifacts in a registry. The lines are dotted to represent the loose reference. Deleting the `wordpress-chart:v5` may, or may not delete the images as the images have value unto themselves.


Does it mean the manifest of the chart in this PR only reflects the case when the images and chart are persisted as a collection and the images are NOT loosely referenced?

This is an area of churn, and I'm going to create two options for how we can support the two distinct scenarios.
A Notary v2 signature and an SBoM are enhancements to an existing artifact. They depend upon the thing they're enhancing to have meaning. When the artifact they enhance is deleted, these artifacts would also be deleted.

However, in the Helm case, a helm chart can reference other images, which may, or may not be stored in the same registry. The combination of the digest and artifact name are ways to identify it as a unique entity (digest) or stable/updateable tag. The registry will store these values in the manifest, but won't do anything, unless a client tells it to do something.

For instance, compared to the Notary & SBoM example above, a helm chart can be copied to another registry or deleted without impacting the images they reference.
However, the oci-reg CLI (imagined) would be able to read the manifest and do a pull and/or copy of the helm chart and its referenced images. If the client says use-digest, a registry can identify the image by its digest as it's unique, regardless of where in the registry it lives. If it says use-tag, it would need a registry.config to assist in finding that tag.
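Since oci-reg and registry.config are both imagined, the following is only a guess at how the two modes might differ on the client. The `resolve` function, the mode strings as Python values, and the config shape (artifact name mapped to a repository location) are all assumptions; `registry.example.com` is a placeholder.

```python
def resolve(reference, mode, config=None):
    """Turn a loose reference into something a client could pull."""
    if mode == "use-digest":
        # The digest is unique, so no location mapping is needed.
        return reference["digest"]
    if mode == "use-tag":
        name, _, tag = reference["artifact"].rpartition(":")
        if config is None or name not in config:
            raise LookupError(f"no location configured for {name!r}")
        return f"{config[name]}:{tag}"
    raise ValueError(f"unknown mode {mode!r}")


reference = {
    "artifact": "mysql:8",
    "digest": "sha256:3c3a4604a545cdc127456d94e421cd355bca5b528f4a9c1905b15da2eb4a4c6b",
}
registry_config = {"mysql": "registry.example.com/mirror/mysql"}  # hypothetical mapping

print(resolve(reference, "use-digest"))
print(resolve(reference, "use-tag", registry_config))  # registry.example.com/mirror/mysql:8
```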

reasonerjt · 2021-02-03T13:31:53Z

@SteveLasker For some of the artifacts such as CNAB or Helm charts (v3), there are tools to package them as OCI artifacts and store them in OCI registries, such as helm v3, so does this PR mean to introduce a breaking change to the existing artifacts?

reasonerjt · 2021-02-03T13:42:48Z

artifact-manifest/artifact-manifest.md

+      "size": 16724
+    }
+  ],
+  "manifests": [


Does it assume the referenced manifest is persisted in the same registry?
This breaks a very common usage pattern where the referenced images are set in the values.yaml of a chart, and the deployer of the chart sets the images when they issue helm install.

The idea is they can be stored in the same registry and the registry can understand the references generically. This enables oci-reg copy scenarios and optional oci-reg delete --with-references scenarios. But you're correct that this information is duplicated in the helm chart, as this artifact.manifest didn't (and still doesn't) exist yet. If/when this artifact.manifest gets adopted, would a combination of this manifest reference and oci-reg.config enable helm charts to evolve so that the chart references the thing in the manifest? In theory, this would make it far easier to move charts across registries without having to change the chart.

SteveLasker

Based on feedback around the types of collections, I've incorporated the following into the latest:

Changed dependencies to manifests. This represents an oci.artifact.manifest to have "dependencies" on a collection of blobs, a collection of manifests, and an optional config blob.
Added A/B options for splitting out the references.
- OPTION A uses the manifests collection with an annotation: "oci.distribution.relationship": "depends-on". The pros are this is a single manifests collection. The cons are it's not a great design for a registry to read an annotation for managing garbage collection of core components.
- OPTION B creates a second references collection of manifests. All descriptors in this collection are loose/weak references that may not be resolved in the registry, and they are not subject to garbage collection. A client MAY delete-references, as noted in the oci-reg cli example.
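To make the trade-off between the two options concrete, here is a sketch of how a registry might work out which manifests it must keep alive under each encoding. The helper names and shortened digests are hypothetical; the annotation key and collection names are the ones used in this PR.

```python
RELATIONSHIP = "oci.distribution.relationship"


def hard_refs_option_a(manifest):
    """Option A: one collection; the registry must read annotations to find hard refs."""
    return [
        entry["digest"] for entry in manifest.get("manifests", [])
        if entry.get("annotations", {}).get(RELATIONSHIP) == "depends-on"
    ]


def hard_refs_option_b(manifest):
    """Option B: everything in manifests is hard; the references collection is never consulted."""
    return [entry["digest"] for entry in manifest.get("manifests", [])]


# The same logical artifact (one hard dependency, one loose reference) encoded both ways.
option_a = {
    "manifests": [
        {"digest": "sha256:signed-image", "annotations": {RELATIONSHIP: "depends-on"}},
        {"digest": "sha256:related-image", "annotations": {RELATIONSHIP: "references"}},
    ],
}
option_b = {
    "manifests": [{"digest": "sha256:signed-image"}],
    "references": [{"digest": "sha256:related-image"}],
}

print(hard_refs_option_a(option_a))  # ['sha256:signed-image']
print(hard_refs_option_b(option_b))  # ['sha256:signed-image']
```

Both encodings yield the same set, but only Option A makes garbage collection depend on an optional annotations section.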

SteveLasker · 2021-02-04T02:44:25Z

@SteveLasker For some of the artifacts such as CNAB or Helm charts (v3), there are tools to package them as OCI artifacts and store them in OCI registries, such as helm v3, so does this PR mean to introduce a breaking change to the existing artifacts?

You could think of it as a new version. Let's say a new version of CNAB and/or Helm could use this new manifest, but that's a choice for these communities to make. As with anything that's already shipped, with limitations, it's always a choice whether a change provides enough value. It's my hope the references solve enough of the limitations of Artifacts v1 (based on the image-manifest) to make it worth the change, and that it's tooled in such a way that it adds enough value with minimal breaking change implications.

josephschorr · 2021-02-06T22:55:36Z

@SteveLasker Overall looks good. My one question/slight concern is around references: With above (unless I'm mistaken) it would be possible to form a chain of dependent manifests as large or as long as clients specify. Do we have any concerns about this proving hard to mirror? The one major benefit of the current manifest list design is that it is (relatively) flat: a tag points to a single list and the list points to a set of manifests, but it cannot go beyond that.

What is the expected direction for tooling that will need to mirror the current manifests? Walk the manifests to determine the full set to replicate?

jonjohnsonjr · 2021-02-07T00:30:24Z

The one major benefit of the current manifest list design is that it is (relatively) flat: a tag points to a single list and the list points to a set of manifests, but it cannot go beyond that.

This is not the case:

From https://github.com/opencontainers/image-spec/blob/master/image-index.md#image-index-property-descriptions

This descriptor property has additional restrictions for manifests. Implementations MUST support at least the following media types:

application/vnd.oci.image.manifest.v1+json

Also, implementations SHOULD support the following media types:

application/vnd.oci.image.index.v1+json (nested index)

Image indexes concerned with portability SHOULD use one of the above media types. Future versions of the spec MAY use a different mediatype (i.e. a new versioned format). An encountered mediaType that is unknown to the implementation MUST be ignored.

And the diagram from https://github.com/opencontainers/image-spec/blob/master/media-types.md#relations

sudo-bmitch · 2021-02-07T00:41:20Z

@josephschorr

What is the expected direction for tooling that will need to mirror the current manifests? Walk the manifests to determine the full set to replicate?

My own intention is to support recursive copies with a user configurable max depth on the recursion. And if the max depth is hit, send back a warning or error.

I'd also want to handle the types of artifacts differently (allowing users to filter in/out). For example, they may want to mirror images and notary signatures that point to those images, but may not be interested in mirroring helm charts that point to the image (along with everything else that helm chart points to).

And I'd want some directionality on the references for mirroring, e.g. helm charts may mirror child images, but child images don't mirror parent helm charts.

That assumes it makes sense to link helm charts and images, which I'm still uncertain of (charts can template an image name, and artifacts don't template a reference).
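A minimal sketch of that kind of client-side walk, assuming an in-memory `store` that maps digests to manifests. `plan_copy`, `DepthExceeded`, the store shape, and the `vnd.example.*` artifact types are illustrative names, not part of any existing tool. It follows child links only, so reverse lookups such as finding the signatures that point at an image are left out.

```python
class DepthExceeded(Exception):
    """Raised when the reference chain is deeper than the user allows."""


def plan_copy(store, root, max_depth, skip_types=()):
    """Walk child references from root and return the digests to copy, in visit order."""
    planned, seen = [], set()

    def walk(digest, depth):
        if digest in seen:
            return
        if depth > max_depth:
            raise DepthExceeded(f"{digest} is deeper than max depth {max_depth}")
        seen.add(digest)
        planned.append(digest)
        for child in store[digest].get("manifests", []):
            # Users can filter out artifact types they do not want mirrored.
            if store[child]["artifactType"] in skip_types:
                continue
            walk(child, depth + 1)

    walk(root, 0)
    return planned


store = {
    "chart": {"artifactType": "application/vnd.example.chart", "manifests": ["image"]},
    "image": {"artifactType": "application/vnd.example.image", "manifests": []},
}
print(plan_copy(store, "chart", max_depth=1))  # ['chart', 'image']
print(plan_copy(store, "image", max_depth=1))  # ['image']

deep = {
    "a": {"artifactType": "application/vnd.example.image", "manifests": ["b"]},
    "b": {"artifactType": "application/vnd.example.image", "manifests": ["c"]},
    "c": {"artifactType": "application/vnd.example.image", "manifests": []},
}
try:
    plan_copy(deep, "a", max_depth=1)
    too_deep = False
except DepthExceeded:
    too_deep = True
print(too_deep)  # True
```

Copying the chart brings the image along, while copying the image does not pull in the chart, which matches the directionality described above.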

josephschorr · 2021-02-07T01:04:46Z

This is not the case:

Right... I forgot that it was intended that manifest lists could reference other lists. We just never did so because they were (practically speaking) never used.

> My own intention is to support recursive copies with a user configurable max depth on the recursion. And if the max depth is hit, send back a warning or error.

Perhaps we should formalize this a bit then? I could see a scenario where someone pushes a long chain of manifests to a repository and then, when the repository is mirrored, the mirror fails at that point. Unsure if it would be better to fail at push time, though.

sudo-bmitch · 2021-02-07T01:49:28Z

> > My own intention is to support recursive copies with a user configurable max depth on the recursion. And if the max depth is hit, send back a warning or error.
>
> Perhaps we should formalize this a bit then? I could see a scenario where someone pushes a long chain of manifests to a repository and then, when the repository is mirrored, the mirror fails at that point. Unsure if it would be better to fail at push time, though.

The tooling I'm working with is strictly client side, so a server decision to fail would be separate. I could see this being enforced by the registry, similar to how many registries enforce user namespaces as the first part of the repository path, which is separate from the spec. I'd be interested in the HTTP response codes when the server refuses to accept an artifact for reasons like this.

SteveLasker · 2021-02-10T02:53:20Z

I've taken a ton of great feedback that we need more bake-time on the references collection and scenarios, including the registry/repo mapping conversation.

I'll be closing this PR once I revert a few things. I ask folks to please focus on #29 for feedback, as it has the core link-list of manifests required for Notary v2, SBoM and other linked artifact scenarios like IBM and Google signing solutions.

SteveLasker · 2021-02-10T21:52:40Z

> What is the expected direction for tooling that will need to mirror the current manifests? Walk the manifests to determine the full set to replicate?

I think we have to first define "mirror". Is a mirror at a registry/repo level? Meaning, whatever is in a given repo is mirrored?
Or are we talking about gated mirrors, where the user opts into specific content, likely at the :tag level?
If at the repo, I suspect the client would pull all content in that repo and keep it current. Events, or polling a list API (hopefully with a changedSince type parameter), would work.
If at the tag, then it could walk the references and the artifacts that have the target tag referenced in the manifests collection.
Since manifest references must be in the same repo, it's less of a concern, as the repo or the dependencies can be walked.
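A minimal sketch of the tag-level case, under stated assumptions: the repository is modeled as a plain dict, each manifest may carry a `manifests` list of descriptors, and `tag_mirror_set` is a name made up for this example. It covers only the inbound direction (artifacts in the same repo that reference the tagged manifest, directly or transitively), not the outbound walk.

```python
def tag_mirror_set(repo, tag):
    """Return the digests to copy when mirroring a single tag.

    repo: {"tags": {tag: digest}, "manifests": {digest: manifest_dict}}
    Each manifest_dict may carry a "manifests" list of descriptors ({"digest": ...}).
    """
    target = repo["tags"][tag]
    result = [target]
    queue = [target]
    while queue:
        current = queue.pop(0)
        # Scan the repo for manifests that reference the current digest.
        for digest, manifest in sorted(repo["manifests"].items()):
            refs = [d["digest"] for d in manifest.get("manifests", [])]
            if current in refs and digest not in result:
                result.append(digest)
                queue.append(digest)
    return result


# Hypothetical repo content; digests are readable placeholders.
repo = {
    "tags": {"v5": "sha256:image"},
    "manifests": {
        "sha256:image": {"manifests": []},
        "sha256:sig": {"manifests": [{"digest": "sha256:image"}]},
        "sha256:sbom": {"manifests": [{"digest": "sha256:image"}]},
        "sha256:other": {"manifests": []},
    },
}

to_copy = tag_mirror_set(repo, "v5")
```

Because references stay within one repo, a full scan of that repo's manifests is enough here; a real registry would presumably need an index or a list API instead.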

> Perhaps we should formalize this a bit then?

I worry about formalizing a dependency count. In some cases a limit makes sense, like the 256 registry/namespace character limit. But npm and other package managers have more or less dealt with this. I see this as a client configuration scenario, as it seems hard to know upfront whether the dependencies are circular and closed, or endless. I suspect a registry throttling scenario would solve this, but I'd have to think more. With references on hold, I'm saving my brain cycles on how best to process this until later.

> The tooling I'm working with is strictly client side, so a server decision to fail would be separate. I could see this being enforced by the registry, similar to how many registries enforce user namespaces as the first part of the repository path, which is separate from the spec. I'd be interested in the HTTP response codes when the server refuses to accept an artifact for reasons like this.

This ^ sounds like something to consider as we revisit this scenario. How do package managers like PyPI, npm, ... manage these types of scenarios?

SteveLasker · 2021-02-10T21:53:47Z

I'm closing this one as it's no longer the active conversation. See #29 for the more focused, iterative approach. Happy to continue the background conversation here if folks want to keep thinking about it.
I've reverted the changes to the latest thinking on a references collection for weak references, supporting a dependency graph.

SteveLasker added 5 commits January 14, 2021 15:53

Staged content for artifact-manifest

e83eedb

Signed-off-by: Steve Lasker <[email protected]>

Staged content for artifact.manifest

7d44093

Signed-off-by: Steve Lasker <[email protected]>

Add artifact types

0e2c607

Signed-off-by: Steve Lasker <[email protected]>

Repo listing examples

d73291f

Cleanup OCI Artifacts descriptions

25b008c

Signed-off-by: Steve Lasker <[email protected]>

SteveLasker force-pushed the artifact-manifest branch from 874d36c to c3b001e Compare January 25, 2021 22:54

Add oci-reg mapping examples

d3e24fb

Signed-off-by: Steve Lasker <[email protected]>

SteveLasker force-pushed the artifact-manifest branch from c3b001e to d3e24fb Compare January 25, 2021 22:56

SteveLasker mentioned this pull request Jan 26, 2021

Initial proposal for changes to the OCI specification notaryproject/notation#30

Closed

Cleanup config and dependencies examples

b13f228

Signed-off-by: Steve Lasker <[email protected]>

nishakm reviewed Jan 29, 2021


jonjohnsonjr reviewed Jan 29, 2021


SteveLasker added 2 commits February 2, 2021 15:51

Converge dependencies & references to manifests

77cc3be

Signed-off-by: Steve Lasker <[email protected]>

Converge dependencies & references to manifests

8760ac5

Signed-off-by: Steve Lasker <[email protected]>

shizhMSFT reviewed Feb 3, 2021


reasonerjt reviewed Feb 3, 2021


A/B examples of manifests & references collection

a56aaad

Signed-off-by: Steve Lasker <[email protected]>

SteveLasker commented Feb 4, 2021


SteveLasker mentioned this pull request Feb 5, 2021

Added encryption mediatype doc #15

Closed

SteveLasker changed the title ~~OCI Artifact Manifest~~ OCI Artifact Manifest - with weak reference support Feb 10, 2021

SteveLasker force-pushed the artifact-manifest branch from ac5e8fa to a56aaad Compare February 10, 2021 21:36

SteveLasker closed this Feb 10, 2021

SteveLasker mentioned this pull request Feb 26, 2021

add nydus image artifact SteveLasker/artifacts#3

Closed

SteveLasker mentioned this pull request Mar 9, 2021

OCI artifact manifest, Phase 1-Reference Types #29

Closed

SteveLasker mentioned this pull request May 20, 2021

Add Index support for artifact type #25

Closed

SteveLasker mentioned this pull request Jun 10, 2021

Proposal: Add References opencontainers/image-spec#827

Closed


SteveLasker mentioned this pull request Aug 26, 2021

Question: Cross Repo References oras-project/artifacts-spec#26

Closed

SteveLasker mentioned this pull request Oct 20, 2021

Proposal: Working Group for Reference Types opencontainers/tob#96

Closed

SteveLasker mentioned this pull request Nov 29, 2021

Allow for rejection of image indexes with missing references opencontainers/distribution-spec#310

Merged

SteveLasker mentioned this pull request Apr 29, 2023

garbage-collection clarification needed opencontainers/distribution-spec#406

Open

		@@ -0,0 +1,462 @@
		# OCI Artifact Manifest

		The OCI artifact manifest provides a means to define a wide range of artifacts, including a chain of dependencies of related artifacts. It provides a means to define multiple collections of types, including blobs, dependent artifacts and referenced artifacts.


		To support artifact movement to various registry and namespace structures, the registry and path must not be embedded within the artifact definition. Client CLIs and configurations will provide default locations and mappings for where to find the referenced content.

		Artifacts that reference other artifacts must include an OCI Artifact Descriptor which includes the `manifest type`, `digest`, `size` and `repo:tag` of the artifact. The descriptor, however, defers resolution of the reference to client tools, which MAY reconstitute the references from multiple repositories and/or registries.


		![notary v2 signature](media/notaryv2-signature.svg)

		The Notary v2 signature would reference an artifact, such as the `wordpress:v5` image above. Notice the directionality of the references. One or more signatures may be added to a registry after the image was persisted. While an image knows of its layers, and a Notary v2 signature knows of its config and blob, the Notary v2 signature declares a dependency to the artifact it's signing. The visualization indicates the references through solid lines, as these reference types are said to be hard references. Just as the layers of an OCI Image are deleted (ref-counted -1) when the image is deleted, the blobs of a signature are deleted (ref-counted -1) when the signature is deleted. Likewise, when an artifact is deleted, the signature would be deleted (ref-counted -1), as the signatures have no value without the artifact they are signing.


		### Helm Charts & CNAB

		A Helm chart can represent the images it references within the chart. These references are loose references, as they may be persisted in different registries, or may change as a values file is updated. However, the chart may also be persisted together as a collection of artifacts in a registry. The lines are dotted to represent the loose reference. Deleting the `wordpress-chart:v5` may or may not delete the images, as the images have value unto themselves.
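The deletion behavior described in the draft text above can be sketched as a toy in-memory registry. This is one interpretation, not the proposed implementation: the class `Registry`, the `hard` and `loose` field names, and the blob names are all invented for illustration, deleting an artifact is assumed to delete its hard dependents immediately, and loose references are assumed to have no effect on deletion.

```python
class Registry:
    """Toy model of ref-counted blobs plus hard and loose references."""

    def __init__(self):
        self.blob_refs = {}  # blob digest -> reference count
        self.manifests = {}  # name -> {"blobs": [...], "hard": [...], "loose": [...]}

    def push(self, name, blobs=(), hard=(), loose=()):
        for blob in blobs:
            self.blob_refs[blob] = self.blob_refs.get(blob, 0) + 1
        self.manifests[name] = {
            "blobs": list(blobs), "hard": list(hard), "loose": list(loose),
        }

    def delete(self, name):
        manifest = self.manifests.pop(name, None)
        if manifest is None:
            return
        # Ref-count -1 on each blob; drop blobs nobody references any more.
        for blob in manifest["blobs"]:
            self.blob_refs[blob] -= 1
            if self.blob_refs[blob] == 0:
                del self.blob_refs[blob]
        # Artifacts with a hard dependency on `name` (e.g. signatures) go too.
        dependents = [n for n, m in self.manifests.items() if name in m["hard"]]
        for dependent in dependents:
            self.delete(dependent)
        # Loose references are deliberately ignored here.


reg = Registry()
reg.push("wordpress:v5", blobs=["layer1", "layer2"])
reg.push("signature", blobs=["sig-blob"], hard=["wordpress:v5"])
reg.push("wordpress-chart:v5", blobs=["chart-blob"], loose=["wordpress:v5"])

# Deleting the image removes its layers and its signature; the chart survives.
reg.delete("wordpress:v5")
```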
