FR: crane upload #934

Open
jonjohnsonjr opened this issue Feb 4, 2021 · 6 comments
@jonjohnsonjr
Collaborator

I'd like a new subcommand that uploads blobs.

I need to write up more about this, but crane hits a really sweet spot: for most things, it does as little as possible to implement container-specific details, then gets out of the way. crane export and crane auth are great examples of this. So are crane {manifest,blob,ls,catalog,delete,digest,tag}.

There are also higher level things that are often useful, but I really like the lower-level things, because they make it possible to do things as a bash one-liner that otherwise require loading up an IDE and pulling in a ton of dependencies.

Uploading blobs is a huge missing piece.

strawman

$ crane upload reg.example.com/my-repo < some-file
sha256:db98fc6f11f08950985a203e07755c3262c680d00084f601e7304b768c83b3b1

The output is the digest of the uploaded thing.

output

Several modes of output would be useful:

A mode that prints the digest as a ref (maybe -q for "qualify?"), for use with crane blob or perhaps crane append:

$ crane upload -q reg.example.com/my-repo < some-file
reg.example.com/my-repo@sha256:db98fc6f11f08950985a203e07755c3262c680d00084f601e7304b768c83b3b1

# Should be a no-op (assuming this doesn't overwrite some-file before we read it):
$ crane blob $(crane upload -q reg.example.com/my-repo < some-file) > some-file

A mode that prints a descriptor, which should include the size:

$ crane upload reg.example.com/my-repo < some-file | jq .
{
  "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
  "size": 843,
  "digest": "sha256:db98fc6f11f08950985a203e07755c3262c680d00084f601e7304b768c83b3b1"
}

(Note the | jq . to pretty-print it -- this should be one line of output, normally.)
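For comparison, the same three fields can already be computed locally with standard tools. This is only a sketch of what such a descriptor contains, not how crane would produce it: describe_blob is a hypothetical helper, it does not upload anything, and it assumes the media type needs no JSON escaping.

```shell
#!/bin/sh
# describe_blob FILE MEDIA_TYPE
# Hypothetical helper (not part of crane): prints a one-line descriptor
# for a local file. Nothing is uploaded.
describe_blob() {
  file=$1
  media_type=$2
  # sha256sum prints "<hex>  -" for stdin; keep only the hex digest.
  hex=$(sha256sum < "$file" | cut -d' ' -f1)
  # wc -c counts bytes; tr strips padding that some implementations add.
  size=$(wc -c < "$file" | tr -d ' ')
  # Assumes media_type contains no characters that need JSON escaping.
  printf '{"mediaType":"%s","size":%s,"digest":"sha256:%s"}\n' \
    "$media_type" "$size" "$hex"
}

# Usage: describe a small temporary file.
tmp=$(mktemp)
printf 'hello\n' > "$tmp"
describe_blob "$tmp" application/octet-stream
rm -f "$tmp"
```

The output is a single line, so piping it through jq . pretty-prints it in the same way as the example above.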

flags

There is plenty of opportunity for flags and bikeshedding of those flags.

For example, should compression be on or off by default? Given that it's possible to do this:

$ gzip -c some-file | crane upload reg.example.com/my-repo

I don't see a reason for it? Perhaps a -z flag as a convenience, but if you want to set a specific compression level, that's up to you.

unsolved

diffid

I'm not sure how to represent diffid here, for things that get compressed.

The problem is two-fold:

  1. We don't really have a way to persist metadata like this. If we want to be able to do something interesting with a blob, it would be nice if we didn't have to download the whole thing, check to see if it's gzipped, and hash it to compute the diffid.
  2. How do we actually expose this in a composable way?

Some ideas...

If we are printing a descriptor and we're compressing it, add the diffid as an annotation. We can have other tools understand this annotation when reconstructing a config file (like in pkg/v1/mutate).

$ crane upload -z reg.example.com/my-repo < some-file | jq .
{
  "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
  "size": 843,
  "digest": "sha256:db98fc6f11f08950985a203e07755c3262c680d00084f601e7304b768c83b3b1",
  "annotations": {
      "dev.ggcr.crane.diffid": "sha256:dbf2c0f42a39b60301f6d3936f7f8adb59bb97d31ec11cc4a049ce81155fef89"
   }
}

If we're not printing a descriptor, we could just output two lines, where the first is digest and the second diffid:

$ crane upload reg.example.com/my-repo -z < some-file
sha256:db98fc6f11f08950985a203e07755c3262c680d00084f601e7304b768c83b3b1
sha256:dbf2c0f42a39b60301f6d3936f7f8adb59bb97d31ec11cc4a049ce81155fef89

Or we could have an output file?

$ crane upload --diffid some-file-diffid.sha256 reg.example.com/my-repo -z < some-file
sha256:db98fc6f11f08950985a203e07755c3262c680d00084f601e7304b768c83b3b1

Or just tell people to manually run sha256sum before uploading? That seems reasonable enough to me...
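A minimal sketch of that manual route, assuming GNU coreutils and gzip are available. The diffid is the hash of the uncompressed bytes and the digest is the hash of the compressed bytes; the file contents here are made up for illustration.

```shell
#!/bin/sh
# Sketch: compute diffid and digest by hand before uploading.
workdir=$(mktemp -d)
printf 'example layer contents\n' > "$workdir/some-file"

# diffid: sha256 of the uncompressed bytes.
diffid="sha256:$(sha256sum < "$workdir/some-file" | cut -d' ' -f1)"

# -n leaves the file name and timestamp out of the gzip header, so the
# compressed bytes (and therefore the digest) do not change between runs.
gzip -n -c "$workdir/some-file" > "$workdir/some-file.gz"

# digest: sha256 of the compressed bytes, i.e. what the registry stores.
digest="sha256:$(sha256sum < "$workdir/some-file.gz" | cut -d' ' -f1)"

echo "diffid: $diffid"
echo "digest: $digest"

rm -rf "$workdir"
```

The compressed file is what would then be fed to the proposed crane upload on stdin, and the digest it prints should match the one computed here.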

media type

If we're printing a descriptor, I think it's safe to default to the docker layer media types, compressed or not (we should probably sniff the contents to see whether they are gzipped).

However, we might want to set an arbitrary media type, so that should perhaps just be a flag? Let's say I want to upload some json. The registry doesn't care here, actually, but if we want to compose this with another tool, it would be useful to have an accurate descriptor.

$ crane upload -m "application/json" reg.example.com/my-repo < config.json | jq .
{
  "mediaType": "application/json",
  "size": 809,
  "digest": "sha256:246c399c7f7b05e2a241ce0771456bee9eaa61d5015997237c920d69fd024443"
}
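The sniffing idea for the default media type could look roughly like the sketch below. default_media_type is a hypothetical helper, not an existing crane function; it only checks for the two gzip magic bytes (0x1f 0x8b) and falls back to the uncompressed docker layer type, which is assumed here to be application/vnd.docker.image.rootfs.diff.tar.

```shell
#!/bin/sh
# default_media_type FILE
# Hypothetical helper (not part of crane): guess a default layer media
# type by checking whether the file starts with the gzip magic bytes.
default_media_type() {
  # od prints the first two bytes as hex, e.g. " 1f 8b"; tr removes spacing.
  magic=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
  if [ "$magic" = "1f8b" ]; then
    echo "application/vnd.docker.image.rootfs.diff.tar.gzip"
  else
    # Assumed media type for an uncompressed docker layer.
    echo "application/vnd.docker.image.rootfs.diff.tar"
  fi
}

# Usage: compare a plain file with its gzipped copy.
tmp=$(mktemp)
printf 'plain bytes\n' > "$tmp"
default_media_type "$tmp"
gzip -c "$tmp" > "$tmp.gz"
default_media_type "$tmp.gz"
rm -f "$tmp" "$tmp.gz"
```

An explicit media type flag would simply override whatever this guess returns.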
@imjasonh
Collaborator

imjasonh commented Feb 5, 2021

A mode that prints the digest as a ref (maybe -q for "qualify?"), for use with crane blob or perhaps crane append:

-q always means --quiet to me, maybe --full (no shorthand)?

I don't see a reason for [a zip flag]?

Neither do I, at least not right away. Documenting how to compress before uploading seems fine, and frankly more flexible for the user.

the first is digest and the second diffid:

I can tell you already I won't remember which is which without looking it up. 😄

Or just tell people to manually run sha256sum before uploading?

When in doubt, making users take extra steps (that we document) using standard tools feels better than adding in behavior we're not sure we'll need.

media type

Having a flag for this makes sense. It doesn't have to be included in the initial release, so long as we have a TODO/issue to track adding it so we remember.


This surface would add a lot of new workflows to crane, which is great, but it means we need good docs and examples (incl. e2e tests?) that demonstrate how it works, when you might want to use it, etc. Crane docs are a bit light now, mostly focused on generated CLI docs, and the top-level README.

@jonjohnsonjr
Collaborator Author

but it means we need good docs and examples

I've started working on this and will try to send a PR soon-ish. My goal is to just have a collection of one-liners that demonstrate crane's flexibility, because that's the most effective form of documentation for me, personally.

@jonjohnsonjr
Collaborator Author

maybe --full (no shorthand)?

Perhaps just --ref? Short and explicit.

When in doubt...

Cool, I think we're on the same page. I mostly want this for just uploading arbitrary bytes and not for appending stuff to an image, so I don't actually need the diffid for what I care about.

@github-actions

github-actions bot commented May 7, 2021

This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.

@tianon
Contributor

tianon commented May 19, 2023

I sent Jon "crane push-blob that gives me back a descriptor" and he sent me back this issue link (so here I am, dutifully commenting that such a thing is interesting to at least one other person besides Jon 😂)

@tianon
Contributor

tianon commented May 19, 2023

On the questions posed, I'd suggest starting with no flags: if you want compression or a diffid, you do or calculate them yourself beforehand, and if the command defaults to no compression it's "easy" to add more optional behavior later without affecting any existing users.
