
kubernetes-csi: promote existing releases #934

Merged: 3 commits into kubernetes:master on Jun 3, 2020

Conversation

pohly (Contributor) commented Jun 2, 2020

These images were previously published on quay.io.

This is the first time that we are using the image promoter and we still had to define approvers.

k8s-ci-robot added labels on Jun 2, 2020: cncf-cla: yes (the PR's author has signed the CNCF CLA), size/M (changes 30-99 lines, ignoring generated files), area/artifacts (hosting of release artifacts for subprojects)
pohly (Contributor, Author) commented Jun 2, 2020

@msau42: the images.yaml file was created with this script:

#! /bin/sh

# List of repos under https://console.cloud.google.com/gcr/images/k8s-staging-csi/GLOBAL
repos="
csi-attacher
csi-node-driver-registrar
csi-provisioner
csi-resizer
csi-snapshotter
hostpathplugin
livenessprobe
mock-driver
nfs-provisioner
snapshot-controller
"

for repo in $repos; do
    echo "- name: $repo"
    echo "  dmap:"
    # One dmap entry per tag: the image digest as the key, the list of tags as the value.
    gcloud container images list-tags gcr.io/k8s-staging-csi/$repo --format='get(digest, tags)' --filter='tags~v.* AND NOT tags~v2020.*' |
        sed -e 's/\([^ ]*\)\t\(.*\)/    "\1": [ "\2" ]/'
done
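
Running this prints one images.yaml entry per repo to stdout, roughly in this shape (the digest is shortened here purely for illustration):

- name: csi-attacher
  dmap:
    "sha256:<digest>": [ "v2.2.0" ]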

I am not sure how you want to handle this in the future; if this script would be useful, then I can add it as k8s.gcr.io/images/k8s-staging-csi/generate.sh to this repo.

Beware that I have not verified whether the hashes match the ones from quay.io. I am assuming that whoever pushed to staging hasn't made a mistake or maliciously pushed different images.

pohly (Contributor, Author) commented Jun 2, 2020

/hold

I think it makes sense to promote these images, but let's hear from @msau42 whether she agrees.

k8s-ci-robot added the do-not-merge/hold label (the PR should not merge because someone has issued a /hold command) on Jun 2, 2020
pohly (Contributor, Author) commented Jun 2, 2020

> Beware that I have not verified whether the hashes match the ones from quay.io. I am assuming that whoever pushed to staging hasn't made a mistake or maliciously pushed different images.

I think we can trust @dims: kubernetes/kubernetes#91257

😁

dims (Member) commented Jun 2, 2020

LOL, I am 100% legit :)

More seriously, here's the script I used to move things over: https://gist.github.com/dims/0aef844c22af80ed3637745071c8854c

msau42 (Member) commented Jun 2, 2020

I don't think we should promote these images that were copied over from quay because they were not built using the same infra. Also, we shouldn't promote the "-rc*" images which are not meant for public consumption.

pohly (Contributor, Author) commented Jun 2, 2020

> I don't think we should promote these images that were copied over from quay because they were not built using the same infra.

Does that matter? We expect consumers of those images to trust that they contain what we claim, regardless of how they were built or where they are currently hosted.

The advantage of promoting the images is that consumers get a choice between pulling from the potentially flaky quay.io and the (hopefully) more reliable gcr.io; that's my motivation for this PR. We still see test failures from intermittent image pull errors for the sidecars from quay.io. We could do the same as the Kubernetes CI and pull from the staging area, but that feels wrong.

msau42 (Member) commented Jun 2, 2020

Hosting an image in k8s infra means that we've used k8s infra to build them. We can't guarantee that with the images we temporarily copied from quay.

If we don't want to wait until we release the next set of sidecars to use this, then can we manually trigger building and pushing these tags in cloudbuild (and update k8s CI to not use any "rc" tags)?
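
(For illustration only, not the actual kubernetes-csi build setup: a manual one-off Cloud Build run for an existing tag could look roughly like the sketch below; the cloudbuild.yaml path, the project, and the _GIT_TAG substitution key are assumptions.)

# Hypothetical manual rebuild of a single tag via Cloud Build; the config
# path, project, and substitution key are assumptions for illustration.
gcloud builds submit . \
    --project k8s-staging-csi \
    --config cloudbuild.yaml \
    --substitutions _GIT_TAG=v1.3.0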

justaugustus (Member) commented:

> Hosting an image in k8s infra means that we've used k8s infra to build them. We can't guarantee that with the images we temporarily copied from quay.

Not exactly. AFAIK, it means that an image was pushed to staging and promoted to prod by some trusted set of people. There are likely a few subprojects that have a staging project and don't yet use GCB for building/pushing.

> If we don't want to wait until we release the next set of sidecars to use this, then can we manually trigger building and pushing these tags in cloudbuild (and update k8s CI to not use any "rc" tags)?

Agreed that you probably don't need the RC tags, but I don't think the images should be rebuilt.

For context, I've backfilled a few images for RelEng.
Here's one of those PRs: #835

I'd suggest backfilling one or two of the most recent images; we did this to ensure that people who might be changing their registry endpoint could move to K8s Infra w/o necessarily needing to move to a newer image at the same time.

One thing to be careful of, as @listx mentions:

> @justaugustus Your backfill script should use gcloud container images add-tag in the future to avoid pulling/pushing with docker. Using docker risks changing the digests if your workstation's gzip lib is different than the one used by GCR. I just did a check now and indeed the digest changed, for example from gcr.io/google-containers/debian-base@sha256:ebda8587ec0f49eb88ee3a608ef018484908cbc5aa32556a0d78356088c185d4 to gcr.io/k8s-staging-build-image/debian-base@sha256:d7be39e143d4e6677a28c81c0a84868b40800fc979dea1848bb19d526668a00c for the v2.0.0 tag.

There's a good chance your digests will change, which might be unavoidable, but be sure to check up on any code you have that's doing digest verification.
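
(For reference, a retag-by-digest along the lines @listx describes stays entirely inside GCR, so nothing is pulled or re-pushed through a local docker daemon and the digest cannot change. The sketch below reuses the debian-base example from the quote and is illustrative only, not the script that was actually used:)

# Illustrative sketch: copy a tag from one gcr.io repository to another
# server-side, addressing the source by digest so the bytes stay identical.
gcloud container images add-tag \
    gcr.io/google-containers/debian-base@sha256:ebda8587ec0f49eb88ee3a608ef018484908cbc5aa32556a0d78356088c185d4 \
    gcr.io/k8s-staging-build-image/debian-base:v2.0.0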

pohly (Contributor, Author) commented Jun 2, 2020

> There's a good chance your digests will change, which might be unavoidable, but be sure to check up on any code you have that's doing digest verification.

We have no such code in Kubernetes or the sidecars. We have advocated using the version tags and promised that we will never rebuild those images, so I suspect most users will use that to pull a deterministic image.

But it wouldn't matter anyway, because it seems that the sha256 digests have not changed. I only checked one image (csi-node-driver-registrar:v1.3.0) and it had the same digest on quay.io (https://quay.io/repository/k8scsi/csi-node-driver-registrar?tab=tags) and gcr.io (https://console.cloud.google.com/gcr/images/k8s-staging-csi/GLOBAL/csi-node-driver-registrar@sha256:9622c6a6dac7499a055a382930f4de82905a3c5735c0753f7094115c9c871309/details?tab=info).
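
(A spot check like that can also be done without pulling the images. skopeo and jq below are assumptions, any tool that reports the manifest digest works; this is just a sketch of the comparison, not what was actually run:)

# Manifest digest of the quay.io image (no pull needed); skopeo/jq are assumptions.
skopeo inspect docker://quay.io/k8scsi/csi-node-driver-registrar:v1.3.0 | jq -r .Digest
# Digest of the staging copy on gcr.io.
gcloud container images describe gcr.io/k8s-staging-csi/csi-node-driver-registrar:v1.3.0 \
    --format='value(image_summary.digest)'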

pohly (Contributor, Author) commented Jun 2, 2020

If there's consensus that promoting the release images is worthwhile, then I can update the PR to exclude the -rc images.

msau42 (Member) commented Jun 2, 2020

Sounds like there's precedent for backfilling images when migrating over to gcr, so promoting these older tags (minus rc) sgtm. Thanks @justaugustus for the clarification!

pohly added a commit: "These images were previously published on quay.io, except for nfs-provisioner:v2.2.2."
pohly (Contributor, Author) commented Jun 3, 2020

I've used an updated script to re-generate the images.yaml file. The script now compares the checksums or (if those differ) the content of the images to verify that they are the same as the ones previously hosted on quay.io. I know that we can trust @dims, but it doesn't hurt to be extra careful, right? 😛

The result is that newer images were moved entirely unchanged, while older ones were apparently repacked and thus have a different sha256. The content is exactly the same, though. nfs-provisioner wasn't on quay.io; I've included it anyway in the images.yaml:

OKAY: csi-attacher:v2.2.0: sha256 matches
OKAY: csi-node-driver-registrar:v1.3.0: sha256 matches
WARNING: csi-node-driver-registrar:v1.2.0: sha256 does not match.
         Content is the same.
OKAY: csi-provisioner:v1.6.0: sha256 matches
OKAY: csi-resizer:v0.5.0: sha256 matches
WARNING: csi-resizer:v0.4.0: sha256 does not match.
         Content is the same.
OKAY: csi-snapshotter:v2.1.0: sha256 matches
WARNING: csi-snapshotter:v2.0.1: sha256 does not match.
         Content is the same.
WARNING: livenessprobe:v1.1.0: sha256 does not match.
         Content is the same.
OKAY: mock-driver:v3.1.0: sha256 matches
Error response from daemon: unauthorized: access to the requested resource is not authorized
WARNING: quay.io/k8scsi/nfs-provisioner:v2.2.2 not found.

Script:

#! /bin/sh

# List of repos under https://console.cloud.google.com/gcr/images/k8s-staging-csi/GLOBAL
repos="
csi-attacher
csi-node-driver-registrar
csi-provisioner
csi-resizer
csi-snapshotter
hostpathplugin
livenessprobe
mock-driver
nfs-provisioner
snapshot-controller
"

tmp=$(mktemp -d)
trap "rm -rf $tmp" EXIT

for repo in $repos; do
    echo "- name: $repo"
    echo "  dmap:"
    # Tab-separated "digest<TAB>tags" for every release tag, excluding staging-only (v2020.*) and -rc tags.
    gcloud container images list-tags gcr.io/k8s-staging-csi/$repo --format='get(digest, tags)' --filter='tags~v.* AND NOT tags~v2020.* AND NOT tags~.*-rc.*' | while read sha tag; do
        # Digest of the same tag on quay.io; empty if the image does not exist there.
        quay=$(docker pull quay.io/k8scsi/$repo:$tag | grep ^Digest: | sed -e 's/^Digest: //')
        if [ "$sha" != "$quay" ]; then
            if [ "$quay" ]; then
                echo >&2 "WARNING: $repo:$tag: sha56 does not match."
                container=$(docker create gcr.io/k8s-staging-csi/$repo:$tag)
                mkdir $tmp/gcr.io
                docker export $container | tar -C $tmp/gcr.io -xf -
                docker rm $container >/dev/null
                container=$(docker create quay.io/k8scsi/$repo:$tag)
                mkdir $tmp/quay.io
                docker export $container | tar -C $tmp/quay.io -xf -
                docker rm $container >/dev/null
                if diff -r $tmp/quay.io $tmp/gcr.io >&2; then
                    echo >&2 "         Content is the same."
                else
                    echo >&2 "         Content differs."
                fi
                rm -rf $tmp/quay.io $tmp/gcr.io
            else
                echo >&2 "WARNING: quay.io/k8scsi/$repo:$tag not found."
            fi
        else
            echo >&2 "OKAY: $repo:$tag: sha56 matches"
        fi
        echo "    \"$sha\": [ \"$tag\" ]"
    done
done

pohly (Contributor, Author) commented Jun 3, 2020

Also note that we need to do something about Kubernetes CI: the staging images will be deleted after 60 days, i.e. around July 19th. Let's promote the images and then switch the CI to use the promoted images.

dims (Member) commented Jun 3, 2020

@pohly 100% +1 to "trust but verify" :)

- name: mock-driver
  dmap:
    "sha256:0b4273abac4c241fa3d70aaf52b0d79a133d2737081f4a5c5dea4949f6c45dc3": [ "v3.1.0" ]
- name: nfs-provisioner
Review comment from a Member on the snippet above:

I'm not sure we should promote nfs-provisioner since it's not related to csi. We should find a better permanent home for it. cc @kmova

msau42 (Member) commented Jun 3, 2020

I think the script would be useful to check in, to make promoting future images easier.

pohly added 2 commits June 3, 2020 22:03:
- "It doesn't belong under the CSI repository."
- "Kubernetes-CSI produces a lot of different images. This script simplifies promotion of the release images by printing the images.yaml content to stdout."
pohly (Contributor, Author) commented Jun 3, 2020

Removed nfs-provisioner, added the script in the version without the comparison against quay.io (that was a one-time thing).

msau42 (Member) commented Jun 3, 2020

/lgtm
/approve
/hold cancel

k8s-ci-robot removed the do-not-merge/hold label on Jun 3, 2020
k8s-ci-robot added the lgtm label ("Looks good to me", the PR is ready to be merged) on Jun 3, 2020
msau42 (Member) commented Jun 3, 2020

/assign @dims

dims (Member) commented Jun 3, 2020

/lgtm
/approve

k8s-ci-robot (Contributor) commented:

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dims, msau42, pohly

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot added the approved label (the PR has been approved by an approver from all required OWNERS files) on Jun 3, 2020
k8s-ci-robot merged commit b83fefa into kubernetes:master on Jun 3, 2020