Skip to content

Conversation

@runcom
Copy link
Member

@runcom runcom commented Dec 4, 2018

@mrunalp @mtrmac @bparees @smarterclayton

This is just a wip to test things out as we need the resulting hypershift image from this PR to substitute the original one in a cluster and test this workflow, so, do not merge

Signed-off-by: Antonio Murdaca runcom@linux.com

@openshift-ci-robot openshift-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Dec 4, 2018
{
Name: "buildcachedir",
MountPath: "/var/lib/containers/cache/",
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

},
},
{Name: "buildcachedir", VolumeSource: v1.VolumeSource{
HostPath: &v1.HostPathVolumeSource{Path: "/miloslavcache/"},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this needs to exist on the host, right? or will k8s create it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cri-o/docker should create it if it doesn't exist on mount time right? or does kube check that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't know, easy enough to test (create a pod that mounts a non-existent hostdir)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will need to check if we can create this directory on RHCOS.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sh-4.2# id
uid=0(root) gid=0(root) groups=0(root) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
sh-4.2# 
sh-4.2# 
sh-4.2# id
uid=0(root) gid=0(root) groups=0(root) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
sh-4.2# pwd
/
sh-4.2# mkdir miloslavcache
mkdir: cannot create directory ‘miloslavcache’: Permission denied
sh-4.2# mkdir mycache
mkdir: cannot create directory ‘mycache’: Permission denied

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was able to create /var/lib/containers/.cache

{
Name: "buildcachedir",
MountPath: "/var/lib/containers/cache/",
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you don't need this in the git-clone container.

{
Name: "buildcachedir",
MountPath: "/var/lib/containers/cache/",
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you don't need this in the manage-dockerfile container.

{
Name: "buildcachedir",
MountPath: "/var/lib/containers/cache/",
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not needed

{
Name: "buildcachedir",
MountPath: "/var/lib/containers/cache/",
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this one is needed (we will pull images from within this container). And docker.go should have it too.

assuming you care about this cache for pulling images (not pushing them).

{
Name: "buildcachedir",
MountPath: "/var/lib/containers/cache/",
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not needed.

{
Name: "buildcachedir",
MountPath: "/var/lib/containers/cache/",
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this one is correct/needed.

@runcom runcom force-pushed the blob-mount-from-cache branch 4 times, most recently from ed07245 to 1f00740 Compare December 5, 2018 02:42
@smarterclayton
Copy link
Contributor

We’re going back to a shared cache? Doesn’t that cance the security benefits we got?

Wasn’t clear to me why shared cache is required for mount from - mount from should work even for a local user?

@bparees
Copy link
Contributor

bparees commented Dec 5, 2018

We’re going back to a shared cache? Doesn’t that cance the security benefits we got?

as @mtrmac explained it to me, it's not a blob cache, it's just metadata/ids about the blobs.

Wasn’t clear to me why shared cache is required for mount from - mount from should work even for a local user?

the issue here is subsequent builds. We need some minimal cached information so a subsequent push can benefit from previous pushes that have happened.

@mtrmac gave me a nice explanation but that's the extent of what i understood of it. Maybe he can put it in writing here to help.

@mtrmac
Copy link
Contributor

mtrmac commented Dec 5, 2018

We’re going back to a shared cache?

Wasn’t clear to me why shared cache is required for mount from - mount from should work even for a local user?

(Mounting itself does not require extra privileges.)

This is just a metadata cache.

For builds that have a base image pulled from the same registry that the build results are pushed to (e.g. if both the base image and the build result are stored in the internal registry), we don’t need a long-term metadata cache: we just need the information from the pull to be available to the push, and that will allow mounting the layers from the base image repo to the build result image repo.

We do need a long-term metadata cache if the base image is located in a different registry than we are pushing the build result to, e.g. if the base image is located at registry.access.redhat.com or the Docker Hub, and the destination is the internal registry (I was told this is very common.)

In that case, during the very first build, we necessarily pull all the layers of the base image from the external registry, and push all of them again (+ the new ones created during the build) to the internal one; that is unavoidable. On subsequent builds, though, we can avoid re-pushing the base layers to the internal registry again, if:

  • We have a record of the compressed digest used when pushing the layer to the internal registry (which is not necessarily the same as the digest used on the external registry, if the compression implementation differs); this needs to be recorded from the previous build (re-computing the value by compressing the layers would negate most of the benefit, because the compression is usually what takes the majority of the time during pushes).
  • For builds that share the base image with the previous build, but use a different destination repo, we are aware of the repo used for the initial build, so that we can mount from it. This also needs to be recorded from the previous build.

(Reordered)

Doesn’t that cance the security benefits we got?

Yes, the shared state does add a bit of risk, if the code running within one build somehow broke out (AFAICS the build commands are isolated from the builder in a separate OCI/runc context) and were able to write to the long-term cache: it could cause subsequent builds to incorrectly substitute malicious blobs in build results.

With schema2 / OCI images, this would only be a DoS because the image configuration contains DiffID (uncompressed digest) values, and is not affected by the blob substitution: so, the substituted blobs would be downloaded and rejected as non-matching (both by Docker and CRI-O), making the corrupted build result unusable.

With schema1 (e.g. when pushing to a schema1-only registry), the substitutions would not be noticed, though.

@runcom runcom force-pushed the blob-mount-from-cache branch from 1f00740 to 4a3c9bf Compare December 19, 2018 11:15
@runcom runcom force-pushed the blob-mount-from-cache branch 2 times, most recently from bbaeb4c to aa3139d Compare January 5, 2019 11:41
@runcom
Copy link
Member Author

runcom commented Jan 5, 2019

/retest

@runcom runcom force-pushed the blob-mount-from-cache branch from aa3139d to 541f2f4 Compare January 5, 2019 14:28
@bparees
Copy link
Contributor

bparees commented Jan 5, 2019

looks like you broke the formatting:

[WARNING] !!! gofmt needs to be run on the listed files
./pkg/build/controller/strategy/docker.go
./pkg/build/controller/strategy/sti.go

Looks like there are also a few more places where you've changed the expected number of volumes. (unit test failures)

otherwise it lgtm.

@runcom runcom force-pushed the blob-mount-from-cache branch from 541f2f4 to b94a8c8 Compare January 5, 2019 23:41
@runcom runcom changed the title WIP: wire in blob mount-from functionality wire in blob mount-from functionality Jan 5, 2019
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 5, 2019
@runcom runcom force-pushed the blob-mount-from-cache branch from 377f127 to 5cc1bac Compare January 8, 2019 18:32
@bparees
Copy link
Contributor

bparees commented Jan 8, 2019

/lgtm
/hold cancel

@openshift-ci-robot openshift-ci-robot added lgtm Indicates that a PR is ready to be merged. and removed do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Jan 8, 2019
Signed-off-by: Antonio Murdaca <runcom@linux.com>
@runcom runcom force-pushed the blob-mount-from-cache branch from 5cc1bac to 36017bf Compare January 8, 2019 18:52
@openshift-ci-robot openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Jan 8, 2019
@bparees
Copy link
Contributor

bparees commented Jan 8, 2019

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jan 8, 2019
@openshift-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bparees, runcom

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@mrunalp
Copy link
Member

mrunalp commented Jan 8, 2019

/retest

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

3 similar comments
@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@runcom
Copy link
Member Author

runcom commented Jan 9, 2019

/retest

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@mrunalp
Copy link
Member

mrunalp commented Jan 9, 2019

/test unit

@mrunalp
Copy link
Member

mrunalp commented Jan 9, 2019

/test e2e-aws

@mrunalp
Copy link
Member

mrunalp commented Jan 9, 2019

/test e2e-aws-serial

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@runcom
Copy link
Member Author

runcom commented Jan 9, 2019

/retest

1 similar comment
@mrunalp
Copy link
Member

mrunalp commented Jan 10, 2019

/retest

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

2 similar comments
@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@runcom
Copy link
Member Author

runcom commented Jan 10, 2019

this is green now 🤔

@openshift-merge-robot openshift-merge-robot merged commit b4b6535 into openshift:master Jan 10, 2019
@runcom runcom deleted the blob-mount-from-cache branch January 10, 2019 14:21
@openshift-ci-robot
Copy link

@runcom: The following test failed, say /retest to rerun them all:

Test name Commit Details Rerun command
ci/prow/e2e-aws-builds 36017bf link /test e2e-aws-builds

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants