@gabemontero
Contributor

@openshift-ci-robot
Contributor

@gabemontero: This pull request references Bugzilla bug 1916897, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.7.0) matches configured target release for branch (4.7.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)
Details

In response to this:

Bug 1916897: narrow scope of rhsm transient bind mount

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added bugzilla/severity-medium Referenced Bugzilla bug's severity is medium for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. labels Jan 19, 2021
@nalind
Member

nalind commented Jan 19, 2021

Would we get the same effect by removing the code block the patch is modifying, and adding the spec to imagecontent/etc/containers/mounts.conf?

// Add a bind of /run/secrets/rhsm, to pass along anything that the
// runtime mounted from the node into our /run/secrets/rhsm.
log.V(0).Infof("Adding transient rw bind mount for /run/secrets/rhsm")
transientMounts = append(transientMounts, "/run/secrets/rhsm:/run/secrets/rhsm:rw,nodev,noexec,nosuid")
Contributor

@nalind with this approach will writes to /run/secrets/rhsm from within the build container propagate to /run/secrets/rhsm in the build pod? (we do not want that)

Member

The bind mount will let it modify content in the build container. Unless I'm misreading things, the node's /usr/share/containers/mounts.conf includes /usr/share/rhel/secrets:/run/secrets on a 4.6.6 cluster, so cri-o creates a copy of the contents of /usr/share/rhel/secrets and bind mounts it for each container that it creates. While modifying the /run/secrets in the build container would modify its copy of the content, each container in the build pod has its own copy.

Contributor

i don't think we want to allow any actions in the build container to affect content in the build pod containers.

Member

Right, the other containers in the pod have their own copies of that content.

Contributor Author

OK, we have all green tests with the current form. Given the discussion above and @nalind's #206 (comment), do we want to try a separate commit where I remove this piece of code and re-introduce a master branch version of https://github.com/openshift/builder/blob/release-4.6/imagecontent/etc/containers/mounts.conf, changing /run/secrets to /run/secrets/rhsm?

I'm also curious to see the ls -R /run dump from the new test in openshift/origin#25810, to make sure there is nothing else we need to consider.

Member

The logic that controls where the copy of the secrets goes is currently hard-coded in buildah, so the patch would at least include a change there. Changing it in a way that required any changes in openshift/builder would also require similar changes in podman and in any other consumers of the library to keep the behavior consistent between them, so my inclination would be to keep the decision making around that as an implementation detail in buildah.

Contributor

I think i need to take back what i said earlier about it being ok if each RUN/layer gets a fresh copy of /run/secrets/rhsm, because the way people use that dir is probably:

COPY somesecret /run/secrets/rhsm
RUN yum install something-that-needs-the-above-secret

so either we fix buildah to persist the writes between layers (even when using the layer optimization mode) or we add logic that copies the content to a new location before invoking buildah, and then bind mounts it.

Contributor Author

> I think i need to take back what i said earlier about it being ok if each RUN/layer gets a fresh copy of /run/secrets/rhsm, because the way people use that dir is probably:
>
> COPY somesecret /run/secrets/rhsm
> RUN yum install something-that-needs-the-above-secret
>
> so either we fix buildah to persist the writes between layers (even when using the layer optimization mode)

So does @bparees's comment ^^ lead you to modify your comment at #206 (comment), @nalind?

> or we add logic that copies the content to a new location before invoking buildah, and then bind mounts it.

So if possible @bparees I'd like to get a little more specific on how ^^ would be done to make sure we have a common understanding.

I'm going to attempt a mixture of real code and pseudo code to get a good detail vs. length mix.

startingDir := "/tmp"   // or some other location we agree upon
tmpDir, err := ioutil.TempDir(startingDir, "rhsm-creds")
if err != nil {
  ...
}

// logic to do recursive copy of /run/secrets/rhsm to tmpDir
...

// reintroduce transient mounts path 
...
transientMounts = append(transientMounts, fmt.Sprintf("%s:/run/secrets/rhsm:rw,nodev,noexec,nosuid", tmpDir))

how far off is that ?

Contributor

that looks pretty close to what i had in mind.

Contributor Author

OK @bparees @nalind @xiuwang I've dropped the re-introduction of mounts.conf and pushed a new commit that does the tmpdir/copy/transient mount @bparees and I just discussed

Let's see how the e2e's go.

I'll try to run dptp's BC against a cluster-bot cluster with this PR (hopefully I have better luck with cluster-bot today), but if you get cycles @xiuwang by all means give it a go as well

@adambkaplan FYI (to the whole PR :-) )

@openshift-ci-robot openshift-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 19, 2021
@gabemontero gabemontero force-pushed the fix-run-secrets-mount branch from cd80fc3 to 145e651 Compare January 19, 2021 20:53
@openshift-ci-robot openshift-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 19, 2021
@gabemontero
Contributor Author

OK @nalind @bparees I've pushed in a separate commit the alternate approach of removing the code in daemonless.go and re-introducing the mounts.conf file, but using /run/secrets/rhsm

I'm still trying to get a cluster with these changes to try the DPTP Dockerfile with the chmod's (a couple of clusterbot attempts have failed)

@@ -0,0 +1 @@
/run/secrets/rhsm:/run/secrets/rhsm
\ No newline at end of file
Contributor

i think my concern that this results in writing changes through to the build pod's container remains.

@xiuwang

xiuwang commented Jan 20, 2021

Launched a cluster from this PR and triggered a build from the DPTP bc http://file.rdu.redhat.com/~hongkliu/dptp2021/bz1916897/content-mirror/app.ci.bc.content-mirror.yaml.

Build content-mirror-2 ran to completion without the /run/secrets read-only error.
$ oc get builds -n ci
NAME               TYPE     FROM          STATUS     STARTED             DURATION
content-mirror-2   Docker   Git@083b29e   Complete   About an hour ago   1m16s

Build httpd-ex-2, which uses a source secret, also ran to completion.
$ oc get builds
NAME         TYPE     FROM          STATUS                       STARTED          DURATION
httpd-ex-1   Source   Git           Failed (FetchSourceFailed)   5 minutes ago    4s
httpd-ex-2   Source   Git@092fbaa   Complete                     40 seconds ago   36s

@gabemontero
Contributor Author

gabemontero commented Jan 20, 2021

> Launched a cluster from this PR and triggered a build from the DPTP bc http://file.rdu.redhat.com/~hongkliu/dptp2021/bz1916897/content-mirror/app.ci.bc.content-mirror.yaml.
>
> Build content-mirror-2 ran to completion without the /run/secrets read-only error.
> $ oc get builds -n ci
> NAME               TYPE     FROM          STATUS     STARTED             DURATION
> content-mirror-2   Docker   Git@083b29e   Complete   About an hour ago   1m16s
>
> Build httpd-ex-2, which uses a source secret, also ran to completion.
> $ oc get builds
> NAME         TYPE     FROM          STATUS                       STARTED          DURATION
> httpd-ex-1   Source   Git           Failed (FetchSourceFailed)   5 minutes ago    4s
> httpd-ex-2   Source   Git@092fbaa   Complete                     40 seconds ago   36s

thanks for the confirmation @xiuwang ... I had trouble either launching or updating clusters yesterday and was unable to do this

@nalind @bparees and I now just need to get consensus on the correct way to apply this fix.

@gabemontero
Contributor Author

sig-api-machinery flakes in latest e2e-aws

/test e2e-aws

@gabemontero gabemontero force-pushed the fix-run-secrets-mount branch from 145e651 to 8b3db1b Compare January 21, 2021 13:49
@gabemontero gabemontero force-pushed the fix-run-secrets-mount branch from 8b3db1b to b66b684 Compare January 21, 2021 15:35
return defaultProcessLimits
}

func copyRHSMData(source, destination string) error {
Contributor

We have similar "copy directory contents" in https://github.com/openshift/builder/blob/master/cmd/main.go#L74-L126. Can we consolidate into a utility package?

Contributor Author

> We have similar "copy directory contents" in https://github.com/openshift/builder/blob/master/cmd/main.go#L74-L126. Can we consolidate into a utility package?

update for this ^^ pushed @adambkaplan

Member

Do we test this with directories that contain symlinks to directories? I think ioutil.ReadDir() uses lstat(), so for such a symlink it would return an os.FileInfo whose IsDir() returns false.

Contributor Author

Yeah, this aspect came up when researching golang recursive copy, but based on my testing /run/secrets/rhsm does not have symlinks, so I stopped short of adding that complexity to this existing code path.

But I'll defer to you @nalind and @adambkaplan on whether we want to add this in just in case.

@gabemontero gabemontero force-pushed the fix-run-secrets-mount branch from b66b684 to 6a34d69 Compare January 21, 2021 17:24
@gabemontero
Contributor Author

api server connection refused on image eco

/test e2e-aws-image-ecosystem

@gabemontero
Contributor Author

OK tests are passing, and I was able to verify DPTP's BC using an image built from this commit

There is the one remaining discussion thread on the copy and symlinks, and then squashing the commits

@gabemontero
Contributor Author

I've pushed a separate commit for symlinks @nalind @adambkaplan ... used https://stackoverflow.com/questions/51779243/copy-a-folder-in-go as a ref

@gabemontero
Contributor Author

sig-network failures e2e-aws

/test e2e-aws

@gabemontero
Contributor Author

sig-network flakes in e2e-aws

/retest

@gabemontero
Contributor Author

/retest

@gabemontero
Contributor Author

/assign @adambkaplan
/assign @nalind

I'll defer to @bparees if he wants to now be unassigned

Contributor

@adambkaplan adambkaplan left a comment

Something else to consider - we already have a "copy files & directories" library within source-to-image. Do we use that instead?

The default implementation copies symlinks by following the link and preserving the content.

https://github.com/openshift/source-to-image/tree/2ed02f2351b48858b5cc358bbccf6fc4a2a284ef/pkg/util/fs

// If the source directory does not exist, no error is returned.
// If the destination directory exists, any contents with matching file names
// will be overwritten.
func CopyDirIfExists(src, dst string) error {
Contributor

Please add unit tests for this function, covering in particular:

  1. Source directory not existing
  2. Destination already existing (confirm files overwritten)
  3. Regular files copied
  4. Symlinks copied.


// CopySymLink preserves symlinks as the form of copy
func CopySymLink(src, dst string) error {
link, err := os.Readlink(src)
Contributor

What if the symlink refers to an absolute path, instead of a relative path? Do we keep as is?

See also https://github.com/openshift/source-to-image/blob/2ed02f2351b48858b5cc358bbccf6fc4a2a284ef/pkg/util/fs/fs.go#L208-L229


// CopyFileIfExists copies the source file to the given destination, if the source file exists.
// If the destination file exists, it will be overwritten and will not copy file attributes.
func CopyFileIfExists(src, dst string) error {
Contributor

Please add a unit test for this function.

@gabemontero gabemontero force-pushed the fix-run-secrets-mount branch from ecfe662 to 7901cb3 Compare January 25, 2021 20:11
@gabemontero
Contributor Author

PR updated to use the s2i libs @adambkaplan ... also went ahead and squashed the commits - thanks!

@gabemontero
Contributor Author

All green tests on our readonly /run PR @adambkaplan

Reminder: follow up e2e's in openshift/origin#25810 once this merges

@adambkaplan
Contributor

/test e2e-aws-proxy

Want to make sure we didn't break the proxy CA generation, which I believe is utilized by git clone.

@gabemontero
Contributor Author

e2e-aws-proxy succeeded @adambkaplan

Contributor

@adambkaplan adambkaplan left a comment

/lgtm

Once this merges we can verify with openshift/origin#25810

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jan 26, 2021
@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: adambkaplan, gabemontero

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 26, 2021
@openshift-merge-robot openshift-merge-robot merged commit d9d9f89 into openshift:master Jan 26, 2021
@openshift-ci-robot
Contributor

@gabemontero: Some pull requests linked via external trackers have merged:

The following pull requests linked via external trackers have not merged:

These pull request must merge or be unlinked from the Bugzilla bug in order for it to move to the next state. Once unlinked, request a bug refresh with /bugzilla refresh.

Bugzilla bug 1916897 has not been moved to the MODIFIED state.

Details

In response to this:

Bug 1916897: narrow scope of rhsm transient bind mount


@gabemontero gabemontero deleted the fix-run-secrets-mount branch January 26, 2021 18:39
@gabemontero
Contributor Author

/cherrypick release-4.6

@openshift-cherrypick-robot

@gabemontero: new pull request created: #211

Details

In response to this:

/cherrypick release-4.6


@gabemontero
Contributor Author

/bugzilla refresh

@openshift-ci-robot
Contributor

@gabemontero: Bugzilla bug 1916897 is in an unrecognized state (VERIFIED) and will not be moved to the MODIFIED state.

Details

In response to this:

/bugzilla refresh


