Skip to content

Conversation

@mtrmac
Copy link
Contributor

@mtrmac mtrmac commented May 30, 2019

- What I did
Implemented support for operator/v1alpha1.ImageContentSourcePolicy, added by openshift/api#354 , to support multiple mirrors for pulling OpenShift images.

NOTE: The CRI interface does not support passing more than one set of credentials for pulling an image, i.e. the pull secret, if any, only applies to the primary image location (the one explicitly included in the pull spec). If any of the mirrors need credentials, the credentials must be available on the node, probably via ControllerConfig.spec.pullSecret. That implies that configuring per-tenant credentials for accessing the mirrors is not currently possible.

To make this work at least for the OpenShift images, this PR also hard-codes CRI-O’s global_auth_file option to point to the pull secret file also used by kubelet, which provides access to OpenShift mirrors.

- How to verify it
WIP. There are fairly extensive unit tests, but I didn’t try this end-to end yet at all.

- Description for the changelog
(Depending on which variant is chosen):

  • Added support for operator/v1alpha1.ImageContentSourcePolicy, allowing to configure mirrors for image registries

@openshift-ci-robot openshift-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels May 30, 2019
@cgwalters
Copy link
Member

This is totally fine but it'd also be nice to update the PR description with at least a sentence or two for what the end goal is - think of it like the "seed" upon which an eventual proper commit message could grow 😄 - offhand I'm guessing something like:

We're working on supporting fetching container images from mirrors.

@mtrmac
Copy link
Contributor Author

mtrmac commented Jun 1, 2019

This is totally fine but it'd also be nice to update the PR description with at least a sentence or two for what the end goal is - think of it like the "seed" upon which an eventual proper commit message could grow 😄

Updated now. Still WIP, notably completely untestedonly with a unit test.

I really don’t know much about what I’m doing, any pointers would be appreciated.

Specific questions, for now:

  • What should the eventual API look like: this adds two separate commits adding members to vendor/github.com/openshift/api/config/v1/types_image.go, we need one of them only.
  • I suppose those members will have to be added to the openshift/api repository. Or should this instead introduce a new object / some other kind of change in the machineconfiguration.openshift.io namespace?
  • The mirror support requires using (a current version of) the v2 registries.conf format, which implies requiring a new enough version of CRI-O (and Podman?) (Neither of which exist at this moment.). Can we just assume that the version will be new enough (and enforce that in some repository by updating MCO and CRI-O/… in lockstep), or does this code need to detect and transparently support older versions of the two tools as well (degrading to no-mirror operation)?

@mtrmac mtrmac force-pushed the mirrors branch 2 times, most recently from fdeab41 to f1dbcf9 Compare June 1, 2019 11:37
@openshift-ci-robot openshift-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 4, 2019
@openshift-ci-robot openshift-ci-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Jun 4, 2019
@openshift-ci-robot openshift-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 28, 2019
@kikisdeliveryservice kikisdeliveryservice removed their request for review June 28, 2019 19:46
@umohnani8
Copy link
Contributor

/retest

1 similar comment
@mtrmac
Copy link
Contributor Author

mtrmac commented Jul 3, 2019

/retest

@cgwalters
Copy link
Member

This work is targeted at 4.2 right?

Can you link to any more information on the dependent crio work?

It looks like this depends on openshift/api#354 - may be good to do a separate PR to update our vendored openshift/api as a prep step.

@mtrmac mtrmac force-pushed the mirrors branch 5 times, most recently from 5287e23 to 21ebcdd Compare July 4, 2019 20:02
@mtrmac
Copy link
Contributor Author

mtrmac commented Jul 8, 2019

This work is targeted at 4.2 right?

Yes.

Can you link to any more information on the dependent crio work?

cri-o/cri-o#2494 + cri-o/cri-o#2510 for 1.14,
cri-o/cri-o#2439 + cri-o/cri-o#2511 on master.

It looks like this depends on openshift/api#354 - may be good to do a separate PR to update our vendored openshift/api as a prep step.

Will do, at least to get the other effects of dep ensure -update out of this PR — but note that the API repo changes are pure additions, and only really happen when this PR adds the uses of those subpackages.

@mtrmac mtrmac changed the title WIP DO NOT MERGE: Mirrors support WIP: Mirrors support Jul 8, 2019
@mtrmac mtrmac mentioned this pull request Jul 8, 2019
@mtrmac
Copy link
Contributor Author

mtrmac commented Jul 8, 2019

It looks like this depends on openshift/api#354 - may be good to do a separate PR to update our vendored openshift/api as a prep step.

#936 . This PR now includes code from that branch, so don’t merge it as it is; after #936 is merged, I’ll rebase on top of master again.

@mtrmac
Copy link
Contributor Author

mtrmac commented Jul 10, 2019

/retest

mtrmac added 4 commits July 11, 2019 00:36
Update both the templates and updateRegistriesConfig.

Also add tests for updateRegistriesConfig; the existing tests
only test that the controller makes the right kinds of API calls, but does not
check the content at all.

Signed-off-by: Miloslav Trmač <[email protected]>
… registry configuration

This is not currently necessary because such configurations are not created
by the code, but it will be useful momentarily, after mirror configuration
is added.

Signed-off-by: Miloslav Trmač <[email protected]>
This should not change behavior in most cases, because the kubelet is
already reading that file for credentials whenever the user does not provide
any others.

It matters only for mirrors (configured using ImageContentSourcePolicy),
which are transparent to the kubelet.

Because any users without credentials were already able to use the
kubelet's config.json secrets, this should not give unprivileged users
any more rights than they have had previously.  (In addition,
only cluster administrators can configure mirrors.)

Signed-off-by: Miloslav Trmač <[email protected]>
mtrmac added 2 commits July 11, 2019 02:04
... to make the update visible (in both), and to actually apply the created update
at all (in TestImageConfigUpdate).

Signed-off-by: Miloslav Trmač <[email protected]>
Signed-off-by: Miloslav Trmač <[email protected]>
@mtrmac mtrmac changed the title WIP: Mirrors support Mirrors support Jul 11, 2019
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 11, 2019
@mtrmac
Copy link
Contributor Author

mtrmac commented Jul 11, 2019

Dropping the WIP marker, reportedly this works, and the new behavior now has tests at least as good as any other aspect of the controller (in particular, the tests now actually verify expected contents of the generated registries.conf).

Still, I don’t yet fully understand the test scaffolding, so, please, make sure to review that the “Fix TestContainerRuntimeConfigUpdate and TestImageConfigUpdate” commit makes sense.

@openshift-ci-robot
Copy link
Contributor

openshift-ci-robot commented Jul 11, 2019

@mtrmac: The following test failed, say /retest to rerun them all:

Test name Commit Details Rerun command
ci/prow/e2e-etcd-quorum-loss 69bab1e939794ed5914f2c73601ac2a279d5d8a9 link /test e2e-etcd-quorum-loss

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@umohnani8
Copy link
Contributor

/retest

@umohnani8
Copy link
Contributor

Tested this out on a cluster and everything works fine. Tried out adding multiple ImageContentSourcePolicy objects with overlaps in the mirrors defined. Tested deleting the CRs one by one and registries.conf was populates as expected. And it plays well with any insecure and blocked registries added to the cluster wide Image CR.
The only thing I think that is potentially missing is checking whether there was an actual change to all the ImageContentSourcePolicy CRs when a resync happens. This way we can prevent rebooting the nodes when a resync happens every few minutes if nothing with the registries.conf file has changed. We do a check for the Image CR, which was added here #461. cc @mtrmac

@mtrmac
Copy link
Contributor Author

mtrmac commented Jul 11, 2019

The only thing I think that is potentially missing is checking whether there was an actual change to all the ImageContentSourcePolicy CRs when a resync happens. This way we can prevent rebooting the nodes when a resync happens every few minutes if nothing with the registries.conf file has changed. We do a check for the Image CR, which was added here #461. cc @mtrmac

I thought exactly that path, in particular the

if !isNotFound && equality.Semantic.DeepEqual(registriesIgn, mc.Spec.Config) {
check, should already work prevent no-op changes. Is that not enough?

OTOH, I can’t get TestICSPUpdate to stop the MCS object updates just by not modifying ICSP, but so far that seems to be more likely to be a test issue (the difference is actually in OSImageURL being "" vs. "dummy:///").

@umohnani8
Copy link
Contributor

@mtrmac ah yes, that should do it. I missed it, my bad.
Everything LGTM then, this should be ready to merge once tests pass @runcom @kikisdeliveryservice :)

@umohnani8
Copy link
Contributor

@runcom @cgwalters tests are green!

@kikisdeliveryservice
Copy link
Contributor

this LGTM

will let walters/runcom give the final approval

@cgwalters
Copy link
Member

Because any users without credentials were already able to use the
kubelet's config.json secrets

Hmm...not as of #827 right?

@mtrmac
Copy link
Contributor Author

mtrmac commented Jul 11, 2019

Because any users without credentials were already able to use the
kubelet's config.json secrets

Hmm...not as of #827 right?

The above is not talking about non-container users with local accounts on the node, but about users of the cluster with the ability to create pods (who should, typically, have no access outside of their containers at all). For such users, the kubelet automatically uses the kubelet’s secret file if the user does not explicitly provide a pull secret (and the kubelet is, before this PR, the only process on the node that actually needs access to the cluster pull secret file), and sends the credentials to CRI-O to cause the image to be pulled.

@cgwalters
Copy link
Member

/approve

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 12, 2019
@cgwalters
Copy link
Member

So...this is a lot to take in. I'd like to see a higher level design doc if one exists. But the feature is critical, and I think we can learn more after it lands in master.
/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jul 12, 2019
@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cgwalters, mtrmac

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kikisdeliveryservice
Copy link
Contributor

So...this is a lot to take in. I'd like to see a higher level design doc if one exists. But the feature is critical, and I think we can learn more after it lands in master.

+1

@openshift-merge-robot openshift-merge-robot merged commit e702009 into openshift:master Jul 12, 2019
@mtrmac mtrmac deleted the mirrors branch July 17, 2019 20:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants