-
Notifications
You must be signed in to change notification settings - Fork 426
Bug 1823143: wire ICSP lookups to oc image info #829
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@soltysh: This pull request references Bugzilla bug 1823143, which is valid. The bug has been updated to refer to the pull request using the external bug tracker. 3 validation(s) were run on this bug
Requesting review from QA contact: DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
| uniqueMirrors := make([]reference.DockerImageReference, 0, len(imageSources)) | ||
| uniqueMap := make(map[reference.DockerImageReference]bool) | ||
| for _, imageSourceMirror := range imageSources { | ||
| if _, ok := uniqueMap[imageSourceMirror]; !ok { | ||
| uniqueMap[imageSourceMirror] = true | ||
| uniqueMirrors = append(uniqueMirrors, imageSourceMirror) | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does not follow https://github.com/openshift/api/blob/a99ffa1cac6709edf8f502b16890b16f9a557e00/operator/v1alpha1/types_image_content_source_policy.go#L34-L39 .
If at all possible, this should share the implementation in https://github.com/openshift/runtime-utils/blob/master/pkg/registries/registries.go (used e.g. in machine-config-* to generate CRI-O configuration from ICSP), and perhaps even the consumer from https://github.com/containers/image/blob/97b3ffa7bb92a7778238b58010454031b0c2cbee/pkg/sysregistriesv2/system_registries_v2.go#L815 + https://github.com/containers/image/blob/97b3ffa7bb92a7778238b58010454031b0c2cbee/pkg/sysregistriesv2/system_registries_v2.go#L130 , OTOH that would require API additions to make it usable with in-memory configs) instead of maintaining an independent implementation. There are RFEs to add hostname wildcards and the like to the configuration; of course it’s possible to maintain independent implementations but there are enough corner cases that sharing an implementation is the easiest way to remain consistent.
(I haven’t read the code in detail, but the behavior of ICSP is not “fallback alternatives”; the mirrors, if any are tried before the primary location, e.g. so that in a disconnected environment there isn’t a firewall/DNS noise, and delays, for attempts to reach the global internet for images that are available on mirrors.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is intentional, oc is not a cluster. For cli commands not tied to the behavior of a cluster the appropriate behavior is fallback. Not the least of which is that ICSP defines alternatives, and the tradeoffs different clients make drives different benefits. A cli is low latency on the happy path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(To be explicit, my primary concern is about sharing the implementation: the primary vs. mirrors ordering is far less important to me.)
If the user wants to hit the primary location directly, it’s trivial not to submit an ICSP to oc; all of this is only invoked on explicit user action to involve ICSP.
The linked bug also motivates this with a situation where the primary location is not available, and even suggests that oc extract is used during cluster installation. Both of these would benefit from using the mirrors first.
The --cluster-icsp option documentation also says:
honor the ordering of those sources
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I expect a user to provide --use-cluster-icsp in scripting, and for it not to fail if one exists, and for us to "do the right thing for the user". Since ICSP lookup on some commands (anything not doing bulk retrieval) is better suited for latency, I think we're reserving the right to "do the right thing" by underspecifying.
The clause honor the ordering should be removed from the flag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The clause
honor the orderingshould be removed from the flag.
Removed
| if err != nil { | ||
| return nil, err | ||
| } | ||
| if imageRef.AsRepository() != rdmSourceRef.AsRepository() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way ICSP actually works when pulling, configuring {Source: quay.io/foo, Mirrors: example.local/mirror-of-quay-foo} will also match quay.io/foo/bar and rewrite to example.local/mirror-of-quay-foo/bar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a TODO there, we'll tackle this in the next revisions.
|
Just providing the link. This PR is blocked by openshift/library-go#1084 |
pkg/cli/image/info/info.go
Outdated
| o.SecurityOptions.Bind(flags) | ||
| flags.StringVarP(&o.Output, "output", "o", o.Output, "Print the image in an alternative format: json") | ||
| flags.StringVar(&o.FileDir, "dir", o.FileDir, "The directory on disk that file:// images will be read from.") | ||
| flags.BoolVar(&o.ClusterICSP, "cluster-icsp", o.ClusterICSP, "When set to true, look for alternative image sources from ImageContentSourcePolicy objects in cluster, honor the ordering of those sources, and fail if an ImageContentSourcePolicy is not found in cluster.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like that this fails if there is no ICSP. As a user, if I want ICSP to be used, I'll specify the flag in scripts and tolerate when a cluster doesn't have an ICSP. The flag should be use-cluster-icsp. There's no real disadvantage for a user using an ICSP, or there not being one on some clusters. If we had an explicit ICSP locator flag, then I would agree we would fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Roger that, actually the code reading that from cluster didn't fail, it was only the comment that says so. I've dropped the latter part of this flag's description.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although we do fail when you specify a file with --icsp-file and we have issues reading the file.
pkg/cli/image/info/info.go
Outdated
| flags.StringVarP(&o.Output, "output", "o", o.Output, "Print the image in an alternative format: json") | ||
| flags.StringVar(&o.FileDir, "dir", o.FileDir, "The directory on disk that file:// images will be read from.") | ||
| flags.BoolVar(&o.ClusterICSP, "cluster-icsp", o.ClusterICSP, "When set to true, look for alternative image sources from ImageContentSourcePolicy objects in cluster, honor the ordering of those sources, and fail if an ImageContentSourcePolicy is not found in cluster.") | ||
| flags.StringVar(&o.ICSPFile, "icsp-file", o.ICSPFile, "Path to an ImageContentSourcePolicy file. If set, data from this file will be used to set alternative image sources.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
data from this file will be used to find alternative locations for images.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
pkg/cli/image/info/info.go
Outdated
| } | ||
| if len(o.ICSPFile) > 0 || o.ClusterICSP { | ||
| registryContext = registryContext.WithAlternateBlobSourceStrategy( | ||
| strategy.NewSimpleLookupICSPStrategy(o.ICSPFile, o.operatorClient.ImageContentSourcePolicies())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer to have two strategies - one for file (NewExplicitICSPStrategy) and one for cluster lookup (NewICSPFromClusterOnError).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should both of them have similar behavior or they will behave differently, as in:
NewExplicitICSPStrategyreads always first from alternative (just like Miroslav is saying)NewICSPFromClusterOnErroris similar to currentSimpleLookupin that it fails first and tries alternates only then?
|
Some things we missed (and need to fix before this is correct):
we don't fail over correctly (we just exit). This is a bug in the Repository ping behavior and is going to require changes to library-go
we don't fail over. For both of these we need to move "ping" into the actual repository loader lazily, which I believe was an early comment on the PR and I forgot to bring it back up. I think it's potentially ok if our test impl doesn't have it, but it will absolutely block 'oc' being used with ICSP to solve the original bugs here (where you are running oc in a disabled context). I'll suggest a fix later. |
|
Tested this locally, didn't seem to work: Did not try the alternate at all: Also, did this PR add extra logging to the client stack? Seeing this: We should not be printing that... |
|
Not sure why docker.io wasn't selected as an alternate, but it could be because of the legacy lookup behavior (in which case that's a workaround that should be solved when the ICSP is loaded |
|
Failover did work, but didn't prefer the calculated ordering on subsequent calls. It's possible the error lookup strategy should be returning an order that prefers the alternate first (if you get an error, on subsequent calls change the order). But need to think about it a bit. |
| if err := s.resolve(ctx, locator); err != nil { | ||
| return nil, err | ||
| } | ||
| if len(s.alternates) == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having the resolve method set a global is weird (just return the list). Also, why is this a list? If two calls are made to the same strategy with different locators, then the strategy has to get a new alternates list. So if you want to cache, you have to cache by locator, and you probably need to maintain the size of the cache (at least bound it to something high like 1024 entries).
In general, for this strategy you want to add the locator as the last option, not the first (also if it's not duplicated) when you're handling the "on error" strategy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
In general, for this strategy you want to add the locator as the last option, not the first (also if it's not duplicated) when you're handling the "on error" strategy.
I'm not quite sure I understand what you mean by that?
This might have been introduced in kubernetes/kubernetes#102217
Yeah, I've noticed that when testing against missing registry.
I guess we want a similar behavior as above.
It will test alternates only if it fails, not by default, although it should read the ICSP in. |
Yeah, |
|
We need openshift/library-go#1096 too to land this |
a6fcfbb to
3cc524f
Compare
|
@smarterclayton split into 2 separate strategies:
This should be ready for final review. |
|
@soltysh: This pull request references Bugzilla bug 1823143, which is valid. 3 validation(s) were run on this bug
Requesting review from QA contact: DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
(meanwhile, openshift/api#874 has added two new CRDs representing mirrors — see openshift/runtime-utils#15 and openshift/machine-config-operator#3037 ) |
I'll update that in a followup, since we will want to support both. |
|
/test e2e-aws-upgrade |
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: deejross, soltysh The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/retest-required Please review the full test history for this PR and help us cut down flakes. |
7 similar comments
|
/retest-required Please review the full test history for this PR and help us cut down flakes. |
|
/retest-required Please review the full test history for this PR and help us cut down flakes. |
|
/retest-required Please review the full test history for this PR and help us cut down flakes. |
|
/retest-required Please review the full test history for this PR and help us cut down flakes. |
|
/retest-required Please review the full test history for this PR and help us cut down flakes. |
|
/retest-required Please review the full test history for this PR and help us cut down flakes. |
|
/retest-required Please review the full test history for this PR and help us cut down flakes. |
|
@soltysh: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
|
/retest-required Please review the full test history for this PR and help us cut down flakes. |
1 similar comment
|
/retest-required Please review the full test history for this PR and help us cut down flakes. |
|
@soltysh: Some pull requests linked via external trackers have merged: The following pull requests linked via external trackers have not merged:
These pull request must merge or be unlinked from the Bugzilla bug in order for it to move to the next state. Once unlinked, request a bug refresh with Bugzilla bug 1823143 has not been moved to the MODIFIED state. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
The oc in this link contain the ICSP lookup feature openshift/oc#829 The lack of ICSP support is the reason the assisted-service didn't upstream oc since commit 317f122
The oc in this link contain the ICSP lookup feature openshift/oc#829 The lack of ICSP support is the reason the assisted-service didn't upstream oc since commit 317f122
Allow the user to specify an ImageContentSourcePolicy file to fetch from a mirror. This uses the implementation added in openshift#829 for the 'oc image info' command.
Allow the user to specify an ImageContentSourcePolicy file to fetch from a mirror. This uses the implementation added in openshift#829 for the 'oc image info' command.
This shows openshift/library-go#1084 applied to
oc image info/assign @smarterclayton
/hold
for real bump