Bug 1874106: Split work of oc image mirror to avoid AuthHeaderTooLong error from registry #761
|
@sallyom: This pull request references Bugzilla bug 1874106, which is invalid:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
/bugzilla refresh |
|
@sallyom: This pull request references Bugzilla bug 1874106, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker. 3 validation(s) were run on this bug
Requesting review from QA contact: |
pkg/cli/image/mirror/mappings.go
Outdated
| func buildTargetTrees(mappings []Mapping) []targetTree { | ||
| var trees []targetTree | ||
| // split targetTrees into groups of 10 | ||
| splitMappings := split(mappings, 10) |
Can you make that 10 a constant? That way it's easier to change if needed, and it's documented for free 😉
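For illustration, the suggestion could look roughly like the sketch below. This is not the code from the PR: the `Mapping` struct is a stand-in with an assumed shape, and `maxMappingsPerPlan` and `splitMappings` are hypothetical names chosen for this example.

```go
package main

import "fmt"

// Mapping is a stand-in for the mirror package's Mapping type (assumed shape).
type Mapping struct {
	Source      string
	Destination string
}

// maxMappingsPerPlan is a hypothetical name for the group size that the diff
// hard-codes as 10. Naming it documents the intent and makes it easy to change.
const maxMappingsPerPlan = 10

// splitMappings cuts mappings into consecutive groups of at most size elements.
// A non-positive size returns the input as a single group.
func splitMappings(mappings []Mapping, size int) [][]Mapping {
	if size <= 0 {
		return [][]Mapping{mappings}
	}
	var groups [][]Mapping
	for start := 0; start < len(mappings); start += size {
		end := start + size
		if end > len(mappings) {
			end = len(mappings)
		}
		groups = append(groups, mappings[start:end])
	}
	return groups
}

func main() {
	groups := splitMappings(make([]Mapping, 25), maxMappingsPerPlan)
	for i, g := range groups {
		fmt.Printf("group %d: %d mappings\n", i, len(g))
	}
}
```

The groups are sub-slices of the input, so nothing is copied; callers that mutate a group also mutate the original slice.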
pkg/cli/image/mirror/mappings.go
Outdated
| m := mappings[i:min(i+size, len(mappings))] | ||
| splitMappings = append(splitMappings, m) | ||
| } | ||
| klog.Infof("LEN SPLIT MAPPING: %d", len(splitMappings)) |
pkg/cli/image/mirror/mirror.go
Outdated
| return fmt.Errorf("SRC and DST may not be the same") | ||
| } | ||
| } | ||
| klog.Infof("LENGTH OF MAPPINGS: %v", len(o.Mappings)) |
pkg/cli/image/mirror/mirror.go
Outdated
| if err == nil { | ||
| // blob exists, skip | ||
| klog.V(5).Infof("Server reports blob exists %#v", blob) | ||
| klog.V(0).Infof("Server reports blob exists %#v", blob) |
|
/cherry-pick release-4.7 /cherry-pick release-4.6 |
|
@soltysh: once the present PR merges, I will cherry-pick it on top of release-4.7 in a new PR and assign it to you. |
|
/cherry-pick release-4.6 |
|
@soltysh: once the present PR merges, I will cherry-pick it on top of release-4.6 in a new PR and assign it to you. |
a200066 to
0d3108f
Compare
|
/retest |
pkg/cli/image/mirror/mirror.go
Outdated
| start := time.Now() | ||
| p, err := o.plan() | ||
| plans, err := o.plan() | ||
| //p, err := o.plan() |
nit: probably drop this commented-out line
…s from registry server While mirroring multiple images, quay.io sends "http2: server sent GOAWAY and closed the connection; LastStreamID=1, ErrCode=ENHANCE_YOUR_CALM, debug="""" To avoid this, split the mappings into small chunks that the registry server can easily handle.
0d3108f to
6886461
Compare
|
/retest |
|
|
||
| // As length of mappings increases, so does Authorization Header size, and this causes upload failures with | ||
| // registries that have a header size limit (quay.io) | ||
| var maxLenMappings = 10 |
nit: this could be just plain old const 😉
|
/retest |
|
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: sallyom, soltysh |
|
/retest |
|
/retest Please review the full test history for this PR and help us cut down flakes. |
|
@sallyom: All pull requests linked via external trackers have merged: Bugzilla bug 1874106 has been moved to the MODIFIED state. |
|
@soltysh: #761 failed to apply on top of branch "release-4.6". |
|
I'm a bit concerned about this change. I don't see how splitting the plan up has anything to do with a quay level issue. Are you hitting a rate limit? If so, rate limiting the speed of uploads or parallelism is the approach. Can you explain why splitting up the tree has The whole point of a plan is that it's doing minimal work. I don't think splitting up the target tree is the right approach. You should instead be rate limiting how many chunks get executed. |
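As a rough illustration of "rate limiting how many chunks get executed", the sketch below caps the number of tasks in flight with a buffered channel used as a semaphore. It is a generic pattern, not code from oc: `runLimited` and the task functions are hypothetical, and each task stands in for one unit of upload work.

```go
package main

import (
	"fmt"
	"sync"
)

// runLimited runs every task but allows at most limit of them in flight at once.
// It waits for all tasks and returns the first error in task order, or nil.
func runLimited(tasks []func() error, limit int) error {
	if limit < 1 {
		limit = 1
	}
	sem := make(chan struct{}, limit) // one slot per task allowed in flight
	errs := make([]error, len(tasks))
	var wg sync.WaitGroup
	for i, task := range tasks {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit tasks are already running
		go func(i int, task func() error) {
			defer wg.Done()
			defer func() { <-sem }() // free the slot when the task returns
			errs[i] = task()
		}(i, task)
	}
	wg.Wait()
	for _, err := range errs {
		if err != nil {
			return err
		}
	}
	return nil
}

func main() {
	tasks := make([]func() error, 8)
	for i := range tasks {
		i := i // capture the loop variable for Go versions before 1.22
		tasks[i] = func() error {
			fmt.Printf("uploading chunk %d\n", i)
			return nil
		}
	}
	if err := runLimited(tasks, 2); err != nil {
		fmt.Println("error:", err)
	}
}
```

With this shape the plan stays whole and only its execution is throttled, which is the distinction the comment above is drawing.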
|
I'm super concerned that this doesn't have anything to do with the actual problem - if we're executing requests that have too much data in query parameters, that's different than splitting up the plan. |
|
The auth header too long error (the actual error referenced in the bug) is about how many tokens we send. We had an earlier workaround for that, which had to do with the fact that a single token can't authenticate more than once. This PR really should be reverted. Instead, if you see the error while we're getting authorization, we should either recommend that the user pass --skip-multiple-scopes, automatically default this flag to true when we see this error, or try to create scopes in batches (not all at once).
@sallyom can you revert that and we need to sync with @smarterclayton about batching, the temporary workaround is to use |
|
we should be able to select the appropriate transport based on which tokens we need already, so i think this is just about avoiding excessively long sets of scopes on large mirrors (if the theory is correct, that would be library-go) |
|
--skip-multiple-scopes resolves https://bugzilla.redhat.com/show_bug.cgi?id=1874106, I'll update the bug and we can revert |
|
I think the probable fix is to identify when the scope size triggers the issue (i.e. there is some header length that will break quay). We should identify what that limit is, and if it's reasonable we should investigate whether we can split the tokens up, so that when you access repo A we find the connection that has the right permission on repo A instead of just assuming any connection to the API will do. I think that's tractable, although the nuance is that for cross mounting we need push on the target repo and pull on the source repo on the same target. So we could instead just look at how many scopes we will need, and if that is > limit we'd simply default skip-multiple-scopes to true. |
|
opened #780 to revert this change |
|
I checked: there's no mention of a size limit for that header, but it was mentioned that most setups limit it at 4k/8k. We can probably make a safe bet with a 4k limit and switch to skip-multiple-scopes at that point, wdyt @smarterclayton
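A minimal sketch of that idea, under loud assumptions: the 4k figure is the guess from this thread, not a documented quay limit; the joined length of the scope strings is only a proxy, since the real header carries a token whose size is decided by the registry; and `estimateScopeBytes` and `shouldSkipMultipleScopes` are hypothetical helpers, not functions in oc or library-go. The scope strings follow the registry token format `repository:<name>:<actions>`.

```go
package main

import (
	"fmt"
	"strings"
)

// assumedHeaderLimit is the 4k figure floated in the discussion.
// It is an assumption, not a documented registry limit.
const assumedHeaderLimit = 4096

// estimateScopeBytes is a rough proxy for header growth: the length of all
// requested scopes joined by single spaces.
func estimateScopeBytes(scopes []string) int {
	return len(strings.Join(scopes, " "))
}

// shouldSkipMultipleScopes reports whether the estimated size exceeds limit,
// in which case a caller could fall back to skip-multiple-scopes behaviour.
func shouldSkipMultipleScopes(scopes []string, limit int) bool {
	return estimateScopeBytes(scopes) > limit
}

func main() {
	var scopes []string
	for i := 0; i < 200; i++ {
		scopes = append(scopes, fmt.Sprintf("repository:org/image-%d:pull,push", i))
	}
	fmt.Println("estimated bytes:", estimateScopeBytes(scopes))
	fmt.Println("skip multiple scopes:", shouldSkipMultipleScopes(scopes, assumedHeaderLimit))
}
```

The threshold would need to be checked against real token sizes before relying on it, since the proxy can under- or over-estimate.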
While mirroring multiple images, quay.io sends
http2: server sent GOAWAY and closed the connection; LastStreamID=1, ErrCode=ENHANCE_YOUR_CALM, debug=""""
or
Request Header Or Cookie Too Large</center>\r\n<hr><center>nginx/1.14.1
To avoid this, split the mappings into groups of []Mapping with length 10.
This PR sets a hard-coded length of 10 for the []Mapping from which the imageTrees and plan are built. In local testing, oc image mirror fails when given more than ~20 SRC=DST mappings.
When sending more than ~20 mappings with oc image mirror -f file-w-more-than-20 and/or with oc adm catalog mirror, the Request Header Or Cookie Too Large</center>\r\n<hr><center>nginx/1.14.1 or the http2 ErrCode=ENHANCE_YOUR_CALM error is encountered. This PR splits large groups of mappings into smaller groups.
From discussions, this probably is not going to be fixed in quay, i.e. quay would rather not disable header size checks as this is considered insecure.