Bug 1970975: upgrade test: expect upgrades to take longer on AWS #26219
test/e2e/upgrade/upgrade.go
Outdated
infra, err := c.ConfigV1().Infrastructures().Get(context.Background(), "cluster", metav1.GetOptions{})
framework.ExpectNoError(err)
if infra.Status.PlatformStatus.Type == configv1.AWSPlatformType {
	// due to AWS LB bug upgrades take longer on AWS
Is this a bug in our AWS LB code, or in AWS itself?
Are we tracking a resolution? This would also impact customers, right?
(In that their upgrades will take longer. Obviously they don't have the same timeout to worry about.)
A bug in AWS NLB health checks, which we worked around by making kubeapi-operator resilient. This makes us wait twice: ~12m during reboot and ~12m during apiserver rollout.
IIUC it's this issue, but @smarterclayton might have a better reference for this.
/cc @smarterclayton
It's not the nu26 issue. It's https://bugzilla.redhat.com/show_bug.cgi?id=1943804. As a result, we take 12 extra minutes to roll out apiserver twice (120s -> 240s per machine, 3 machines, 2 separate rollouts: machines and apiserver).
Go ahead and change the JUnit to be "Cluster upgrade should complete within 60m (90m on AWS)"
Done. Added bug reference to the comment
/lgtm
Note, this might push us over the timeout limits on the multistep upgrade tests in the future (e.g. 4.8->4.9->4.10).
Right, we'll work with DPTP to extend those.
There is also #26202 in this space with a blanket raise to 90m, vs. this PR's platform-specific raise.
/retest
Please review the full test history for this PR and help us cut down flakes.
Due to an AWS LB issue, apiserver rollout takes twice as long. This bumps the expected upgrade duration on AWS to 90 mins.
@vrutkovs: This pull request references Bugzilla bug 1970315, which is invalid:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/bugzilla refresh
@vrutkovs: This pull request references Bugzilla bug 1970315, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker. 3 validation(s) were run on this bug.
Requesting review from QA contact:
@vrutkovs: No Bugzilla bug is referenced in the title of this pull request.
@vrutkovs Do you have a new bug to use for this PR? Looks like the other one which you removed was the wrong bug.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: bparees, stbenjam, vrutkovs
The full list of commands accepted by this bot can be found here. The pull request process is described here.
@vrutkovs: The following tests failed, say
Full PR test history. Your PR dashboard.
@vrutkovs: This pull request references Bugzilla bug 1970975, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker. 3 validation(s) were run on this bug.
Requesting review from QA contact:
@vrutkovs: All pull requests linked via external trackers have merged: Bugzilla bug 1970975 has been moved to the MODIFIED state.
Due to an AWS LB issue, apiserver rollout takes twice as long. This bumps the expected upgrade duration on AWS to 90 mins.
Upgrade tests: