-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Test metal ipi upgrade #13120
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Test metal ipi upgrade #13120
Conversation
|
Hi @gbenhaim. Thanks for your PR. I'm waiting for a openshift member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
/ok-to-test |
|
/retest |
c71a06e to
f689350
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be better probably to use OPENSHIFT_UPGRADE_RELEASE_IMAGE, see changes currently in progress on #12959
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The check it's ok, anyhow I'd suggest to follow an approach similar to the one in #12959 (to be as much as possible similar to the ones adopted by the other profiles) where possible, ie making a check directly against OPENSHIFT_UPGRADE_RELEASE_IMAGE as in https://github.com/openshift/release/blob/d4f03b5848e88d409f6b0cac7ec205f47530e07b/ci-operator/step-registry/openshift/e2e/test/openshift-e2e-test-commands.sh#L75 (we'll save one var)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about this approach, the problem is that OPENSHIFT_UPGRADE_RELEASE_IMAGE will always be non-empty since the step specifies it as a dependency, so I've needed a new variable.
From where the TEST_COMMAND variable comes from? I didn't see it in any of the templates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think that's a sanity check (always expected to be not empty). They've defined TEST_COMMAND as a normal env var in the openshift-e2e-test.
In our case we've the additional requirement to filter our some of the conformance tests (in the case of installation test), checking the current value of TEST_COMMAND could help in crafting properly the various pieces required by openshift-tests (the command itself, additional options such as --to-image and the filter list)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, https://github.com/openshift/release/pull/13120/files#diff-7b4946fc316154f2f0fe2beac2371cf82bc14d8071298890e8a014417e88159f doesn't use the TEST_COMMAND variable. If we want to use it, I would request to do it in a different PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer to adopt it if possible in the current PR, so that also the code duplication will be removed. If not, it's ok to address it in another PR.
f689350 to
8bed02b
Compare
a0661e7 to
2217245
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be removed here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
2217245 to
107cfd7
Compare
|
/test pj-rehearse |
1 similar comment
|
/test pj-rehearse |
|
The upgrade test fails on |
c0d6a94 to
1d73149
Compare
|
Failed on an unrelated issue |
|
/retest ci/rehearse/openshift-metal3/dev-scripts/master/e2e-metal-ipi-upgrade |
|
/test pj-rehearse |
a711a46 to
661f560
Compare
|
/test pj-rehearse |
1 similar comment
|
/test pj-rehearse |
661f560 to
f0e0601
Compare
|
@andfasano the test runs and fails on: can you please take a look? |
Hi @gbenhaim it seems that this test is failing because currently the baremetal platform does not support LoadBalancer service type (see [1]). Can you please try to filter it out (you can follow a similar way to the one adopted for the installation e2e test) [1] openshift/enhancements#356 cc @hardys |
|
@andfasano from looking at the code I can see it's not possible to filter this test You can also see that tests that shouldn't run were commented out https://github.com/openshift/openshift-tests/blob/release-4.6/test/e2e/upgrade/upgrade.go#L32 |
- Add a test for metal ipi upgrade - Extend "baremetalds-e2e-test-commands.sh" to run only the upgrade tests if the workflow requested it. - Choose the cluster images based on the type of the job. For regular jobs the cluster will use release:latest. For upgrade jobs the cluster will use release:initial and will be upgraded to release:latest. Signed-off-by: Eran Israeli <[email protected]> Signed-off-by: gbenhaim <[email protected]>
We would like to collect the logs from the remote server also in the case the script was termintated by a signal. Signed-off-by: gbenhaim <[email protected]>
Because of a bug in "ci-operator" when the job is canceled because of a timeout logs are not collected [1]. In order to bypass this issue, add an internal timeout to the upgrade command. [1] openshift/ci-tools#1340 Signed-off-by: gbenhaim <[email protected]>
Upgrades are not supported for IPv6 only clusters, since quay.io and api.openshift.com can't serve IPv6 requests. Also, add the env var the marks the test as an upgrade test to the ci-operator config, since the env map in it overrides the env map in the workflow. Signed-off-by: gbenhaim <[email protected]>
The entire test suite contains tests that are not suitable for BM deployment. Signed-off-by: gbenhaim <[email protected]>
f0e0601 to
e56263a
Compare
|
@andfasano the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some minor cleanups, for the rest looks fine for me, thanks!
| echo "### Running Upgrade tests" | ||
| # In case of a timeout we don't get the trace log of this script | ||
| # Let's verify that the we started the upgrade process | ||
| touch "${ARTIFACT_DIR}/upgrade_started" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we still need it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| timeout \ | ||
| --kill-after 10m \ | ||
| 120m \ | ||
| ssh \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If not required anymore, could this timeout be removed (there will be also the CI default one for the job)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to keep it just in case other issues will cause the code to get stuck. In this case, it would be good to have the logs.
Since the logs will be collected even if there is a timeout, we don't need additional debug code. Signed-off-by: gbenhaim <[email protected]>
|
/test pj-rehearse |
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: andfasano, gbenhaim The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/retest Please review the full test history for this PR and help us cut down flakes. |
|
@gbenhaim: Updated the following 3 configmaps:
DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
@gbenhaim: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
|
Small related follow-up to this: #13769 |
a destination cluster image is specified using
"OPENSHIFT_UPGRADE_RELEASE_IMAGE_OVERRIDE".
Signed-off-by: Eran Israeli [email protected]
Signed-off-by: gbenhaim [email protected]