data/bootstrap: Check if release image architecture matches host architecture #4592
Force-pushed from 638fe0b to 2d0102b.
/test e2e-crc
/retest
staebler left a comment
We are very close to getting the boot image included in the release image (see #4760). Will these changes still be helpful at that point?
"we error right away rather than downloading the associated images and failing in say crio when downloading the incorrect pause image."
The user would still have to wait the entire 20 minutes for the installer to time out waiting for the bootstrap control plane to be reachable.
Are the ppc64le and s390x architectures not included here because they do not need any mangling?
Correct. x86_64 and aarch64 map to amd64 and arm64, respectively. This naming difference is a cause of confusion in many places.
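To make the mapping above concrete, here is a minimal sketch in Python. It is not the code from this PR (which lives in the bootstrap scripts); the names `KERNEL_TO_IMAGE_ARCH` and `normalize_arch` are hypothetical and only illustrate the translation between kernel machine names and image architecture names.

```python
# Hypothetical helper, not taken from the PR: translate a kernel machine name
# (as reported by `uname -m`) into the architecture name used in image metadata.
KERNEL_TO_IMAGE_ARCH = {
    "x86_64": "amd64",
    "aarch64": "arm64",
}


def normalize_arch(machine: str) -> str:
    """Return the image-style architecture name for a kernel machine name."""
    # ppc64le and s390x are spelled the same in both conventions,
    # so any name without an entry in the table is returned unchanged.
    return KERNEL_TO_IMAGE_ARCH.get(machine, machine)
```

Names that need no mangling, such as ppc64le and s390x, simply pass through the lookup untouched.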
A failing release-image.service won't prevent other services from starting. So is the purpose of this just to have a failing unit in the bootstrap gather?
Essential services like crio and kubelet won't start if release-image.service throws an error. So in that sense, we catch this error early and let the user know that there is an arch mismatch in the specified release image.
A release-image.service that fails will not prevent crio.service or kubelet.service from starting.
Ah, right, yeah. But it does prevent crio-configure.service from starting, which in turn disables bootkube.service when there is an error. So in that sense, this error is visible right away after running journalctl -b -f -u release-image.service -u bootkube.service
No, my point is that it does not prevent any of those services from starting.
"No, my point is that it does not prevent any of those services from starting."
Ah, my bad. You are correct. Sorry for the noise. I looked at crio.service and kubelet.service but not bootkube.service.
That is true, unfortunately. This cannot be done outside of the bootstrap process, early on when the installer runs, as we would need to download the release image to check the arch metadata. These changes will help a little in that the error will be obvious on looking at the bootstrap logs, rather than seeing a crio error and trying to figure out what the issue could be.
Answering the first part of the question, which I missed: yes, these changes will be useful irrespective of the boot image changes, because this sanity-checks the OpenShift release image (which can be overridden through OPENSHIFT_INSTALL_RELEASE_IMAGE_OVERRIDE, as is done in local testing and multi-arch CI) to see if it matches the target architecture.
…itecture This is a simple sanity check to see if the release image architecture matches the bootstrap node's architecture. This is useful in cases of user error where an incorrect release image is specified: we error right away rather than downloading the associated images and failing in, say, crio when downloading the incorrect pause image.
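The fail-fast idea in this commit message can be sketched roughly as follows. This is an illustration in Python, not the PR's implementation (the actual change is in the bootstrap scripts). The function and exception names are hypothetical, and how the release image's architecture is read from its metadata is not shown in this conversation, so it is passed in as a plain string.

```python
import platform
from typing import Optional


class ArchMismatchError(RuntimeError):
    """Hypothetical error type for a release image / host architecture mismatch."""


def check_release_image_arch(release_arch: str, host_machine: Optional[str] = None) -> str:
    """Raise ArchMismatchError unless the release image arch matches the host.

    release_arch: architecture from the release image metadata, e.g. "amd64".
        How it is obtained is an assumption left outside this sketch.
    host_machine: kernel machine name, e.g. "x86_64"; defaults to this host's.
    """
    machine = host_machine or platform.machine()
    # Kernel names x86_64/aarch64 correspond to image names amd64/arm64;
    # other names (ppc64le, s390x) are compared unchanged.
    host_arch = {"x86_64": "amd64", "aarch64": "arm64"}.get(machine, machine)
    if release_arch != host_arch:
        raise ArchMismatchError(
            f"release image architecture {release_arch} does not match "
            f"host architecture {host_arch}"
        )
    return host_arch
```

Raising before any further images are pulled is the point of the check: the mismatch shows up as one explicit error instead of a later, harder-to-read failure in crio.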
Force-pushed from 2d0102b to 175738e.
rebased to latest
/retest
staebler left a comment
With #4751 as well, the architecture mismatch will be displayed to the user in the terminal when the installation fails.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: staebler
These changes have no effect on upgrades.
@staebler: Overrode contexts on behalf of staebler: ci/prow/e2e-aws-upgrade
/hold for libvirt and openstack tests to at least get a passing release image download
/retest
The libvirt job has a successful install.
/override ci/prow/e2e-aws-upgrade
@staebler: Overrode contexts on behalf of staebler: ci/prow/e2e-aws-upgrade
/retest Please review the full test history for this PR and help us cut down flakes.
@staebler could you override
/override ci/prow/e2e-aws-upgrade
@staebler: Overrode contexts on behalf of staebler: ci/prow/e2e-aws-upgrade
@Prashanth684: The following tests failed, say
/retest
/override ci/prow/e2e-aws-upgrade
@staebler: Overrode contexts on behalf of staebler: ci/prow/e2e-aws-upgrade