Bug 1936407: Backport of BMO code to 4.7 to support different reboot modes#130
rdoxenham wants to merge 7 commits into openshift:release-4.7
When inspect.metal3.io=disabled is specified as an annotation we skip inspection and return complete immediately from the Inspecting state. Partially-Implements: metal3-io/metal3-docs#155
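As a rough illustration of the check described above, the sketch below tests a host's annotations for `inspect.metal3.io=disabled`. The helper name `inspectionDisabled` and the plain `map[string]string` representation are assumptions made for this example; they are not taken from the BMO code.

```go
package main

import "fmt"

// inspectAnnotation is the annotation key named in the commit message.
const inspectAnnotation = "inspect.metal3.io"

// inspectionDisabled is a hypothetical helper: it reports whether the
// annotations ask for inspection to be skipped. Any value other than
// "disabled" (including a missing annotation) leaves inspection enabled.
func inspectionDisabled(annotations map[string]string) bool {
	return annotations[inspectAnnotation] == "disabled"
}

func main() {
	host := map[string]string{inspectAnnotation: "disabled"}
	fmt.Println(inspectionDisabled(host)) // annotation present
	fmt.Println(inspectionDisabled(nil))  // no annotations at all
}
```

A caller in the Inspecting state could return "complete" straight away when this helper returns true, which is the behaviour the commit message describes.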
This adds the required changes to the provisioner to support
customised reboot annotations, allowing a node to be powered
off (or fenced) more quickly when required. Previously, the
default behaviour was to attempt a softPowerOff() first and only
attempt a hardPowerOff() if that failed. With this commit and
its counterparts for the baremetal-operator, setting the
annotation to include an additional mode, e.g. {"mode": "hard"},
makes the provisioner power down the node immediately. If the
mode isn't set, or is malformed, we still retain the softPowerOff()
behaviour.
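The annotation handling described in this commit message could look roughly like the sketch below: the annotation value is parsed as JSON, and anything other than a well-formed `{"mode": "hard"}` falls back to the soft behaviour. The names `rebootArgs` and `rebootModeFromAnnotation` are hypothetical and exist only for this example; the real BMO types and function names may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// RebootMode names how a power-off should be carried out.
type RebootMode string

const (
	RebootModeSoft RebootMode = "soft"
	RebootModeHard RebootMode = "hard"
)

// rebootArgs is a hypothetical struct mirroring the {"mode": "..."} payload.
type rebootArgs struct {
	Mode RebootMode `json:"mode"`
}

// rebootModeFromAnnotation is a hypothetical helper. It returns
// RebootModeHard only for a well-formed payload whose mode is "hard";
// an empty, malformed, or unrecognised value keeps the soft default.
func rebootModeFromAnnotation(value string) RebootMode {
	var args rebootArgs
	if err := json.Unmarshal([]byte(value), &args); err != nil {
		return RebootModeSoft
	}
	if args.Mode == RebootModeHard {
		return RebootModeHard
	}
	return RebootModeSoft
}

func main() {
	for _, v := range []string{`{"mode": "hard"}`, `{"mode": "soft"}`, ``, `not json`} {
		fmt.Printf("%q -> %s\n", v, rebootModeFromAnnotation(v))
	}
}
```

Defaulting to soft on every error path matches the stated intent that a missing or malformed annotation must not change the existing behaviour.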
The default reboot-interface behaviour is to attempt a soft power off and, if this fails, fall back to a hard power off (PR openshift#294). For high-availability use cases we need the ability to power off a node immediately. This PR addresses that requirement and is part of a wider solution that requires the CAPBM to set the annotation detailed and implemented in this commit. The baseline provisioner API changes were provided in an earlier commit. CAPBM PR: openshift/cluster-api-provider-baremetal#138 Also see: https://bugzilla.redhat.com/show_bug.cgi?id=1927678
In this commit we add further integration for the RebootMode type and no longer rely on a boolean for understanding whether the reboot request was for a hardPowerOff() or softPowerOff(). This will allow us to expand the modes we support later down the line if required without any significant modifications required to the provisioner API.
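To show why a mode type extends more easily than a boolean, here is a minimal sketch of a power-off dispatch keyed on `RebootMode`. The `powerOff` function, its function-valued parameters, and `errSoftFailed` are assumptions for this example and stand in for the provisioner's real softPowerOff() and hardPowerOff() calls.

```go
package main

import (
	"errors"
	"fmt"
)

// RebootMode names how a power-off should be carried out.
type RebootMode string

const (
	RebootModeSoft RebootMode = "soft"
	RebootModeHard RebootMode = "hard"
)

// errSoftFailed is a hypothetical error used to simulate a failed soft power off.
var errSoftFailed = errors.New("soft power off failed")

// powerOff is a hypothetical dispatcher. In hard mode it powers the node
// down immediately; in any other mode it tries soft first and falls back
// to hard only if soft fails. A new mode would be one more case here,
// with no change to the function signature.
func powerOff(mode RebootMode, soft, hard func() error) error {
	switch mode {
	case RebootModeHard:
		return hard()
	default:
		if err := soft(); err != nil {
			return hard()
		}
		return nil
	}
}

func main() {
	var calls []string
	soft := func() error { calls = append(calls, "soft"); return errSoftFailed }
	hard := func() error { calls = append(calls, "hard"); return nil }

	_ = powerOff(RebootModeSoft, soft, hard)
	fmt.Println(calls) // soft attempted first, then hard as fallback

	calls = nil
	_ = powerOff(RebootModeHard, soft, hard)
	fmt.Println(calls) // hard only
}
```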
@rdoxenham: No Bugzilla bug is referenced in the title of this pull request.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: rdoxenham
/retitle Bug 1936407: Backport of BMO code to 4.7 to support different reboot modes Misplaced
@rdoxenham: This pull request references Bugzilla bug 1936407, which is invalid:
…en't available in 4.7
/bugzilla refresh
@rdoxenham: This pull request references Bugzilla bug 1936407, which is invalid:
Yeah, that's a good point, we previously backported some changes already (https://github.com/openshift/baremetal-operator/pull/127/files). I don't have a good sense of what backporting this Delayed change means. It's been tough since some late-arriving bug fixes were built on top of substantial BMO changes that landed upstream around the time 4.7 was cut.
@stbenjam @hardys @honza - I raised #132 which deals with the conflict resolution, and it's much cleaner. I don't think this will lead to any significant conflict management later down the line. I suggest we close this one and go with the other, and I will retitle it to coincide with what this PR was attempting to be. I'll let you decide how to proceed. Thx!
OK, sounds good, thanks, let's go with #132
/close
@hardys: Closed this PR.
@rdoxenham: This pull request references Bugzilla bug 1936407. The bug has been updated to no longer refer to the pull request using the external bug tracker. All external bug links have been closed. The bug has been moved to the NEW state.
Here we're pulling in the BMO code that lets clients request flexible reboot modes: soft reboots when appropriate, e.g. maintenance, and hard reboots when remediation and workload recovery need to take place. This code originally landed in metal3-io/baremetal-operator/795, was backported to OpenShift in openshift/baremetal-operator/128, and is now being backported to 4.7.z. We had to pull in a few additional commits that landed in 4.8 as dependencies for the three commits to land.
BZ to track: https://bugzilla.redhat.com/show_bug.cgi?id=1936407