Fix fallback for ironic drivers that don't support soft power off #985
metal3-io-bot merged 6 commits into metal3-io:master
The code to check whether a previous PowerOff command succeeded is essentially the same between the hard and soft power off modes, so check it once at the start instead of duplicating it.
If the ironic driver does not support soft power off, we need to fall back to hard power off. Due to an oversight in 67a27dc we stopped checking for this and just returned a transient error in this case. While returning a failure would have resulted in retrying with a hard power off (albeit after reporting an error to the user), returning a transient error means that it will retry the soft power off forever.

Restore the fallback and add tests to cover this case. Because we can't set the dummy ironic server to return different responses on subsequent calls we will always get an error, so check that it is from the fallback hard power off.

Fixes metal3-io#984
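For readers skimming the thread, the fallback described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `powerClient`, `fakeClient`, `errUnrelated`, and the shape of `softPowerOffUnsupportedError` are all hypothetical stand-ins for the real provisioner and Ironic client.

```go
package main

import (
	"errors"
	"fmt"
)

// softPowerOffUnsupportedError is a hypothetical marker error meaning
// "the driver cannot do a soft power off".
type softPowerOffUnsupportedError struct{}

func (softPowerOffUnsupportedError) Error() string {
	return "soft power off is not supported by the driver"
}

// errUnrelated is a hypothetical error that must NOT trigger the fallback.
var errUnrelated = errors.New("connection refused")

// powerClient is a hypothetical stand-in for the Ironic power API.
type powerClient interface {
	SoftPowerOff() error
	HardPowerOff() error
}

// powerOff tries a soft power off when asked to and falls back to a hard
// power off only if the driver reports that soft power off is unsupported.
// The first result reports whether the hard path was used.
func powerOff(c powerClient, soft bool) (bool, error) {
	if soft {
		err := c.SoftPowerOff()
		if err == nil {
			return false, nil
		}
		var unsupported softPowerOffUnsupportedError
		if !errors.As(err, &unsupported) {
			// Any other error is returned as-is; no fallback.
			return false, err
		}
		// Unsupported: fall through to the hard power off below.
	}
	return true, c.HardPowerOff()
}

// fakeClient records how often the hard path is taken.
type fakeClient struct {
	softErr   error
	hardErr   error
	hardCalls int
}

func (f *fakeClient) SoftPowerOff() error { return f.softErr }
func (f *fakeClient) HardPowerOff() error { f.hardCalls++; return f.hardErr }

func main() {
	c := &fakeClient{softErr: softPowerOffUnsupportedError{}}
	usedHard, err := powerOff(c, true)
	fmt.Println(usedHard, err, c.hardCalls)

	other := &fakeClient{softErr: errUnrelated}
	usedHard, err = powerOff(other, true)
	fmt.Println(usedHard, err, other.hardCalls)
}
```

The point of the sketch is the control flow: the "unsupported" case must lead to a different action rather than a plain retry of the same soft power off.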
If we publish the event on success inside changePower, we don't have to have three different methods calling it that all have to check the return values for success. This allows us to eliminate the HostLockedError type. It also fixes a bug where on Power Off we were publishing an event saying the power was turned off in the case where we actually were waiting for a change in the provisioning state.
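A rough sketch of the idea in this commit, under stated assumptions: `eventRecorder`, `powerSetter`, the event reason strings, and the `waitingForStateChange` flag are hypothetical and only illustrate why publishing from one place avoids the spurious "powered off" event.

```go
package main

import "fmt"

// eventRecorder is a hypothetical stand-in for the provisioner's publisher.
type eventRecorder struct{ events []string }

func (r *eventRecorder) publish(reason string) { r.events = append(r.events, reason) }

// powerSetter is a hypothetical stand-in for the Ironic call. It reports
// whether the request was accepted (false when the host is busy or locked).
type powerSetter func(target string) (accepted bool, err error)

// changePower issues the request and publishes an event only on success, so
// callers no longer inspect the result to decide whether to publish.
// The first result reports whether the caller should retry later.
func changePower(set powerSetter, rec *eventRecorder, target string) (bool, error) {
	accepted, err := set(target)
	if err != nil {
		return false, err
	}
	if !accepted {
		return true, nil // host busy: retry later, no event
	}
	rec.publish("Power" + target)
	return false, nil
}

// powerOff returns early, without any event, while we are only waiting for
// a provisioning state change.
func powerOff(set powerSetter, rec *eventRecorder, waitingForStateChange bool) (bool, error) {
	if waitingForStateChange {
		return true, nil
	}
	return changePower(set, rec, "Off")
}

func main() {
	accept := func(string) (bool, error) { return true, nil }
	rec := &eventRecorder{}
	retry, err := powerOff(accept, rec, true)
	fmt.Println(retry, err, rec.events)
	retry, err = powerOff(accept, rec, false)
	fmt.Println(retry, err, rec.events)
}
```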
This is only used for internal flow control, so there is no need to export it from the package. Given that it is the only thing left in the errors.go file, it's also better to define it where it is used.
Since we are already using errors.As() to detect softPowerOffUnsupported, avoid duplicating code by wrapping the error.
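One way to read this commit, sketched with hypothetical types (`httpStatusError`, `classify`, and this particular shape of `softPowerOffUnsupportedError` are assumptions, not the PR's code): if the marker error wraps its cause and implements `Unwrap`, a single error chain can answer both "is soft power off unsupported?" and "what was the underlying failure?" via `errors.As`.

```go
package main

import (
	"errors"
	"fmt"
)

// httpStatusError is a hypothetical low-level error carrying an HTTP status.
type httpStatusError struct{ Code int }

func (e httpStatusError) Error() string {
	return fmt.Sprintf("unexpected HTTP status %d", e.Code)
}

// softPowerOffUnsupportedError wraps the underlying cause instead of
// replacing it, so the cause stays reachable through the error chain.
type softPowerOffUnsupportedError struct{ cause error }

func (e softPowerOffUnsupportedError) Error() string {
	return "soft power off is unsupported: " + e.cause.Error()
}

func (e softPowerOffUnsupportedError) Unwrap() error { return e.cause }

// classify inspects one error chain for both the marker and the cause.
func classify(err error) (unsupported bool, status int) {
	var marker softPowerOffUnsupportedError
	unsupported = errors.As(err, &marker)

	var statusErr httpStatusError
	if errors.As(err, &statusErr) {
		status = statusErr.Code
	}
	return unsupported, status
}

func main() {
	err := fmt.Errorf("changing power state: %w",
		softPowerOffUnsupportedError{cause: httpStatusError{Code: 400}})
	unsupported, status := classify(err)
	fmt.Println(unsupported, status)
}
```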
/test-integration

/lgtm
sadasu left a comment
/lgtm
I see the regression in the previous changes. Thanks for the fix.
@sadasu: adding LGTM is restricted to approvers and reviewers in OWNERS files.
		t.Fatalf("could not create provisioner: %s", err)
	}

	_, err = prov.PowerOff(metal3v1alpha1.RebootModeSoft, false)
Unfortunately, the current mock server does not allow specifying multiple, different responses for the same API call, so I don't think there is currently an easy way to test such a scenario without properly enhancing the mock. Right now, when a response is configured for a call, as here with .WithNodeStatesPowerUpdate(nodeUUID, http.StatusBadRequest), the mock server will always give the same answer to every request.
I think we could discuss this change in a separate PR, as it may also be useful for other tests.
Yeah, I agree that would be a helpful feature to have in the mocks. If that were possible then this could have been implemented in TestPowerOff (and the regression would not have happened) without having to write this separate TestSoftPowerOffFallback test with its own logic.
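To make the idea discussed in this thread concrete, here is a minimal sketch of a handler that returns a different status on each successive call. It is not the repository's mock server: `sequencedHandler` and `collectStatuses` are hypothetical names, the request path is only illustrative, and the sketch uses nothing beyond the standard `net/http/httptest` package.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// sequencedHandler replies with each configured status in turn and repeats
// the last one once the sequence is exhausted.
type sequencedHandler struct {
	mu       sync.Mutex
	statuses []int
	calls    int
}

func (h *sequencedHandler) ServeHTTP(w http.ResponseWriter, _ *http.Request) {
	h.mu.Lock()
	status := http.StatusOK // default when no statuses are configured
	if n := len(h.statuses); n > 0 {
		idx := h.calls
		if idx >= n {
			idx = n - 1
		}
		status = h.statuses[idx]
	}
	h.calls++
	h.mu.Unlock()
	w.WriteHeader(status)
}

// collectStatuses sends n requests straight to the handler (no network)
// and returns the status codes it answered with.
func collectStatuses(h http.Handler, n int) []int {
	got := make([]int, 0, n)
	for i := 0; i < n; i++ {
		rec := httptest.NewRecorder()
		req := httptest.NewRequest(http.MethodPut, "/v1/nodes/example/states/power", nil)
		h.ServeHTTP(rec, req)
		got = append(got, rec.Code)
	}
	return got
}

func main() {
	// First call fails (soft power off rejected), later calls are accepted.
	h := &sequencedHandler{statuses: []int{http.StatusBadRequest, http.StatusAccepted}}
	fmt.Println(collectStatuses(h, 3))
}
```

With something along these lines, a test could let the soft power off request fail and the subsequent hard power off succeed, instead of asserting on the error from the fallback.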
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andfasano
Fix a regression in #841 that caused ironic drivers that don't support soft power off (such as Fujitsu when the agent is not available) to fail in an infinite loop instead of falling back to a hard power off.
Also clean up the code so it is much less arcane.
Fixes #984