Skip to content

Conversation

@hardys
Copy link

@hardys hardys commented Mar 10, 2021

Rebase on upstream metal3-io/baremetal-operator to pick up the most recent changes

rdoxenham and others added 23 commits February 17, 2021 14:54
This adds the required changes to the provisioner to support
customised reboot annotations, allowing a node to be powered
off (or fenced) more quickly when required. The default
behaviour prior was to attempt a softPowerOff() first and only
attempt a hardPowerOff() if that failed. With this commit and
its counterparts for the baremetal-operator, by setting the
annotation to have an additional mode, e.g. {"mode": "hard"},
the provisioner will immediately power down the node. If this
isn't listed, or is malformed, we still retain the softPowerOff()
behaviour.
The default reboot-interface behaviour is to attempt a soft power
off, and if this fails, revert to a hard power off (PR metal3-io#294). For high
availability use-cases we require the ability to immediately power-off
a node. This PR attempts to address that requirement and is part of a
wider solution requiring the CAPBM to set the annotation that we have
detailed and implemented in this commit. The baseline provisioner API
changes have been provided in an earlier commit.

CAPBM PR: openshift/cluster-api-provider-baremetal#138
Also see: https://bugzilla.redhat.com/show_bug.cgi?id=1927678
In this commit we add further integration for the RebootMode type
and no longer rely on a boolean for understanding whether the reboot
request was for a hardPowerOff() or softPowerOff(). This will allow
us to expand the modes we support later down the line if required
without any significant modifications required to the provisioner
API.
docs: clarifies power management for externally managed hosts
Implement explicit reboot mode options
Currently we force the status annotation to provide every
field, but when metal3-io#796 lands it may be preferable to allow
a subset, so it's possible to set only a subset of the
hardware status fields when inspection is disabled.
Make BMH HardwareDetails fields optional
Signed-off-by: zouyu <[email protected]>
Support RAID configuration for baremetal server
This variable has a well known default, just use it.
Development: account for unset GOPATH
As described in metal3-io/metal3-docs#155

This enables addition of only the hardware part of the status,
unlike the status annotation which can only be set on the first
reconcile, since it allows all fields of the status to be modified.
Adds documentation for these interfaces, more information can be found in
metal3-io/metal3-docs#155
- update go.mod
- port to controller-runtime/pkg/log
- Add ctx to Reconcile
- fix UpdateEvent
- change Scheme to GetScheme
- edit unit tests
- add leases rbac permissions
Add inspect.metal3.io/hardwaredetails annotation
@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 10, 2021
@hardys hardys requested a review from honza March 10, 2021 10:46
hardys pushed a commit to hardys/cluster-baremetal-operator that referenced this pull request Mar 10, 2021
This is to align with the rebase proposed via:
openshift/baremetal-operator#133
@hardys hardys changed the title Merge upstream 20210310 103910 Merge upstream 20210310 Mar 10, 2021
@openshift-bot
Copy link

/retest

Please review the full test history for this PR and help us cut down flakes.

13 similar comments
@openshift-bot
Copy link

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link

/retest

Please review the full test history for this PR and help us cut down flakes.

@Gal-Zaidman
Copy link

@hardys e2e-metal-ipi-ovn-ipv6 looks very broken (last time it passed 02-03) is there a different job e2e you can trigger to test the changes?

@andfasano
Copy link

/test e2e-metal-ipi-ovn-ipv6

@andfasano
Copy link

@hardys e2e-metal-ipi-ovn-ipv6 looks very broken (last time it passed 02-03) is there a different job e2e you can trigger to test the changes?

Such job is a mandatory one and it is not recommended to be skipped. On the nightly release branch - even though it looks a little bit flaky - it passed few hours ago https://amd64.ocp.releases.ci.openshift.org/releasestream/4.8.0-0.nightly/release/4.8.0-0.nightly-2021-03-14-134919

@openshift-bot
Copy link

/retest

Please review the full test history for this PR and help us cut down flakes.

@hardys
Copy link
Author

hardys commented Mar 15, 2021

@hardys e2e-metal-ipi-ovn-ipv6 looks very broken (last time it passed 02-03) is there a different job e2e you can trigger to test the changes?

I think as @honza mentioned we need openshift/cluster-baremetal-operator#118 to merge, but I forgot to put a hold on this PR

/hold

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 15, 2021
@hardys
Copy link
Author

hardys commented Mar 15, 2021

openshift/cluster-baremetal-operator#118 is now passing CI, @honza PTAL, thanks!

@honza
Copy link
Member

honza commented Mar 15, 2021

I just LGTM'd openshift/cluster-baremetal-operator#118 so now it's up to CI.

@honza
Copy link
Member

honza commented Mar 15, 2021

/hold cancel

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 15, 2021
@honza
Copy link
Member

honza commented Mar 15, 2021

/test e2e-metal-ipi-ovn-ipv6

@openshift-bot
Copy link

/retest

Please review the full test history for this PR and help us cut down flakes.

2 similar comments
@openshift-bot
Copy link

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link

/retest

Please review the full test history for this PR and help us cut down flakes.

@hardys
Copy link
Author

hardys commented Mar 15, 2021

Ok we now need the CBO change to make it into a 4.8.0-0.ci Accepted build, then hopefully the retests will work

@openshift-merge-robot openshift-merge-robot merged commit b37e044 into openshift:master Mar 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.

Projects

None yet

Development

Successfully merging this pull request may close these issues.