Add detached annotation proposal #168

Merged
metal3-io-bot merged 1 commit into metal3-io:master from
hardys:maintenance_mode
May 12, 2021

Conversation

@hardys hardys commented Mar 11, 2021

First draft of a proposal to enable control of maintenance mode via the BMH API.
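For context, the shape described in this first draft is roughly the following (an illustrative sketch only; the host name is hypothetical, and the final merged proposal uses an annotation rather than this spec field):

```yaml
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: worker-0          # illustrative host name
spec:
  maintenance: true       # proposed boolean flag (first-draft shape)
```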

@metal3-io-bot metal3-io-bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. approved Indicates a PR has been approved by an approver from all required OWNERS files. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Mar 11, 2021
hardys commented Mar 11, 2021

/cc @dhellmann @dtantsur @avishayt

@metal3-io-bot (Contributor)

@hardys: GitHub didn't allow me to request PR reviews from the following users: avishayt.

Note that only metal3-io members and repo collaborators can review this PR, and authors cannot review their own PRs.


In response to this:

/cc @dhellmann @dtantsur @avishayt

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

> TODO - we don't expose any reason here, is it enough to have
> a standard reason string e.g. "disabled by BaremetalHost maintenance field"?
Member:

I assume the caller knows why the host is in maintenance mode. We could ask for a string reason, but I'm not sure what we would do with it. A UI or something could always use an annotation to store a message. Unless maybe we would show the reason as part of the default kubectl output for hosts?

Member Author (hardys):

The underlying Ironic API does accept a reason, but I guess the user isn't exposed to that - I was thinking maybe generating an event with the reason could be useful, but it could also just be a generic event saying the host got put into maintenance mode (which simplifies the API as it's just a boolean flag)

Member:

Ah, yeah, I wasn't really concerned with the value in Ironic so much as what we ask for or show on the host resource. Let's leave the reason out for now, under the theory that it's easier to add the field later if we decide we need it.

> TODO - what happens if a BMH is deleted with `maintenance: true` - do we
> skip deprovisioning or should that be an independent annotation/field?
Member:

We should look at what Ironic does and see if we want that to be our default behavior.

@dtantsur thoughts?

Member Author (hardys):

From local testing I'm fairly sure that when you've set the maintenance flag you can just delete the node without any deprovisioning, so aligning with that behavior seems reasonable to me.

Member:

That matches what I remember.

@hardys hardys force-pushed the maintenance_mode branch from 67f25c0 to 0e1b364 Compare March 11, 2021 16:23

> ## Motivation
>
> In a multi-tier deployment where one cluster deploys another, the "parent"

I think in general - transfer ownership between clusters, right?

Member:

There have been some discussions about moving hosts between clusters as part of auto-scaling, but there are some different implications when doing that (we would absolutely need to deprovision those hosts as part of the transfer, to avoid leaking data from one cluster to another, for example). So, I think for this proposal let's stay focused on the multi-tier deployment case.

@kashifest (Member) left a comment:

+1 for this. This will be really helpful. I think also as a non-goal it should be made clear that when ironic is putting a node on maintenance mode, we won't want to sync that back to BMH resource, right or would we?
/cc @maelk

@metal3-io-bot metal3-io-bot requested a review from maelk March 12, 2021 06:53

hardys commented Mar 12, 2021

+1 for this. This will be really helpful. I think also as a non-goal it should be made clear that when ironic is putting a node on maintenance mode, we won't want to sync that back to BMH resource, right or would we?

This is a good point - I was thinking that the BMO sync would remain active (since reconciliation is not paused), but since maintenance mode flag in Ironic pauses all periodic tasks inside Ironic, it's unlikely anything in the BMH will change from Ironic->BMH while in this state.

Another option (mentioned in the alternatives) would be to modify the behavior of the baremetalhost.metal3.io/paused annotation to set the Ironic host to maintenance mode (or even remove it from Ironic completely), but just exposing the maintenance mode flag seems a little cleaner, and this approach would prevent some desirable workflows (in particular the ability to delete a BMH in maintenance mode, skipping deprovisioning)
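For reference, the existing annotation under discussion is applied like this (an illustrative manifest; the host name is hypothetical, and only the annotation key comes from the existing API):

```yaml
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: worker-0                          # illustrative host name
  annotations:
    baremetalhost.metal3.io/paused: ""    # presence alone pauses reconciliation
```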

Comment thread design/baremetal-operator/maintenance-mode.md Outdated
Comment thread design/baremetal-operator/maintenance-mode.md Outdated
hardys commented Mar 15, 2021

Ok so to summarize, we have three options which could satisfy the underlying requirement:

Add spec.maintenance flag

This would set the Ironic maintenance mode flag, as described in the current draft proposal

Advantages:

  • Simple to implement
  • Well understood behavior based on the underlying Ironic API

Disadvantages:

  • Node still exists in Ironic, which could be confusing, and may not be desirable for the multi-tier use-case (where in reality it's not "maintenance" but a potentially permanent change in ownership of the physical server management)
  • In the case where the parent cluster deploys $many child clusters, this solution might not scale as well as some of the others?

Modify baremetalhost.metal3.io/paused behavior

In this case we'd modify the current "paused" behavior, such that:

  • While paused, the host is removed from Ironic
  • It is possible to delete a BMH in paused state (without any deprovisioning actions)

Advantages:

  • Should be fairly simple to implement
  • Avoids adding a net-new API

Disadvantages:

  • Modifies existing API behavior (are we aware of any users of this other than the openshift installer atm?)

Add new "unmanaged" annotation

In this case we'd add a new annotation which either behaves as described above for the modified-pause approach, or we add logic to every state which skips management operations.

Advantages:

  • Avoids changing existing API behavior

Disadvantages:

  • Potentially more complex to implement (in the "every state" case)
  • Could be confusing since it's similar in behavior to the existing "paused" annotation

Currently my preference is to go with modifying the behavior of the paused annotation - what do others think about that?

@celebdor

Currently my preference is to go with modifying the behavior of the paused annotation - what do others think about that?

I also think that's a good way to go.

Can you expand on Add new "unmanaged" annotation's disadvantage of "Potentially more complex to implement (in the "every state" case)"

hardys commented Mar 15, 2021

Can you expand on Add new "unmanaged" annotation's disadvantage of "Potentially more complex to implement (in the "every state" case)"

With the paused annotation approach, we just exit from Reconcile() early.

If we introduced some new annotation which didn't do that, it would likely mean updating many/all states in the controller to look for the new annotation and do nothing when it's set. Just exiting the Reconcile early (but adding some new logic to deal with the removal from Ironic, and to enable deletion) seems like a simpler and less error-prone approach to me.

@dhellmann (Member):

+1 for this. This will be really helpful. I think also as a non-goal it should be made clear that when ironic is putting a node on maintenance mode, we won't want to sync that back to BMH resource, right or would we?
/cc @maelk

This is an interesting point. Does Ironic set this flag on hosts by itself under some conditions, @dtantsur?

@dhellmann (Member):

Modify baremetalhost.metal3.io/paused behavior

In this case we'd modify the current "paused" behavior, such that:

  • While paused, the host is removed from Ironic
  • It is possible to delete a BMH in paused state (without any deprovisioning actions)

Advantages:

  • Should be fairly simple to implement
  • Avoids adding a net-new API

Disadvantages:

  • Modifies existing API behavior (are we aware of any users of this other than the openshift installer atm?)

I believe the feature was added to make it easier for upstream to safely pivot and copy CR content from the provisioning cluster to the new cluster. I don't think the change in behavior will conflict with that use case, but we should confirm that. @kashifest, @fmuyassarov, @maelk what do you think?

Currently my preference is to go with modifying the behavior of the paused annotation - what do others think about that?

+1 for extending the behavior of the annotation we already have.

@andfasano (Member):

Currently my preference is to go with modifying the behavior of the paused annotation - what do others think about that?

👍 for that. I think that the potential ambiguity of a new - similar but slightly different - annotation is worse than modifying the existing behavior of baremetalhost.metal3.io/paused.

The only part that I guess still requires some more clarification is the "unpausing" step: what is the expected behavior when the annotation is removed, considering that the underlying Ironic node has been previously deleted?

hardys commented Mar 15, 2021

The only part that I guess still requires some more clarification is the "unpausing" step: what is the expected behavior when the annotation is removed, considering that the underlying Ironic node has been previously deleted?

I'm thinking it's essentially the same process as recovering from a pod reschedule where the Ironic DB is lost - we'd create the Ironic host again and resume management of the host?

Provided we're confident that process is robust, I think we should be OK to delete the Ironic host, but we'll need to check there aren't unhandled corner cases.

hardys commented Mar 16, 2021

+1 for this. This will be really helpful. I think also as a non-goal it should be made clear that when ironic is putting a node on maintenance mode, we won't want to sync that back to BMH resource, right or would we?
/cc @maelk

This is an interesting point. Does Ironic set this flag on hosts by itself under some conditions, @dtantsur?

Oh, I think I misinterpreted the question from @kashifest in my previous reply, apologies!

Yes, in some circumstances Ironic does put a host into maintenance mode - IIRC it happens when there are issues connecting to the BMC, but @dtantsur can perhaps advise if there are other conditions we should be aware of.

Regardless, IMO that shouldn't be exposed directly via the BMH; it's an internal state which we might reflect via some error status or event, but not expose directly (if we added a maintenance flag, which it seems is not the likely direction for the requirement I'm trying to solve)

hardys commented Mar 16, 2021

Ok thanks all for the feedback, it seems that extending the paused annotation is the preferred approach, so I'll rework this proposal to reflect that.

@andfasano (Member):

The only part that I guess still requires some more clarification is the "unpausing" step: what is the expected behavior when the annotation is removed, considering that the underlying Ironic node has been previously deleted?

I'm thinking it's essentially the same process as recovering from a pod reschedule where the Ironic DB is lost - we'd create the Ironic host again and resume management of the host?

Provided we're confident that process is robust, I think we should be OK to delete the Ironic host, but we'll need to check there aren't unhandled corner cases.

Agreed, I think it would be useful to have this behavior written down in the proposal for clarity

@zaneb (Member) left a comment:

Reusing the paused annotation seems like the tidiest user interface.
My only worry would be that it makes the implementation more complex, since the pause annotation is checked very early but we will now need to make calls to the provisioner when it is present (an obvious place to put the code for e.g. an unmanaged annotation would be in the vicinity of ensureRegistered()).
A key point about the paused annotation is that it is used in pivoting to make sure the Status subresource is not written to (to make sure that it can be regenerated from the status annotation). So that precludes us ever including in the Status any indication that the Host is in maintenance mode and that all of the data (power state &c.) is meaningless.

Comment thread design/baremetal-operator/maintenance-mode.md Outdated
Comment thread design/baremetal-operator/maintenance-mode.md Outdated
@dhellmann (Member):

Reusing the paused annotation seems like the tidiest user interface.
My only worry would be that it makes the implementation more complex, since the pause annotation is checked very early but we will now need to make calls to the provisioner when it is present (an obvious place to put the code for e.g. an unmanaged annotation would be in the vicinity of ensureRegistered()).
A key point about the paused annotation is that it is used in pivoting to make sure the Status subresource is not written to (to make sure that it can be regenerated from the status annotation). So that precludes us ever including in the Status any indication that the Host is in maintenance mode and that all of the data (power state &c.) is meaningless.

These are good points. I especially hadn't considered the fact that we couldn't write to the status fields. I am starting to think that adding a new annotation, while expanding the UI, would make the implementation easier to follow and verify. Given the complexity of managing the host state, I would err on the side of simplifying the implementation in this instance.


hardys commented Mar 16, 2021

These are good points. I especially hadn't considered the fact that we couldn't write to the status fields

Do we need to write to the status fields? The only time I can think that might be necessary is if we encounter an unrecoverable error deleting the underlying host from Ironic (when first encountering the paused annotation), but we could generate an event without updating the host status in that case?

@dhellmann (Member):

These are good points. I especially hadn't considered the fact that we couldn't write to the status fields

Do we need to write to the status fields? The only time I can think that might be necessary is if we encounter an unrecoverable error deleting the underlying host from Ironic (when first encountering the paused annotation), but we could generate an event without updating the host status in that case?

The error handling relies on a counter in the status field to manage the retry interval backoff. We wouldn't be able to update that field. Maybe we don't need to? Talking to Ironic can result in errors, even if they are just 409s, though. If we can ensure that all of the pause handling happens outside of the state machine, we should be able to avoid the backoff logic and maybe that would be OK.

It would be useful to see if the amount of change in Reconcile() matches what @zaneb describes, too. The pause annotation happens very early. Between that point and where we have a Provisioner, we handle deletion, the status annotation, and the hardware annotation. I guess we're talking about allowing delete operations while paused? So maybe we only need to update the logic around the other annotations to skip them if we're paused. And then when we get to the point where we have a provisioner, if we're paused we would handle that instead of entering the state machine. We wouldn't be able to update the status fields on the host to record that we have handled it (by clearing the provisioning ID field), though, so every time the host is reconciled we would look at Ironic to verify that the host is not present. I guess once that's done, we don't need to keep reconciling the host, so it wouldn't be too expensive.

So, I think we could still do it with the pause annotation. It's going to complicate Reconcile(), but I don't think there's a way to do this that won't add some complexity there.


hardys commented Mar 17, 2021

Ok this was discussed today in the community meeting, in summary:

  • @maelk pointed out that the CAPI pivot relies on us not updating the status, and as mentioned above by @dhellmann that may not be ideal in the case where there is some error deleting the host from Ironic
  • @zaneb mentioned it may be simpler to implement if we go with a new annotation
  • Some other use-cases for maintenance mode were discussed, but I think for the multi-tier use-case we're agreed that removing the host from Ironic is probably preferable as it's simpler to implement and avoids potential corner-cases around adoption.

Accordingly I'll update this proposal to describe a new annotation and attempt to capture the feedback so far, thanks!

@metal3-io-bot metal3-io-bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Mar 17, 2021

hardys commented Mar 17, 2021

/retitle Add unmanaged mode proposal

@metal3-io-bot metal3-io-bot changed the title WIP: Add maintenance mode proposal Add unmanaged mode proposal Mar 17, 2021
@metal3-io-bot metal3-io-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 17, 2021
@hardys hardys force-pushed the maintenance_mode branch 2 times, most recently from d73befc to eaab0ae Compare March 18, 2021 12:03
Comment thread design/baremetal-operator/unmanaged-annotation.md Outdated
@dtantsur (Member) left a comment:

👍🏽

@metal3-io-bot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dtantsur, hardys

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


hardys commented Mar 23, 2021

WIP implementation ref metal3-io/baremetal-operator#827 - feedback welcome

Comment thread design/baremetal-operator/unmanaged-annotation.md Outdated
> triggering deprovisioning
> * Ensure it is possible to restart management of BMH resources (in the case
>   where they are not deleted from the parent cluster)
> * Avoid changing behavior of the existing `paused` annotation since
Member:

This sentence wasn't completed.

> include context from the system/user which applies the annotation.
>
> This annotation will be evaluated early in the `Reconcile()` loop, but after the
> `paused` `status` and `hardwaredetails` annotations are evaluated.
Member:

Why do we want to do it after those other annotations are handled? Does it actually matter, or is that just what we're doing?

I think we want to do it after the hardware details and status annotations are handled to avoid triggering inspection when the unmanaged annotation is set?

It seems like we want to handle it before the paused annotation, since unmanaged is a more comprehensive way to say "stop paying attention to this host" than paused is.

Would it be safer for this new annotation to only apply to hosts in a steady state? What happens if provisioning starts and this annotation is added, for example? Or deprovisioning?

Member:

It definitely must be handled after paused and status. I think in the case of hardwaredetails that might just be where we're planning on doing it rather than a strict requirement.

The question of what happens in a non-steady state is a good one. This might be a good reason to use the Unmanaged provisioning state as we discussed, since it means the host will always return to the Registering or Provisioned states afterwards. We might want to ignore the annotation if we are in the Deprovisioning state though, because AFAIR there's no way to distinguish whether the host was Provisioned or Deprovisioning before going to Unmanaged.

Member Author (hardys):

I think in the case of hardwaredetails that might just be where we're planning on doing it rather than a strict requirement

Yes it's not a strict requirement, but I was thinking there's not really any reason to prevent consuming hardwaredetails while in the unmanaged mode, thus what I proposed in metal3-io/baremetal-operator#827

The question of what happens in a non-steady state is a good one. This might be a good reason to use the Unmanaged provisioning state as we discussed, since it means the host will always return to the Registering or Provisioned states afterwards

Agreed I'd not really considered this case - previous feedback from @dhellmann was that we wouldn't want to add a new transition to the Unmanaged state, but it seems like the cleanest way to enforce that is to handle this mode via the state machine?

The other option is to only allow this annotation to function when in a specific steady state (e.g for the pivot use-case, only Provisioned would be enough)

Member:

It definitely must be handled after paused and status. I think in the case of hardwaredetails that might just be where we're planning on doing it rather than a strict requirement.

If we handle this after the paused annotation, that means pausing has higher precedence than marking it as unmanaged. I'm not sure that makes sense.

The question of what happens in a non-steady state is a good one. This might be a good reason to use the Unmanaged provisioning state as we discussed, since it means the host will always return to the Registering or Provisioned states afterwards

Agreed I'd not really considered this case - previous feedback from @dhellmann was that we wouldn't want to add a new transition to the Unmanaged state, but it seems like the cleanest way to enforce that is to handle this mode via the state machine?

The other option is to only allow this annotation to function when in a specific steady state (e.g for the pivot use-case, only Provisioned would be enough)

Can we handle it in actionManageSteadyState in the controller?

Member:

If we handle this after the paused annotation, that means pausing has higher precedence than marking it as unmanaged. I'm not sure that makes sense.

Pausing is meant to appear to stop reconciliation, so the only way it does make sense is if it has higher precedence than everything.

Can we handle it in actionManageSteadyState in the controller?

No, because it would get re-registered first.

Member:

Yes, it is really a requirement for pivoting as done with CAPI and CAPM3 that the pausing annotation is the first thing handled and has the highest precedence

@kashifest (Member):

@hardys if I understand correctly, this proposal then wouldn't sync BMH api with ironic's maintenance mode? I mean there wouldn't be any BMH spec field from which we would be able to set maintenance mode in ironic?

Comment thread design/baremetal-operator/unmanaged-annotation.md Outdated

hardys commented Mar 31, 2021

@hardys if I understand correctly, this proposal then wouldn't sync BMH api with ironic's maintenance mode? I mean there wouldn't be any BMH spec field from which we would be able to set maintenance mode in ironic?

@kashifest right, that was my initial proposal, but now the host is removed completely from Ironic ref metal3-io/baremetal-operator#827 - we could still add a spec.maintenance field, but based on the discussion here it doesn't sound like the best fit for the use-case I'm currently trying to solve.

@hardys hardys closed this Mar 31, 2021
@hardys hardys reopened this Mar 31, 2021
@hardys hardys force-pushed the maintenance_mode branch from dfbccf5 to f5b716f Compare April 1, 2021 16:24
@hardys hardys force-pushed the maintenance_mode branch from f5b716f to a82e5bb Compare April 1, 2021 16:25

hardys commented Apr 1, 2021

/retitle Add detached annotation proposal

@metal3-io-bot metal3-io-bot changed the title Add unmanaged mode proposal Add detached annotation proposal Apr 1, 2021

hardys commented Apr 1, 2021

Updated to align with metal3-io/baremetal-operator#827
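With the updated proposal, detaching a host then looks like this (an illustrative manifest; the host name is hypothetical, and the annotation key follows the retitled proposal):

```yaml
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: worker-0                            # illustrative host name
  annotations:
    baremetalhost.metal3.io/detached: ""    # host removed from Ironic, BMH left intact
```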


> When the `detached` annotation is set, we will check the `status.provisioning.ID`
> and if necessary delete the corresponding host from Ironic (without triggering
> deprovisioning)
@andfasano (Member), Apr 2, 2021:

Maybe it would help to add that while the annotation remains, BMO will ignore any event for that resource (except the annotation removal, which resumes operations)?

@andfasano (Member):

@hardys the current version LGTM, anything else missing?


hardys commented May 12, 2021

@hardys the current version LGTM, anything else missing?

@andfasano no I think this can merge now, particularly since the related BMO implementation landed some time ago ;)

@andfasano (Member):

Ok thanks

/lgtm

@metal3-io-bot metal3-io-bot added the lgtm Indicates that a PR is ready to be merged. label May 12, 2021
@metal3-io-bot metal3-io-bot merged commit 59d5125 into metal3-io:master May 12, 2021
