[WIP] Limit the number of hosts simultaneously provisioned (BMO only version) #730
@@ -22,6 +22,7 @@ import (
	"os"
	"strconv"
	"strings"
	"sync"
	"time"

	"github.com/go-logr/logr"

@@ -50,9 +51,21 @@ const (
	unmanagedRetryDelay = time.Minute * 10
	provisionerNotReadyRetryDelay = time.Second * 30
	rebootAnnotationPrefix = "reboot.metal3.io"
	maxProvisioningHostsDefault = 20
)

var maxProvisioningHosts int
func init() {
	maxProvisioningHosts = maxProvisioningHostsDefault
	if maxHostsStr := os.Getenv("MAX_CONCURRENT_PROVISIONING_HOSTS"); maxHostsStr != "" {
		value, err := strconv.Atoi(maxHostsStr)
		if err != nil {
			fmt.Fprintf(os.Stderr, "Cannot start: Invalid value set for variable MAX_CONCURRENT_PROVISIONING_HOSTS=%s", maxHostsStr)
			os.Exit(1)
		}
		maxProvisioningHosts = value
	}
}

// BareMetalHostReconciler reconciles a BareMetalHost object
@@ -61,6 +74,22 @@ type BareMetalHostReconciler struct {
	Log logr.Logger
	Scheme *runtime.Scheme
	ProvisionerFactory provisioner.Factory

	currentProvisioningHosts map[string]struct{}
	mu sync.Mutex
Member: It would be preferable to have a name that describes what the mutex is protecting.

Author: OK. I can move them into a separate struct to make it even clearer, and it could help a bit with testing.
}

// NewBareMetalHostReconciler creates a new reconciler instance
func NewBareMetalHostReconciler(client client.Client, scheme *runtime.Scheme, factory provisioner.Factory) *BareMetalHostReconciler {
	r := BareMetalHostReconciler{
		Client: client,
		Log: ctrl.Log.WithName("controllers").WithName("BareMetalHost"),
		Scheme: scheme,
		ProvisionerFactory: factory,

		currentProvisioningHosts: make(map[string]struct{}),
	}
	return &r
}

// Instead of passing a zillion arguments to the action of a phase,
@@ -914,9 +943,62 @@ func hostHasFinalizer(host *metal3v1alpha1.BareMetalHost) bool {
	return utils.StringInList(host.Finalizers, metal3v1alpha1.BareMetalHostFinalizer)
}

// Resync the set at startup
func (r *BareMetalHostReconciler) syncProvisioningHosts(mgr ctrl.Manager) {
	hosts := &metal3v1alpha1.BareMetalHostList{}
Member: The controller runtime uses a caching client, which means when we ask for resources that we're subscribed to, we look at the cache and get a list back quickly. If we eliminate the mutex management and just write a function that returns a number, it should be plenty fast enough to call every time we start to reconcile.

Author: I get your point and will give it a try, but I'm not sure it will work, since the cache is not updated immediately, so there is still a risk of a surge (whereas with a synchronized set every thread always has an effectively up-to-date view). Anyhow, I'll check.

Member: This PR uses the same cache to initialize the data structure, right?

Author: Not really, I use the APIReader() returned by the manager to access the API server directly.

(A sketch of the suggested counting approach appears after the end of this function.)
	err := mgr.GetAPIReader().List(context.TODO(), hosts)
	if err != nil {
		os.Exit(1)
	}

	for _, host := range hosts.Items {
		r.checkProvisioningHost(host, host.Status.Provisioning.State)
	}
}
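A rough sketch of the counting approach suggested in the thread above. The name countProvisioningHosts is illustrative and not part of this PR; it assumes the file's existing imports and the cached client held by the reconciler.

// Sketch only: count in-flight provisioning hosts through the
// controller-runtime caching client on every reconcile, instead of
// maintaining a mutex-protected set.
func (r *BareMetalHostReconciler) countProvisioningHosts(ctx context.Context) (int, error) {
	hosts := &metal3v1alpha1.BareMetalHostList{}
	if err := r.Client.List(ctx, hosts); err != nil {
		return 0, err
	}
	count := 0
	for _, host := range hosts.Items {
		switch host.Status.Provisioning.State {
		case metal3v1alpha1.StateInspecting, metal3v1alpha1.StateProvisioning:
			count++
		}
	}
	return count, nil
}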
// Check if there's a free slot for hosts to be provisioned
func (r *BareMetalHostReconciler) checkProvisioningHost(host metal3v1alpha1.BareMetalHost, state metal3v1alpha1.ProvisioningState) bool {
	r.mu.Lock()
	defer r.mu.Unlock()

	switch state {
	case metal3v1alpha1.StateInspecting, metal3v1alpha1.StateProvisioning:
Member: What about deprovisioning? Isn't that just as hard on ironic-conductor? I don't know that we ever want to block deprovisioning, but we probably don't want to start provisioning a bunch of stuff when we already have a large amount of deprovisioning in flight.

Author: That's a good question; in reality deprovisioning wasn't part of the initial proposal, which focused only on the provisioning phase. Maybe an Ironic expert could comment better (@dtantsur?). Anyhow, the deleting part appeared to be not as hard as the provisioning part.

(An illustrative variant that also counts deprovisioning appears after this function.)
		if len(r.currentProvisioningHosts) >= maxProvisioningHosts {
			return false
		}
		r.currentProvisioningHosts[host.Name] = struct{}{}
Member: This assumes that we will successfully write the state change. If writing that fails, we will end up back here checking for capacity again against a map that already contains the host. The good news is that this is a map, not a list, so we can only store one copy, and a single host repeatedly failing will not fill up our quota and block everything. Nevertheless, this exposes us to systemic risk: e.g. if we try to provision 20 hosts and then the k8s API becomes briefly unavailable for some (presumably not unrelated) reason, the BMO is now deadlocked and cannot provision anything ever again until it is restarted. We have a mechanism in the info struct for deferring function calls until after the write succeeds, and I think we need to use that here.

Author: Nice catch, I'll check how to use that.

(A sketch using a post-save callback appears after this function.)
	default:
		delete(r.currentProvisioningHosts, host.Name)
	}

	return true
}
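As an aside to the deprovisioning question above: if deprovisioning were counted against the same limit, the set of states checked would presumably grow to include it. A hypothetical predicate, assuming the file's existing metal3v1alpha1 import; this is not part of the PR.

// Sketch only: states that would count against the limit if deprovisioning
// were included, as raised in the review discussion above.
func countsAgainstProvisioningLimit(state metal3v1alpha1.ProvisioningState) bool {
	switch state {
	case metal3v1alpha1.StateInspecting,
		metal3v1alpha1.StateProvisioning,
		metal3v1alpha1.StateDeprovisioning:
		return true
	}
	return false
}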
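A rough sketch of how the deferred-callback mechanism mentioned in the review thread might be applied, assuming reconcileInfo exposes a postSaveCallbacks slice; the field name and the helper name are assumptions, not code from this PR, and the window between the capacity check and the save is ignored for brevity.

// Sketch only: check capacity now, but record the host in the shared set
// only after the status update has actually been written, so a failed write
// does not leave the host permanently occupying a slot.
func (r *BareMetalHostReconciler) tryReserveProvisioningSlot(info *reconcileInfo) bool {
	r.mu.Lock()
	defer r.mu.Unlock()

	if len(r.currentProvisioningHosts) >= maxProvisioningHosts {
		return false
	}
	info.postSaveCallbacks = append(info.postSaveCallbacks, func() {
		r.mu.Lock()
		defer r.mu.Unlock()
		r.currentProvisioningHosts[info.host.Name] = struct{}{}
	})
	return true
}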
// Check if there's a free slot for hosts that have been previously delayed
func (r *BareMetalHostReconciler) checkDelayedHost(info *reconcileInfo) (bool, actionResult) {
	r.mu.Lock()
	defer r.mu.Unlock()

	switch info.host.Status.ErrorType {
	case metal3v1alpha1.TooManyHostsError:
		if len(r.currentProvisioningHosts) >= maxProvisioningHosts {
			// No available slots, current action delayed
			return true, recordActionFailure(info, metal3v1alpha1.TooManyHostsError, "Delayed, too many hosts")
		}

		// A slot could be available, let's clean up the host and retry
		info.host.ClearError()
		return true, actionContinue{}
	}

	return false, nil
}

// SetupWithManager registers the reconciler to be run by the manager
func (r *BareMetalHostReconciler) SetupWithManager(mgr ctrl.Manager) error {

	r.syncProvisioningHosts(mgr)
Member: I believe there is a race here; we haven't yet started watching resources or obtained the leader lock, so Hosts can be modified after we build our initial state without us seeing the changes (potentially ever, since our updates are edge-triggered).

Author: Is there any other initialization point that could be used, then (before reconciling)? Otherwise, the other approach would be to keep the set in sync at the beginning of every reconcile loop, in line with @dhellmann's suggestion.
	maxConcurrentReconciles := 3
	if mcrEnv, ok := os.LookupEnv("BMO_CONCURRENCY"); ok {
		mcr, err := strconv.Atoi(mcrEnv)
@@ -73,8 +73,13 @@ func recordStateEnd(info *reconcileInfo, host *metal3v1alpha1.BareMetalHost, sta
}

 func (hsm *hostStateMachine) updateHostStateFrom(initialState metal3v1alpha1.ProvisioningState,
-	info *reconcileInfo) {
+	info *reconcileInfo) bool {
	if hsm.NextState != initialState {

		if !hsm.Reconciler.checkProvisioningHost(*info.host, hsm.NextState) {
			return false
		}
Member: It's a little strange that we will be stuck in e.g. the Ready state with an error, rather than the Provisioning state. This does make things simpler, though.
|
|
||
| info.log.Info("changing provisioning state", | ||
| "old", initialState, | ||
| "new", hsm.NextState) | ||
|
|
@@ -104,11 +109,22 @@ func (hsm *hostStateMachine) updateHostStateFrom(initialState metal3v1alpha1.Pro | |
| } | ||
| } | ||
| } | ||
|
|
||
| return true | ||
| } | ||
|
|
||
| func (hsm *hostStateMachine) ReconcileState(info *reconcileInfo) actionResult { | ||
| func (hsm *hostStateMachine) ReconcileState(info *reconcileInfo) (actionRes actionResult) { | ||
| initialState := hsm.Host.Status.Provisioning.State | ||
| defer hsm.updateHostStateFrom(initialState, info) | ||
| defer func() { | ||
| if !hsm.updateHostStateFrom(initialState, info) { | ||
| actionRes = recordActionFailure(info, metal3v1alpha1.TooManyHostsError, "Delayed, too many hosts") | ||
| } | ||
| }() | ||
|
Member: At one stage @dhellmann had a proposal for a generic way to attach actions to particular state transitions, but I don't remember the context...

Member: I suggested adding a …
	// Check if any immediate action is required for a delayed host
	if delayed, actionRes := hsm.Reconciler.checkDelayedHost(info); delayed {
		return actionRes
	}

	if hsm.checkInitiateDelete() {
		info.log.Info("Initiating host deletion")