✨ Machine controller: drain node before machine deletion #1096
Conversation
Force-pushed from 5531a5c to e55ab5e
Force-pushed from e55ab5e to 396a671
First pass, overall looks like a good step forward
One note though is that this only targets v1alpha2. For v1alpha1 we'd need to open a different PR to adapt the changes against the release-0.1 branch.
@michaelgugino do you plan to add cordoning to this PR too?
@ncdc cordoning is part of the library being added here already.
Force-pushed from 396a671 to aeac80b
Force-pushed from aeac80b to 53b2cfc
I think we should be using the same drain code as kubectl does, and not adding an external dependency that will drift. That may well require a k/k dependency or code duplication while we shuffle things around.
/hold For v1alpha2 we shouldn't use any external libraries; instead, prefer waiting for the drain code to be out of k/k, or vendor k/k as a last resort.
I continued the drain refactoring in k/k that @errordeveloper started: kubernetes/kubernetes#80045. We could aim to get that merged into k/k, and then copy-paste pkg/kubectl/drain into our code until we are based on a version of k/k that includes the PR. (Or we can look at splitting pkg/kubectl/drain somewhere in staging, e.g. client-go, to make it even easier to adopt, but there would still be a lag.) Either way, the dependencies of pkg/kubectl/drain are pretty reasonable (https://github.com/kubernetes/kubernetes/pull/80045/files#diff-70b6e8155bf662f65f49c4656d3be7bb).
@justinsb that PR looks good. I can try to mock something up using that file.
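For context, a minimal sketch of how the copied drain helper could be wired up, pieced together from the Helper fields and the RunCordonOrUncordon/RunNodeDrain calls that appear later in this PR. The import path for the in-tree copy and the Timeout/Out/ErrOut fields are assumptions, not verified against the pinned upstream commit:

    package controllers

    import (
        "os"
        "time"

        corev1 "k8s.io/api/core/v1"
        "k8s.io/client-go/kubernetes"

        // Assumed import path for the in-tree copy of pkg/kubectl/drain.
        kubedrain "sigs.k8s.io/cluster-api/external-libs/kubedrain"
    )

    // cordonAndDrain marks the node unschedulable so no new pods land on it,
    // then evicts (or deletes) the pods currently running there.
    func cordonAndDrain(client kubernetes.Interface, node *corev1.Node) error {
        drainer := &kubedrain.Helper{
            Client:              client,
            IgnoreAllDaemonSets: true,
            DeleteLocalData:     true,
            GracePeriodSeconds:  -1,
            // If a pod is not gone in time, give up and retry on the next reconcile.
            Timeout: 20 * time.Second,
            Out:     os.Stdout,
            ErrOut:  os.Stderr,
        }
        if err := kubedrain.RunCordonOrUncordon(drainer, node, true); err != nil {
            return err
        }
        return kubedrain.RunNodeDrain(drainer, node.Name)
    }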
Force-pushed from 53b2cfc to fccaae7
    }

    // Requestor might not have permissions to get DaemonSets when ignoring
    if apierrors.IsForbidden(err) && d.IgnoreAllDaemonSets {
Added this here: kubernetes/kubernetes#80129
If it's not accepted, we can remove this. It's not necessary if you have what most would argue are proper permissions.
controllers/machine_controller.go (Outdated)
    var err error
    kubeClient, err = kubernetes.NewForConfig(r.config)
    if err != nil {
        return fmt.Errorf("unable to build kube client: %v", err)
Suggested change:
-   return fmt.Errorf("unable to build kube client: %v", err)
+   return errors.Errorf("unable to build kube client: %v", err)
Do we want to record an event here against the Machine so the user is informed?
We don't record an event for this in deleteNode; we should probably record the event there, since we'll fall through to that method anyway.
@detiber I have some questions in here about recording events - ptal
Re all my questions on events: I think it would be useful to have a single place where we record either the success or failure of the entire cordon + drain operation. Perhaps after we call …
If some kubelet is not responding (e.g. the machine failed permanently), what do we recommend if I delete the Machine object? Pods scheduled to the node will eventually be evicted (unless they are exempt from eviction) by the cluster's control plane, so after that point, I think the drain (as currently implemented) will be a no-op. To be able to delete the Machine prior to this point, will the …
The only edge case right now is if there are pods with local storage. They won't be successfully evicted; they'll just hang around forever because the API server can't tell the kubelet to clean up. In that case, you'd need to use the exclude-draining annotation. In all other cases (the machine has been stopped/down hard, or there are no pods with local storage), there is nothing special you need to do. Just delete the Machine object; once all the pods are no longer on the node, you're good.
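To make that escape hatch concrete, a hedged sketch of opting a Machine out of draining before deleting it. Only the ExcludeNodeDrainingAnnotation constant comes from this PR's diff (and the controller only checks for the key's presence); the v1alpha2 import path and client wiring are assumptions:

    package controllers

    import (
        "context"

        clusterv1 "sigs.k8s.io/cluster-api/api/v1alpha2" // assumed path for the v1alpha2 types
        "sigs.k8s.io/controller-runtime/pkg/client"
    )

    // excludeFromDrainAndDelete annotates the Machine so reconcileDelete skips
    // the drain step, then deletes it. Only the annotation key matters; the
    // value is ignored by the controller.
    func excludeFromDrainAndDelete(ctx context.Context, c client.Client, m *clusterv1.Machine) error {
        if m.Annotations == nil {
            m.Annotations = map[string]string{}
        }
        m.Annotations[clusterv1.ExcludeNodeDrainingAnnotation] = ""
        if err := c.Update(ctx, m); err != nil {
            return err
        }
        return c.Delete(ctx, m)
    }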
+1 to adding events for success/failure with cordoning/draining
        r.recorder.Eventf(m, corev1.EventTypeWarning, "FailedDrainNode", "error draining Machine's node: %v", err)
        return ctrl.Result{}, err
    }
    r.recorder.Eventf(m, corev1.EventTypeNormal, "SuccessfulDrainNode", "success draining Machine %q node %q", m.Name, m.Status.NodeRef.Name)
Suggested change:
-   r.recorder.Eventf(m, corev1.EventTypeNormal, "SuccessfulDrainNode", "success draining Machine %q node %q", m.Name, m.Status.NodeRef.Name)
+   r.recorder.Eventf(m, corev1.EventTypeNormal, "SuccessfulDrainNode", "success draining Machine's node %q", m.Status.NodeRef.Name)
The event is against m (the machine), so we can omit the machine's name from the text.
        IgnoreAllDaemonSets: true,
        DeleteLocalData:     true,
        GracePeriodSeconds:  -1,
        // If a pod is not evicted in 20 second, retry the eviction next time the
Suggested change:
-   // If a pod is not evicted in 20 second, retry the eviction next time the
+   // If a pod is not evicted in 20 seconds, retry the eviction next time the
    } else {
        verbStr = "deleted"
    }
    klog.Infof("pod %s/%s %s\n", pod.Namespace, pod.Name, verbStr)
I don't think you need the \n?
Would it make sense to define this function within drainNode so that we can add the Machine and Node names into the output?
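Something like the following could work; it assumes the copied Helper exposes the upstream OnPodDeletedOrEvicted hook (field name not verified against the pinned commit) and that the in-tree copy lives under an external-libs import path. Building the callback inside drainNode lets it capture the Machine and Node names:

    package controllers

    import (
        corev1 "k8s.io/api/core/v1"
        "k8s.io/client-go/kubernetes"
        "k8s.io/klog"

        // Assumed import path for the in-tree copy of pkg/kubectl/drain.
        kubedrain "sigs.k8s.io/cluster-api/external-libs/kubedrain"
    )

    // newDrainer builds the drain helper inside drainNode's scope so the pod
    // logging callback can say which Machine and Node the pod belonged to.
    func newDrainer(client kubernetes.Interface, machineName, nodeName string) *kubedrain.Helper {
        return &kubedrain.Helper{
            Client: client,
            // ...other options as configured elsewhere in this PR...
            OnPodDeletedOrEvicted: func(pod *corev1.Pod, usingEviction bool) {
                verbStr := "deleted"
                if usingEviction {
                    verbStr = "evicted"
                }
                klog.Infof("Pod %s/%s %s (Machine %q, Node %q)",
                    pod.Namespace, pod.Name, verbStr, machineName, nodeName)
            },
        }
    }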
A few nits, but otherwise lgtm. Thank you for getting this together.
    }

    if err := kubedrain.RunCordonOrUncordon(drainer, node, true); err != nil {
        // Machine still tries to terminate after drain failure
Suggested change:
-   // Machine still tries to terminate after drain failure
+   // Machine still tries to terminate after cordon failure
Is this comment accurate? Maybe something more along the lines of:
    // Machine will be re-reconciled after a cordon failure
would be more accurate, since we are returning any errors returned from drainNode back to controller-runtime.
    }

    if err := kubedrain.RunNodeDrain(drainer, node.Name); err != nil {
        // Machine still tries to terminate after drain failure
Is this comment accurate? Maybe something more along the lines of:
    // Machine will be re-reconciled after a drain failure
would be more accurate, since we are returning any errors returned from drainNode back to controller-runtime.
    } else {
        verbStr = "deleted"
    }
    klog.Infof("pod %s/%s %s\n", pod.Namespace, pod.Name, verbStr)
Would it make sense to define this function within drainNode so that we can add the Machine and Node names into the output?
All minor comments, non-blocking and can be in another PR :)
Thanks for doing this, looks great!
@@ -184,6 +189,15 @@ func (r *MachineReconciler) reconcileDelete(ctx context.Context, cluster *cluste
        return ctrl.Result{}, err
    }
} else {
    // Drain node before deletion
    if _, exists := m.ObjectMeta.Annotations[clusterv1.ExcludeNodeDrainingAnnotation]; !exists {
        klog.Infof("Draining node %q for machine %q", m.Status.NodeRef.Name, m.Name)
Suggested change:
-   klog.Infof("Draining node %q for machine %q", m.Status.NodeRef.Name, m.Name)
+   klog.Infof("Draining node %q for machine %q in namespace %q", m.Status.NodeRef.Name, m.Name, m.Namespace)
            cluster.Name, nodeName, err)
        return nil
    }
    var err2 error
Nit: Reuse err?
    if err != nil {
        if apierrors.IsNotFound(err) {
            // If an admin deletes the node directly, we'll end up here.
            klog.Infof("Machine %v: Could not find node %v from noderef, it may have already been deleted: %v", machineName, nodeName, cluster.Name)
Suggested change:
-   klog.Infof("Machine %v: Could not find node %v from noderef, it may have already been deleted: %v", machineName, nodeName, cluster.Name)
+   klog.Infof("Machine %v: Could not find node %v from noderef, it may have already been deleted", machineName, nodeName)
Or restructure so the Cluster is next to the machine? It seems a little weird to have the cluster at the end
        verbStr = "evicted"
    } else {
        verbStr = "deleted"
    }
Consider:
    verbStr := "deleted"
    if usingEviction {
        verbStr = "evicted"
    }
    } else {
        verbStr = "deleted"
    }
    klog.Infof("pod %s/%s %s\n", pod.Namespace, pod.Name, verbStr)
Suggested change:
-   klog.Infof("pod %s/%s %s\n", pod.Namespace, pod.Name, verbStr)
+   klog.Infof("Pod %s/%s has been %s", pod.Namespace, pod.Name, verbStr)
@@ -0,0 +1,2 @@
This directory is used to copy code from other projects in-tree. This …
In other CAP* projects I've seen /third-party/ used to hold externally copied dependencies; would you consider switching for consistency?
@@ -242,6 +256,73 @@ func (r *MachineReconciler) isDeleteNodeAllowed(ctx context.Context, machine *cl
    }
}

func (r *MachineReconciler) drainNode(ctx context.Context, cluster *clusterv1.Cluster, nodeName string, machineName string) error {
    var kubeClient kubernetes.Interface
You can also declare err up here to be reused.
Don't need to hold this any more - we can lgtm & approve once it's ready (right? 😄) /hold cancel
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: michaelgugino, ncdc. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
I'm happy to merge this as is and I can work on a follow-up PR to address the comments, so we don't have to hold it any longer.
I'm ok with that, but let's let @michaelgugino weigh in, in case he wants to make those changes now?
Force-pushed from dd2a109 to 820907a
The node draining code is copied from github.com/kubernetes/kubectl/pkg/drain @ 75fdf29ade9e535ff5801a9321d55d1adf6a996b. We copy the drain code directly from upstream; however, this code does not pass our linters as-is. This commit disables linting for the external-libs directory so we don't have to carry local patches.
Force-pushed from 820907a to 1ec39a6
/lgtm
What this PR does / why we need it:
Centralizes node-drain behavior into cluster-api instead of downstream actuators.
Which issue(s) this PR fixes (optional, in fixes #&lt;issue number&gt;(, fixes #&lt;issue_number&gt;, ...) format, will close the issue(s) when PR gets merged): Fixes #994
The node draining code is copied from github.com/kubernetes/kubectl/pkg/drain @75fdf29ade9e535ff5801a9321d55d1adf6a996b
This PR also includes a commit to disable linters for the imported drain code.
Special notes for your reviewer:
Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.
Release note: