✨ Machine controller: drain node before machine deletion #1096
Conversation
Force-pushed from 5531a5c to e55ab5e
Force-pushed from e55ab5e to 396a671
First pass, overall looks like a good step forward
One note though is that this only targets v1alpha2. For v1alpha1 we'd need to open a different PR to adapt the changes against the release-0.1 branch.
@michaelgugino do you plan to add cordoning to this PR too?
@ncdc cordoning is part of the library being added here already.
Force-pushed from 396a671 to aeac80b
Force-pushed from aeac80b to 53b2cfc
I think we should be using the same drain code as kubectl does, and not adding an external dependency that will drift. That may well require a k/k dependency or code duplication while we shuffle things around.
/hold For v1alpha2 we shouldn't use any external libraries; instead, prefer waiting for the drain code to be out of k/k, or vendor k/k as a last resort.
I continued the drain refactoring in k/k that @errordeveloper started: kubernetes/kubernetes#80045. We could aim to get that merged into k/k, and then copy-paste pkg/kubectl/drain into our code until we are based on a version of k/k that includes the PR. (Or we can look at splitting pkg/kubectl/drain somewhere in staging, e.g. client-go, to make it even easier to adopt, but there would still be a lag.) Either way, the dependencies of pkg/kubectl/drain are pretty reasonable (https://github.com/kubernetes/kubernetes/pull/80045/files#diff-70b6e8155bf662f65f49c4656d3be7bb).
@justinsb that PR looks good. I can try to mock something up using that file.
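For context, a minimal sketch of how the copied drain helper could be wired up, pieced together from the Helper fields and the RunCordonOrUncordon/RunNodeDrain calls that appear later in this PR. The import path for the in-tree copy and the Timeout/Out/ErrOut fields are assumptions, not verified against the pinned upstream commit:

    package controllers

    import (
        "os"
        "time"

        corev1 "k8s.io/api/core/v1"
        "k8s.io/client-go/kubernetes"

        // Assumed import path for the in-tree copy of pkg/kubectl/drain.
        kubedrain "sigs.k8s.io/cluster-api/external-libs/kubedrain"
    )

    // cordonAndDrain marks the node unschedulable so no new pods land on it,
    // then evicts (or deletes) the pods currently running there.
    func cordonAndDrain(client kubernetes.Interface, node *corev1.Node) error {
        drainer := &kubedrain.Helper{
            Client:              client,
            IgnoreAllDaemonSets: true,
            DeleteLocalData:     true,
            GracePeriodSeconds:  -1,
            // If a pod is not gone in time, give up and retry on the next reconcile.
            Timeout: 20 * time.Second,
            Out:     os.Stdout,
            ErrOut:  os.Stderr,
        }
        if err := kubedrain.RunCordonOrUncordon(drainer, node, true); err != nil {
            return err
        }
        return kubedrain.RunNodeDrain(drainer, node.Name)
    }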
Force-pushed from 53b2cfc to fccaae7
    }

    // Requestor might not have permissions to get DaemonSets when ignoring
    if apierrors.IsForbidden(err) && d.IgnoreAllDaemonSets {
Added this here: kubernetes/kubernetes#80129
If it's not accepted, we can remove this. It's not necessary if you have what most would argue are proper permissions.
controllers/machine_controller.go (Outdated)
    var err error
    kubeClient, err = kubernetes.NewForConfig(r.config)
    if err != nil {
        return fmt.Errorf("unable to build kube client: %v", err)
Suggested change:
-   return fmt.Errorf("unable to build kube client: %v", err)
+   return errors.Errorf("unable to build kube client: %v", err)
Do we want to record an event here against the Machine so the user is informed?
We don't record an event for this in deleteNode; we should probably record the event there, since we'll fall through to that method anyway.
@detiber I have some questions in here about recording events - ptal
Re all my questions on events: I think it would be useful to have a single place where we record either the success or failure of the entire cordon + drain operation. Perhaps after we call …
If some kubelet is not responding (e.g. the machine failed permanently), what do we recommend if I delete the Machine object? Pods scheduled to the node will eventually be evicted (unless they are exempt from eviction) by the cluster's control plane, so after that point, I think the drain (as currently implemented) will be a no-op. To be able to delete the Machine prior to this point, will the …
The only edge case right now is if there are pods with local storage. They won't be successfully evicted; they'll just hang around forever because the API server can't tell the kubelet to clean up. In that case, you'd need to use the exclude-draining annotation. In all other cases (the machine has been stopped/down hard, or there are no pods with local storage), there is nothing special you need to do. Just delete the Machine object; once all the pods are no longer on the node, you're good.
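To make that escape hatch concrete, a hedged sketch of opting a Machine out of draining before deleting it. Only the ExcludeNodeDrainingAnnotation constant comes from this PR's diff (and the controller only checks for the key's presence); the v1alpha2 import path and client wiring are assumptions:

    package controllers

    import (
        "context"

        clusterv1 "sigs.k8s.io/cluster-api/api/v1alpha2" // assumed path for the v1alpha2 types
        "sigs.k8s.io/controller-runtime/pkg/client"
    )

    // excludeFromDrainAndDelete annotates the Machine so reconcileDelete skips
    // the drain step, then deletes it. Only the annotation key matters; the
    // value is ignored by the controller.
    func excludeFromDrainAndDelete(ctx context.Context, c client.Client, m *clusterv1.Machine) error {
        if m.Annotations == nil {
            m.Annotations = map[string]string{}
        }
        m.Annotations[clusterv1.ExcludeNodeDrainingAnnotation] = ""
        if err := c.Update(ctx, m); err != nil {
            return err
        }
        return c.Delete(ctx, m)
    }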
+1 to adding events for success/failure with cordoning/draining
        r.recorder.Eventf(m, corev1.EventTypeWarning, "FailedDrainNode", "error draining Machine's node: %v", err)
        return ctrl.Result{}, err
    }
    r.recorder.Eventf(m, corev1.EventTypeNormal, "SuccessfulDrainNode", "success draining Machine %q node %q", m.Name, m.Status.NodeRef.Name)
Suggested change:
-   r.recorder.Eventf(m, corev1.EventTypeNormal, "SuccessfulDrainNode", "success draining Machine %q node %q", m.Name, m.Status.NodeRef.Name)
+   r.recorder.Eventf(m, corev1.EventTypeNormal, "SuccessfulDrainNode", "success draining Machine's node %q", m.Status.NodeRef.Name)
The event is against m (the machine), so we can omit the machine's name from the text.
        IgnoreAllDaemonSets: true,
        DeleteLocalData:     true,
        GracePeriodSeconds:  -1,
        // If a pod is not evicted in 20 second, retry the eviction next time the
Suggested change:
-   // If a pod is not evicted in 20 second, retry the eviction next time the
+   // If a pod is not evicted in 20 seconds, retry the eviction next time the
    } else {
        verbStr = "deleted"
    }
    klog.Infof("pod %s/%s %s\n", pod.Namespace, pod.Name, verbStr)
I don't think you need the \n?
Would it make sense to define this function within drainNode so that we can add the Machine and Node names into the output?
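Something like the following could work; it assumes the copied Helper exposes the upstream OnPodDeletedOrEvicted hook (field name not verified against the pinned commit) and that the in-tree copy lives under an external-libs import path. Building the callback inside drainNode lets it capture the Machine and Node names:

    package controllers

    import (
        corev1 "k8s.io/api/core/v1"
        "k8s.io/client-go/kubernetes"
        "k8s.io/klog"

        // Assumed import path for the in-tree copy of pkg/kubectl/drain.
        kubedrain "sigs.k8s.io/cluster-api/external-libs/kubedrain"
    )

    // newDrainer builds the drain helper inside drainNode's scope so the pod
    // logging callback can say which Machine and Node the pod belonged to.
    func newDrainer(client kubernetes.Interface, machineName, nodeName string) *kubedrain.Helper {
        return &kubedrain.Helper{
            Client: client,
            // ...other options as configured elsewhere in this PR...
            OnPodDeletedOrEvicted: func(pod *corev1.Pod, usingEviction bool) {
                verbStr := "deleted"
                if usingEviction {
                    verbStr = "evicted"
                }
                klog.Infof("Pod %s/%s %s (Machine %q, Node %q)",
                    pod.Namespace, pod.Name, verbStr, machineName, nodeName)
            },
        }
    }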
A few nits, but otherwise lgtm. Thank you for getting this together.
    }

    if err := kubedrain.RunCordonOrUncordon(drainer, node, true); err != nil {
        // Machine still tries to terminate after drain failure
Suggested change:
-   // Machine still tries to terminate after drain failure
+   // Machine still tries to terminate after cordon failure
Is this comment accurate? Maybe something more along the lines of:
    // Machine will be re-reconciled after a cordon failure
would be more accurate, since we are returning any errors returned from drainNode back to controller-runtime.
    }

    if err := kubedrain.RunNodeDrain(drainer, node.Name); err != nil {
        // Machine still tries to terminate after drain failure
Is this comment accurate? Maybe something more along the lines of:
    // Machine will be re-reconciled after a drain failure
would be more accurate, since we are returning any errors returned from drainNode back to controller-runtime.
    } else {
        verbStr = "deleted"
    }
    klog.Infof("pod %s/%s %s\n", pod.Namespace, pod.Name, verbStr)
Would it make sense to define this function within drainNode so that we can add the Machine and Node names into the output?
All minor comments, non-blocking and can be in another PR :)
Thanks for doing this, looks great!
@@ -184,6 +189,15 @@ func (r *MachineReconciler) reconcileDelete(ctx context.Context, cluster *cluste
        return ctrl.Result{}, err
    }
} else {
    // Drain node before deletion
    if _, exists := m.ObjectMeta.Annotations[clusterv1.ExcludeNodeDrainingAnnotation]; !exists {
        klog.Infof("Draining node %q for machine %q", m.Status.NodeRef.Name, m.Name)
Suggested change:
-   klog.Infof("Draining node %q for machine %q", m.Status.NodeRef.Name, m.Name)
+   klog.Infof("Draining node %q for machine %q in namespace %q", m.Status.NodeRef.Name, m.Name, m.Namespace)
            cluster.Name, nodeName, err)
        return nil
    }
    var err2 error
Nit: Reuse err?
    if err != nil {
        if apierrors.IsNotFound(err) {
            // If an admin deletes the node directly, we'll end up here.
            klog.Infof("Machine %v: Could not find node %v from noderef, it may have already been deleted: %v", machineName, nodeName, cluster.Name)
Suggested change:
-   klog.Infof("Machine %v: Could not find node %v from noderef, it may have already been deleted: %v", machineName, nodeName, cluster.Name)
+   klog.Infof("Machine %v: Could not find node %v from noderef, it may have already been deleted", machineName, nodeName)
Or restructure so the Cluster is next to the machine? It seems a little weird to have the cluster at the end
        verbStr = "evicted"
    } else {
        verbStr = "deleted"
    }
Consider:
    verbStr := "deleted"
    if usingEviction {
        verbStr = "evicted"
    }
    } else {
        verbStr = "deleted"
    }
    klog.Infof("pod %s/%s %s\n", pod.Namespace, pod.Name, verbStr)
Suggested change:
-   klog.Infof("pod %s/%s %s\n", pod.Namespace, pod.Name, verbStr)
+   klog.Infof("Pod %s/%s has been %s", pod.Namespace, pod.Name, verbStr)
@@ -0,0 +1,2 @@
This directory is used to copy code from other projects in-tree. This …
In other CAP* projects I've seen /third-party/ used to hold externally copied dependencies; would you consider switching for consistency?
@@ -242,6 +256,73 @@ func (r *MachineReconciler) isDeleteNodeAllowed(ctx context.Context, machine *cl
    }
}

func (r *MachineReconciler) drainNode(ctx context.Context, cluster *clusterv1.Cluster, nodeName string, machineName string) error {
    var kubeClient kubernetes.Interface
You can also declare err up here to be reused.
Don't need to hold this any more - we can lgtm & approve once it's ready (right? 😄) /hold cancel
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: michaelgugino, ncdc. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
I'm happy to merge this as is and I can work on a follow-up PR to address the comments, so we don't have to hold it any longer.
I'm ok with that, but let's let @michaelgugino weigh in, in case he wants to make those changes now?
Force-pushed from dd2a109 to 820907a
The node draining code is copied from github.com/kubernetes/kubectl/pkg/drain @ 75fdf29ade9e535ff5801a9321d55d1adf6a996b. We copy the drain code directly from upstream; however, this code does not pass our linters as-is. This commit disables linting for the external-libs directory so we don't have to carry local patches.
Force-pushed from 820907a to 1ec39a6
/lgtm
What this PR does / why we need it:
Centralizes node-drain behavior into cluster-api instead of downstream actuators.
Which issue(s) this PR fixes (optional, in fixes #&lt;issue number&gt;(, fixes #&lt;issue_number&gt;, ...) format, will close the issue(s) when PR gets merged): Fixes #994
The node draining code is copied from github.com/kubernetes/kubectl/pkg/drain @75fdf29ade9e535ff5801a9321d55d1adf6a996b
This PR also includes a commit to disable linters for the imported drain code.
Special notes for your reviewer:
Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.
Release note: