-
Notifications
You must be signed in to change notification settings - Fork 41.7k
[KEP-5080]Ordered Namespace Deletion #130035
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
/sig api-machinery |
747a6c9 to
d5058ba
Compare
pkg/controller/namespace/deletion/namespaced_resources_deleter.go
Outdated
Show resolved
Hide resolved
|
overall looks good, there is some comments to address #130035 (comment) and the e2e test commits may be squashed togeher |
|
/triage accepted |
…Deletion` enabled.
|
/test pull-kubernetes-e2e-gce-cos-alpha-features |
|
Note: The test failure is not related with current PR. Ref: #130339 |
|
@cici37: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
thockin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/approve
| logger.V(5).Info("Namespace controller - OrderedNamespaceDeletion feature gate is enabled", "namespace", namespace) | ||
| // Ensure all pods in the namespace are deleted first | ||
| podsGVR := schema.GroupVersionResource{Group: "", Version: "v1", Resource: "pods"} | ||
| gvrDeletionMetadata, err := d.deleteAllContentForGroupVersionResource(ctx, podsGVR, namespace, namespaceDeletedAt) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Somewhere we discussed whether we need to wait for actual delete or just that all pods are "deleting" and have been stopped. That would require actually looking into the Pod object itself and knowing about the kubelet state.
I think this (waiting for deleted) is fine for now, but do you think we should consider the above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In current behavior, the pod are deleting and have been stopped would prevent the pod from serving traffic. I am not quite sure about the necessity of the use case of "must actual delete pod before deleting other resources". Is there a concern of "deleting pod" be exposure to security risk?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Users can't permanently stop a pod from being stopped, but they can stop it from being deleted (via finalizers).
It may not matter, since the NS deletion is "stuck" either way.
And we don't have a general way to know that an arbitrary object is "stopped" (it's pod-specific), much less "stopped and guaranteeed not to restart".
OK, you talked me out of it.
|
LGTM label has been added. Git tree hash: 5bb36a49c0381faa7824e43a03104f4dc7a7410c
|
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: cici37, thockin The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/retest |
| // Check if any pods remain before proceeding to delete other resources | ||
| if numRemainingTotals.gvrToNumRemaining[podsGVR] > 0 { | ||
| logger.V(5).Info("Namespace controller - pods still remain, delaying deletion of other resources", "namespace", namespace) | ||
| return estimate, utilerrors.NewAggregate(errs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a note that returning early here means we're not reaching the conditionUpdater.Update / nsClient.UpdateStatus calls below, so any errors deleting pods, or any delay in waiting for pods to complete graceful deletion / get finalizers removed / etc, will be silent (not reflected in namespace conditions)
That doesn't impact the namespace deletion, but it does regress usability improvements made in #73405 and #82189 for #82084 / #64002, #60807, #66735 for issues with pod cleanup specifically
It could be enough to duplicate the condition / ns status update into this short-circuit block so we report that pods remain or pod deletion errors occurred if we're blocked on that:
if hasChanged := conditionUpdater.Update(ns); hasChanged {
if _, err = d.nsClient.UpdateStatus(ctx, ns, metav1.UpdateOptions{}); err != nil {
utilruntime.HandleError(fmt.Errorf("couldn't update status condition for namespace %q: %v", namespace, err))
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for catching! Here is the PR: #130617
…35-upstream-release-1.30 Automated cherry pick of #130035: Add the feature gate `OrderedNamespaceDeletion` for
…35-upstream-release-1.31 Automated cherry pick of #130035: [KEP-5080]Ordered Namespace Deletion
…35-upstream-release-1.32 Automated cherry pick of #130035: [KEP-5080]Ordered Namespace Deletion
What type of PR is this?
/kind feature
What this PR does / why we need it:
Implement the ordered namespace deletion.
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: