@saurav-agarwalla
Contributor

@saurav-agarwalla saurav-agarwalla commented Aug 8, 2025

Fixes #N/A

Description

There was a regression introduced in #2180:

        // Filter out empty candidates. If there was an empty node that wasn't consolidated before this, we should
        // assume that it was due to budgets. If we don't filter out budgets, users who set a budget for `empty`
        // can find their nodes disrupted here, which while that in itself isn't an issue for empty nodes, it could
        // constrain the `drift` budget.
        if len(candidate.reschedulablePods) == 0 {
            continue
        }

This doesn't account for cases where consolidation is disabled (or, for that matter, has a really long consolidation period). Drift shouldn't be blocked on consolidation, since the two are independent disruption methods.

So what happens is that emptiness doesn't disrupt the node because consolidation is disabled, and drift doesn't terminate the node either, because the node is empty and drift incorrectly assumes that emptiness will always disrupt it.
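To make the stuck state concrete, here is a minimal sketch of the filter's effect. It is not the Karpenter code: the `candidate` type, the `driftCandidates` helper, and the `skipEmpty` flag are hypothetical names used only for illustration, with `reschedulablePods` reduced to a count.

```go
package main

import "fmt"

// candidate is a hypothetical, simplified stand-in for a disruption candidate.
type candidate struct {
	name              string
	drifted           bool
	reschedulablePods int
}

// driftCandidates returns the names of drifted nodes that drift would act on.
// With skipEmpty set, it mimics the filter quoted above: empty nodes are
// skipped on the assumption that emptiness will handle them.
func driftCandidates(nodes []candidate, skipEmpty bool) []string {
	var out []string
	for _, n := range nodes {
		if !n.drifted {
			continue
		}
		if skipEmpty && n.reschedulablePods == 0 {
			// If consolidation is disabled, nothing else ever picks this node up.
			continue
		}
		out = append(out, n.name)
	}
	return out
}

func main() {
	nodes := []candidate{
		{name: "node-a", drifted: true, reschedulablePods: 0},
		{name: "node-b", drifted: true, reschedulablePods: 3},
	}
	fmt.Println(driftCandidates(nodes, true))  // [node-b]
	fmt.Println(driftCandidates(nodes, false)) // [node-a node-b]
}
```

With the filter on, the empty drifted node `node-a` is never returned, so it stays around for as long as consolidation is disabled.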

The original intent makes sense, though, which is why I initially also made a change to prioritize non-empty nodes over empty nodes for drift. Update, based on the discussion with @jmdeal: drifting non-empty nodes first could mean that pods from those nodes just reschedule onto the empty nodes and then need to move again when those nodes get disrupted. Hence, disrupting empty nodes first reduces the overall churn.
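The "empty nodes first" ordering can be sketched roughly as follows. This is an illustration under assumptions, not the actual change in this PR: `driftCandidate` and `orderEmptyFirst` are hypothetical names, and the real candidate type carries far more state than a pod count.

```go
package main

import (
	"fmt"
	"sort"
)

// driftCandidate is a hypothetical, simplified drift candidate.
type driftCandidate struct {
	name              string
	reschedulablePods int
}

// orderEmptyFirst returns a copy of cs with empty candidates ahead of
// non-empty ones. The sort is stable, so the relative order within each
// group is preserved.
func orderEmptyFirst(cs []driftCandidate) []driftCandidate {
	ordered := append([]driftCandidate(nil), cs...)
	sort.SliceStable(ordered, func(i, j int) bool {
		return ordered[i].reschedulablePods == 0 && ordered[j].reschedulablePods != 0
	})
	return ordered
}

func main() {
	cs := []driftCandidate{
		{name: "busy-1", reschedulablePods: 2},
		{name: "empty-1", reschedulablePods: 0},
		{name: "busy-2", reschedulablePods: 5},
		{name: "empty-2", reschedulablePods: 0},
	}
	for _, c := range orderEmptyFirst(cs) {
		fmt.Println(c.name)
	}
}
```

Because the empty nodes are replaced before any pods are evicted from the non-empty ones, those pods have no stale empty node to land on and then be moved off again.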

With this change, it is entirely possible that nodes which aren't eligible for consolidation at a given moment get disrupted due to drift and consume a disruption budget. I think it is more important to still allow drift to happen so that we don't keep stale nodes (with an outdated AMI or otherwise), even if they are empty. Those empty nodes can still get pods scheduled on them, which means users run the risk of having their applications run on outdated AMIs.

This is also why I feel that the original priority order, with drift as the first disruption method, makes sense: it should be more important for customers to run up-to-date nodes than to be more efficient with their workloads. But as the original PR mentions, there are rough edges to that approach, especially when customers drift nodes for non-critical reasons.

All of this could have been prevented in the first place if we had separate buckets for each disruption reason, which is something #2387 intends to solve.

How was this change tested?

make presubmit

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Aug 8, 2025
@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Aug 8, 2025
@coveralls

coveralls commented Aug 8, 2025

Pull Request Test Coverage Report for Build 17133705499

Details

  • 4 of 4 (100.0%) changed or added relevant lines in 1 file are covered.
  • 6 unchanged lines in 2 files lost coverage.
  • Overall coverage decreased (-0.02%) to 81.842%

Files with Coverage Reduction New Missed Lines %
pkg/controllers/node/termination/controller.go 2 77.78%
pkg/apis/v1/zz_generated.deepcopy.go 4 64.89%
Totals Coverage Status
Change from base Build 17132038986: -0.02%
Covered Lines: 10488
Relevant Lines: 12815

💛 - Coveralls

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Aug 19, 2025
@jmdeal
Member

jmdeal commented Aug 21, 2025

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 21, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jmdeal, saurav-agarwalla

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 21, 2025
@k8s-ci-robot k8s-ci-robot merged commit 86f7024 into kubernetes-sigs:main Aug 21, 2025
22 of 24 checks passed
saurav-agarwalla added a commit to saurav-agarwalla/karpenter that referenced this pull request Aug 21, 2025
k8s-ci-robot pushed a commit that referenced this pull request Aug 21, 2025
saurav-agarwalla added a commit to saurav-agarwalla/karpenter that referenced this pull request Aug 26, 2025
saurav-agarwalla added a commit to saurav-agarwalla/karpenter that referenced this pull request Aug 26, 2025
k8s-ci-robot pushed a commit that referenced this pull request Aug 26, 2025
jigisha620 pushed a commit to jigisha620/karpenter that referenced this pull request Sep 19, 2025
harshad3339 added a commit to acquia/karpenter that referenced this pull request Nov 3, 2025
* fix: do not block drifted nodes from being terminated if consolidation is disabled (kubernetes-sigs#2423)

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

7 participants