perf: Reduce multiple patch calls in instance termination #2324
}
}
// If we don't have a NodeClaim, then there's nothing for us to patch here
if stored != nil && !equality.Semantic.DeepEqual(stored, nodeClaim) {
Does DeepEqual not handle nil on its own?
I don't want us to generate a patch if both of these are nil
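A minimal sketch of the guard discussed in this thread, assuming a hypothetical simplified `NodeClaim` type and using the standard library's `reflect.DeepEqual` as a stand-in for `equality.Semantic.DeepEqual`. The helper name `needsPatch` is illustrative and does not come from the PR. It shows that the explicit nil check matters when no stored object exists, since a deep-equality check alone would report a difference against a non-nil object.

```go
package main

import (
	"fmt"
	"reflect"
)

// NodeClaim is a hypothetical, simplified stand-in for the real API object.
type NodeClaim struct {
	Drained bool
}

// needsPatch reports whether a patch should be generated. It returns false
// whenever there is no stored object, regardless of what current holds.
func needsPatch(stored, current *NodeClaim) bool {
	return stored != nil && !reflect.DeepEqual(stored, current)
}

func main() {
	fmt.Println(needsPatch(nil, nil))                                // false
	fmt.Println(needsPatch(nil, &NodeClaim{Drained: true}))          // false
	fmt.Println(needsPatch(&NodeClaim{}, &NodeClaim{}))              // false
	fmt.Println(needsPatch(&NodeClaim{}, &NodeClaim{Drained: true})) // true
}
```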
return reconcile.Result{}, fmt.Errorf("updating nodeclaim, %w", err)
}
// We only increment the drained metric after we have ensured that we have patched the status condition onto the NodeClaim
if !stored.StatusConditions().IsTrue(v1.ConditionTypeDrained) && nodeClaim.StatusConditions().IsTrue(v1.ConditionTypeDrained) {
Shouldn't this be a check that both are true?
No, I'm only trying to fire a metric when this status condition changes from not true to true. Otherwise, we get duplicate metrics that will skew the results.
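The transition check described in this reply can be sketched as follows, assuming a hypothetical `conditions` map in place of the real status condition set and a plain integer in place of the real metric counter. The helper `becameTrue` is an illustrative name, not something from the PR.

```go
package main

import "fmt"

// conditions is a hypothetical stand-in for a NodeClaim's status conditions.
type conditions map[string]bool

// becameTrue reports whether the named condition flipped from not-true
// (false or absent) in before to true in after.
func becameTrue(before, after conditions, name string) bool {
	return !before[name] && after[name]
}

func main() {
	drainedTotal := 0 // stand-in for the drained metric counter

	// First reconcile: the condition changes from false to true.
	stored := conditions{"Drained": false}
	current := conditions{"Drained": true}
	if becameTrue(stored, current, "Drained") {
		drainedTotal++
	}

	// Second reconcile: the stored copy already has the condition set,
	// so the counter is not incremented again.
	stored = conditions{"Drained": true}
	if becameTrue(stored, current, "Drained") {
		drainedTotal++
	}

	fmt.Println(drainedTotal) // prints 1
}
```

A check that both values are true would increment the counter on every later reconcile as well, which is the duplicate-metric problem mentioned above.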
rschalo left a comment
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: jonathan-innis, rschalo
Fixes #N/A
Description
This change updates our termination controller logic so that all of our status condition patch calls are grouped together. Up to this point, we were updating them in separate calls, and the requeuing that we were doing caused a lot of conflicts on large clusters, which significantly slowed down our ability to terminate nodes.
This ensures that all patch operations are grouped together to reduce the chance of us hitting conflicts.
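As a rough illustration of the grouping described above, the sketch below mutates several conditions in memory and issues at most one patch per reconcile. It is not the controller's actual code: `NodeClaim`, `fakeClient`, and `reconcile` are hypothetical simplified types, the condition names are illustrative, and `reflect.DeepEqual` stands in for `equality.Semantic.DeepEqual`.

```go
package main

import (
	"fmt"
	"reflect"
)

// NodeClaim is a hypothetical, simplified stand-in for the real API object.
type NodeClaim struct {
	Conditions map[string]bool
}

// DeepCopy returns an independent copy so later mutations can be diffed.
func (n *NodeClaim) DeepCopy() *NodeClaim {
	out := &NodeClaim{Conditions: make(map[string]bool, len(n.Conditions))}
	for k, v := range n.Conditions {
		out.Conditions[k] = v
	}
	return out
}

// fakeClient counts patch calls in place of a real API client.
type fakeClient struct{ patchCalls int }

func (c *fakeClient) Patch(*NodeClaim) { c.patchCalls++ }

// reconcile applies every condition change in memory first, then sends a
// single patch only if the object differs from the stored copy.
func reconcile(c *fakeClient, nc *NodeClaim) {
	stored := nc.DeepCopy()
	for _, cond := range []string{"Drained", "VolumesDetached", "InstanceTerminating"} {
		nc.Conditions[cond] = true
	}
	if !reflect.DeepEqual(stored, nc) {
		c.Patch(nc)
	}
}

func main() {
	c := &fakeClient{}
	nc := &NodeClaim{Conditions: map[string]bool{}}
	reconcile(c, nc)          // three conditions change, one patch
	reconcile(c, nc)          // nothing changes, no patch
	fmt.Println(c.patchCalls) // prints 1
}
```

Patching once per reconcile means fewer writes against the same object, which is the mechanism the description credits for the reduced conflicts.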
Before PR (9m)
After PR (3m10s)
How was this change tested?
make presubmit
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.