change deployment service VPAs to InPlaceOrRecreate #10959

Merged
mikkeloscar merged 12 commits into dev from deployment-service-in-place-replacement
Apr 17, 2026
@tcondeixa tcondeixa commented Apr 7, 2026

Contributor

Change the deployment service VPAs to InPlaceOrRecreate mode.

In-place pod resize is GA in Kubernetes 1.35, and we will finish rolling out that version soon.
The default resizePolicy for cpu and memory is restartPolicy: NotRequired, so the containers do not need an explicit resizePolicy (https://kubernetes.io/docs/tasks/configure-pod-container/resize-container-resources/#container-resize-policies).
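
As a rough illustration of the two settings mentioned above, the sketch below shows a VPA object in InPlaceOrRecreate mode and a pod that spells out the default resize policy explicitly. This is not the manifest from this PR: the object names, the `Deployment` target kind, the image and the resource values are assumptions chosen for the example.

```yaml
# Hypothetical VPA for one of the deployment service workloads.
# The name and targetRef are assumptions, not copied from the PR diff.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: deployment-service-controller
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: deployment-service-controller
  updatePolicy:
    # Try to resize running pods in place; fall back to recreating
    # the pod when an in-place resize is not possible.
    updateMode: InPlaceOrRecreate
---
# Illustrative pod that writes out the default resize policy.
# Omitting resizePolicy gives the same behaviour, which is why
# the PR does not need to set it.
apiVersion: v1
kind: Pod
metadata:
  name: resize-policy-example
spec:
  containers:
    - name: app
      image: registry.k8s.io/pause:3.10
      resizePolicy:
        - resourceName: cpu
          restartPolicy: NotRequired
        - resourceName: memory
          restartPolicy: NotRequired
      resources:
        requests:
          cpu: 100m      # placeholder value
          memory: 128Mi  # placeholder value
```

Setting `restartPolicy: RestartContainer` for a resource instead would make the kubelet restart the container whenever that resource is resized.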

The deployment-service-controller is not 100% resilient to redeployments (we need to solve this later), so attempting an in-place replacement first should reduce the probability of issues for deployments done by users. VPA adjustments to memory/cpu should be small and incremental, so in-place replacement should cover most cases and we should see a low number of Recreate events.

I'm also changing the default resources of deployment-service-status-service to saner defaults (they only apply on the first deployment).
I'm also changing the replicas of deployment-service-status-service to 2 to force VPA to act so that we can see changes; this also aligns this deployment with the other deployments (2 replicas).
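
For orientation, a Deployment with two replicas and explicit starting resources could look roughly like the sketch below. Only the deployment name and the replica count come from the description above; the labels, container name, image and resource values are placeholders, not the values set in this PR.

```yaml
# Sketch only: labels, image and resource values are hypothetical.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deployment-service-status-service
spec:
  replicas: 2  # matches the other deployment service deployments
  selector:
    matchLabels:
      application: deployment-service-status-service
  template:
    metadata:
      labels:
        application: deployment-service-status-service
    spec:
      containers:
        - name: status-service
          image: example.org/status-service:placeholder
          resources:
            # Starting values only; VPA adjusts them afterwards.
            requests:
              cpu: 50m
              memory: 128Mi
            limits:
              memory: 128Mi
```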

Tests

I tested this in a pet cluster and it worked: the resources were updated in place without creating new pods.
The Prometheus metrics also show the expected number of in-place updates for status-service (vpa_updater_in_place_updated_pods_total = 2).
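
To follow the same metric over time rather than checking it once, a Prometheus recording rule along these lines could be used. This is a sketch under assumptions: the group name, rule name and one-day window are made up for the example, and only the metric name is taken from the test above.

```yaml
# Hypothetical Prometheus rule file; group and record names are assumptions.
groups:
  - name: vpa-in-place-updates
    rules:
      # Number of pods the VPA updater resized in place over the last day.
      - record: vpa:in_place_updated_pods:increase1d
        expr: sum(increase(vpa_updater_in_place_updated_pods_total[1d]))
```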

@tcondeixa tcondeixa added minor Minor changes, e.g. low risk config updates, changes that do not introduce a new API call. do-not-merge labels Apr 7, 2026
@tcondeixa tcondeixa changed the title change deployment service controller VPA to InPlaceOrRecreate change deployment service VPAs to InPlaceOrRecreate Apr 8, 2026
@tcondeixa

Contributor Author

@linki can you tell me which e2e test was failing? I did not see any issue when I validated it in my pet cluster.

@tcondeixa

Contributor Author

The e2e test that was excluded as part of Kubernetes 1.35 will be addressed in a separate PR, to unblock validation of this use case.

@tcondeixa

Contributor Author

👍

@tcondeixa

Contributor Author

👍

@mikkeloscar

Contributor

👍

@mikkeloscar mikkeloscar merged commit e2dd12a into dev Apr 17, 2026
15 checks passed
@mikkeloscar mikkeloscar deleted the deployment-service-in-place-replacement branch April 17, 2026 09:49
This was referenced Apr 17, 2026

Labels

merged/alpha merged/beta merged/stable minor Minor changes, e.g. low risk config updates, changes that do not introduce a new API call.

2 participants