Commit ee7c37e

add abrarsheikh and nick's advices
Signed-off-by: Future-Outlier <[email protected]>
1 parent fd7e2a5 commit ee7c37e

1 file changed: +11 -4 lines changed

doc/source/serve/advanced-guides/incremental-upgrade.md

Lines changed: 11 additions & 4 deletions
@@ -58,7 +58,7 @@ kubectl apply --server-side -f https://github.com/kubernetes-sigs/gateway-api/re
 kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.14.7/config/manifests/metallb-native.yaml
 ```
 
-Create an `IPAddressPool` with the following spec for MetalLB [optional]
+Create an `IPAddressPool` with the following spec for MetalLB
 ```yaml
 echo "apiVersion: metallb.io/v1beta1
 kind: IPAddressPool
@@ -244,6 +244,11 @@ You can monitor the progress of the upgrade by inspecting the `RayService` statu
 ```
 Look at the `spec.rules.backendRefs`. You will see the `weight` for the old and new services change in real-time as the traffic shift (Phase 2) progresses.
 
+For example:
+```yaml
+
+```
+
 ## How to upgrade safely?
 
 Since this feature is alpha and rollback is not yet supported, we recommend conservative parameter settings to minimize risk during upgrades.
@@ -252,7 +257,7 @@ Since this feature is alpha and rollback is not yet supported, we recommend cons
 
 To upgrade safely, you should:
 1. Scale up 1 worker pod in the new cluster and scale down 1 worker pod in the old cluster at a time
-2. Make the upgrade process gradual to allow the Ray Serve autoscaler to adapt
+2. Make the upgrade process gradual to allow the Ray Serve autoscaler and Ray autoscaler to adapt
 
 Based on these principles, we recommend:
 - **maxSurgePercent**: Calculate based on the formula below
@@ -290,9 +295,11 @@ This configuration guarantees you have sufficient resources to run at least one
 
 Set `intervalSeconds` to 60 seconds to give the Ray Serve autoscaler and Ray autoscaler sufficient time to:
 - Detect load changes
-- Make scaling decisions while respecting upscale/downscale delays
+- Immediately scale replicas up or down to enforce new min_replicas and max_replicas limits (via target_capacity)
+- Scale down replicas immediately if they exceed the new max_replicas
+- Scale up replicas immediately if they fall below the new min_replicas
 - Provision resources
-- Allow replicas to transition states gracefully to "deploying"
+- Allow replicas to transition states gracefully to "UPDATING"
 
 A larger interval prevents the upgrade controller from making changes faster than the autoscaler can react, reducing the risk of service disruption.
 
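
Reviewer note: the new "For example:" block in this commit is still empty. Until real output is pasted there, the weight shift the doc describes can be illustrated with a rough simulation. The sketch below is not KubeRay code. It assumes a purely linear shift in which the new service's weight grows by a fixed step every `interval_seconds`; the `step_percent` parameter and the function name are hypothetical and do not appear in this diff, and the model ignores how `target_capacity` and `maxSurgePercent` gate each step.

```python
def traffic_weight_schedule(step_percent: int, interval_seconds: int):
    """Return (elapsed_seconds, old_weight, new_weight) tuples.

    Simplified, assumed model: the new service gains `step_percent` of
    traffic every `interval_seconds` until it holds 100 percent.
    """
    if not 0 < step_percent <= 100:
        raise ValueError("step_percent must be in (0, 100]")
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")

    schedule = []
    new_weight = 0
    elapsed = 0
    while True:
        # The two weights always sum to 100 in this model.
        schedule.append((elapsed, 100 - new_weight, new_weight))
        if new_weight == 100:
            break
        new_weight = min(100, new_weight + step_percent)
        elapsed += interval_seconds
    return schedule


if __name__ == "__main__":
    # 60 seconds matches the interval recommended in the doc; 25 is illustrative.
    for elapsed, old, new in traffic_weight_schedule(step_percent=25, interval_seconds=60):
        print(f"t={elapsed:>4}s  old={old:>3}  new={new:>3}")
```

Under this model, a smaller step or a larger interval lengthens the total shift, which is the trade-off the 60-second recommendation is making.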
