-Create an `IPAddressPool` with the following spec for MetalLB [optional]
+Create an `IPAddressPool` with the following spec for MetalLB
 ```yaml
 echo "apiVersion: metallb.io/v1beta1
 kind: IPAddressPool
@@ -244,6 +244,11 @@ You can monitor the progress of the upgrade by inspecting the `RayService` statu
 ```
 Look at the `spec.rules.backendRefs`. You will see the `weight` for the old and new services change in real-time as the traffic shift (Phase 2) progresses.
 
+For example:
+```yaml
+
+```
+
 ## How to upgrade safely?
 
 Since this feature is alpha and rollback is not yet supported, we recommend conservative parameter settings to minimize risk during upgrades.
@@ -252,7 +257,7 @@ Since this feature is alpha and rollback is not yet supported, we recommend cons
 
 To upgrade safely, you should:
 1. Scale up 1 worker pod in the new cluster and scale down 1 worker pod in the old cluster at a time
-2. Make the upgrade process gradual to allow the Ray Serve autoscaler to adapt
+2. Make the upgrade process gradual to allow the Ray Serve autoscaler and Ray autoscaler to adapt
 
 Based on these principles, we recommend:
 - **maxSurgePercent**: Calculate based on the formula below
@@ -290,9 +295,11 @@ This configuration guarantees you have sufficient resources to run at least one
 
 Set `intervalSeconds` to 60 seconds to give the Ray Serve autoscaler and Ray autoscaler sufficient time to:
 - Detect load changes
-- Make scaling decisions while respecting upscale/downscale delays
+- Immediately scale replicas up or down to enforce new min_replicas and max_replicas limits (via target_capacity)
+- Scale down replicas immediately if they exceed the new max_replicas
+- Scale up replicas immediately if they fall below the new min_replicas
 - Provision resources
-- Allow replicas to transition states gracefully to "deploying"
+- Allow replicas to transition states gracefully to "UPDATING"
 
 A larger interval prevents the upgrade controller from making changes faster than the autoscaler can react, reducing the risk of service disruption.
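
Reviewer sketch (not part of the diff): the `intervalSeconds` guidance above might look roughly like the fragment below in a `RayService` manifest. The field placement under `upgradeStrategy.clusterUpgradeOptions`, the strategy type name, and the resource name are assumptions that this diff does not confirm, so check them against the KubeRay version in use. `maxSurgePercent` is left out because its formula is not shown in this hunk.

```yaml
# Sketch only: field nesting and the strategy type are assumed, not taken from this diff.
apiVersion: ray.io/v1
kind: RayService
metadata:
  name: example-rayservice            # hypothetical name
spec:
  upgradeStrategy:
    type: NewClusterWithIncrementalUpgrade   # assumed strategy type for incremental upgrades
    clusterUpgradeOptions:                   # assumed location of the upgrade parameters
      intervalSeconds: 60                    # value recommended in the text above
```

Keeping the interval at 60 seconds follows the reasoning in the text: the upgrade controller should not advance faster than the Ray Serve autoscaler and Ray autoscaler can react.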