feat: make planner use DGD Scaling Adapters #4825
Signed-off-by: Julien Mancuso <jmancuso@nvidia.com>
Walkthrough
The changes introduce a new
Changes
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 0
🧹 Nitpick comments (1)
components/src/dynamo/planner/kube.py (1)
135-144: Consider adding a runtime deprecation warning.
While the docstring marks this method as deprecated, consider adding a runtime warning to alert callers:

    +import warnings
    +
     def update_graph_replicas(
         self, graph_deployment_name: str, component_name: str, replicas: int
     ) -> None:
         """
         Update replicas for a service. Now uses DGDSA when available.

         Deprecated: Use update_service_replicas() instead for clarity.
         This method is kept for backward compatibility.
         """
    +    warnings.warn(
    +        "update_graph_replicas() is deprecated, use update_service_replicas() instead",
    +        DeprecationWarning,
    +        stacklevel=2
    +    )
         self.update_service_replicas(graph_deployment_name, component_name, replicas)
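One way to confirm that the suggested warning fires is to record warnings while calling the deprecated method. The sketch below is illustrative only: PlannerStub is a hypothetical stand-in for the real connector class in kube.py, and it records calls instead of talking to a cluster.

```python
import warnings


class PlannerStub:
    """Hypothetical stand-in for the planner's Kubernetes connector."""

    def __init__(self):
        self.calls = []

    def update_service_replicas(self, graph_deployment_name, component_name, replicas):
        # The real method would call the Kubernetes API; here we only record the call.
        self.calls.append((graph_deployment_name, component_name, replicas))

    def update_graph_replicas(self, graph_deployment_name, component_name, replicas):
        # Deprecated wrapper: warn, then delegate to the new method.
        warnings.warn(
            "update_graph_replicas() is deprecated, use update_service_replicas() instead",
            DeprecationWarning,
            stacklevel=2,
        )
        self.update_service_replicas(graph_deployment_name, component_name, replicas)


stub = PlannerStub()
with warnings.catch_warnings(record=True) as caught:
    # DeprecationWarning is often filtered out by default, so force it on.
    warnings.simplefilter("always")
    stub.update_graph_replicas("sglang-agg", "decode", 2)
```

With stacklevel=2 the warning is attributed to the caller of update_graph_replicas() rather than to the wrapper itself, which is usually what makes a deprecation message actionable.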
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
components/src/dynamo/planner/kube.py (2 hunks)
tests/planner/unit/kube.py (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: julienmancuso
Repo: ai-dynamo/dynamo PR: 1474
File: deploy/cloud/operator/internal/controller/dynamocomponent_controller.go:1308-1312
Timestamp: 2025-06-11T21:29:28.650Z
Learning: User julienmancuso expects replies in English; avoid switching languages unless explicitly requested.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: trtllm (amd64)
- GitHub Check: sglang (arm64)
- GitHub Check: trtllm (arm64)
- GitHub Check: operator (amd64)
- GitHub Check: sglang (amd64)
- GitHub Check: vllm (amd64)
- GitHub Check: vllm (arm64)
- GitHub Check: operator (arm64)
- GitHub Check: Build and Test - dynamo
🔇 Additional comments (3)
components/src/dynamo/planner/kube.py (2)
118-133: LGTM! The fallback implementation correctly patches the DGD's spec.services structure, and the logging provides clear visibility into the fallback path.
81-116: The implementation is correct. The patch_namespaced_custom_object_scale() method is a legitimate Kubernetes Python client API for patching Scale subresources of custom objects. The body format {"spec": {"replicas": replicas}} is the correct standard for Scale subresource updates, and the adapter naming convention with lowercase service names matches the operator's documented DGDSA naming pattern (e.g., sglang-agg-decode). The error handling correctly distinguishes between 404 (fallback to DGD) and other errors (propagate).
tests/planner/unit/kube.py (1)
79-167: Excellent test coverage! The test suite comprehensively covers:
- Primary DGDSA Scale API path with correct lowercase adapter naming
- 404 fallback to DGD patching
- Non-404 error propagation without fallback
- Backward compatibility through deprecated method delegation
- Direct testing of the internal fallback method
The test assertions correctly verify API call parameters, fallback behavior, and error handling logic.
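The behavior these comments describe (try the DGDSA Scale subresource first, fall back to patching the DGD on a 404, propagate any other error) can be sketched as follows. This is a minimal illustration, not the PR's code: FakeApiException and FakeCustomObjectsApi are hypothetical stand-ins for the Kubernetes client's ApiException and CustomObjectsApi; the group, version, and plural constants are placeholders; and the adapter name pattern and the fallback patch body are assumptions inferred from the sglang-agg-decode example and the spec.services mention above.

```python
class FakeApiException(Exception):
    """Hypothetical stand-in for the Kubernetes client's ApiException."""

    def __init__(self, status):
        super().__init__(f"status={status}")
        self.status = status


class FakeCustomObjectsApi:
    """Hypothetical stand-in for CustomObjectsApi that records calls."""

    def __init__(self, scale_error_status=None):
        self.scale_error_status = scale_error_status
        self.calls = []

    def patch_namespaced_custom_object_scale(self, group, version, namespace, plural, name, body):
        self.calls.append(("scale", plural, name, body))
        if self.scale_error_status is not None:
            raise FakeApiException(self.scale_error_status)

    def patch_namespaced_custom_object(self, group, version, namespace, plural, name, body):
        self.calls.append(("patch", plural, name, body))


# Placeholder identifiers, not the operator's real group or plurals.
GROUP, VERSION = "example.invalid", "v1alpha1"
ADAPTER_PLURAL = "scalingadapters"
DGD_PLURAL = "graphdeployments"


def update_service_replicas(api, namespace, dgd_name, service_name, replicas):
    # Assumed adapter naming: "<dgd name>-<lowercased service name>".
    adapter_name = f"{dgd_name}-{service_name.lower()}"
    try:
        # Primary path: patch the adapter's Scale subresource.
        api.patch_namespaced_custom_object_scale(
            GROUP, VERSION, namespace, ADAPTER_PLURAL, adapter_name,
            {"spec": {"replicas": replicas}},
        )
    except FakeApiException as exc:
        if exc.status != 404:
            raise  # non-404 errors propagate without fallback
        # Fallback: no adapter found, so patch the DGD directly (assumed body shape).
        api.patch_namespaced_custom_object(
            GROUP, VERSION, namespace, DGD_PLURAL, dgd_name,
            {"spec": {"services": {service_name: {"replicas": replicas}}}},
        )
```

Restricting the fallback to 404 keeps real failures, such as permission or server errors, visible instead of silently routing them through the legacy DGD patch.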
Merged 238 commits from main branch to bring the feature branch up to date.
Key conflicts resolved:
- Removed lib/kvbm-kernels references (deleted in main)
- Kept nova/nova-backend/kvbm workspace members from feature branch
- Maintained v2 module API refactoring from feature branch
- Updated Cargo.lock files to reflect new dependencies
Major updates from main include:
- LoRA support for vLLM (#4810)
- Multimodal documentation (#4510)
- Scaling adapter features (#4699, #4825)
- Tool calling support (#4822, #4722)
- NIXL connect improvements (#4433)
Signed-off-by: Ryan Olson <rolson@nvidia.com>
Overview:
Make the planner scale services through DGD Scaling Adapters (DGDSA).
Now that DGDSAs have been introduced, the planner should use these adapters to scale DGD services up and down.