Skip to content

feat: make planner use DGD Scaling Adapters#4825

Merged
julienmancuso merged 2 commits intomainfrom
jsm/dep-685
Dec 9, 2025
Merged

feat: make planner use DGD Scaling Adapters#4825
julienmancuso merged 2 commits intomainfrom
jsm/dep-685

Conversation

@julienmancuso
Copy link
Copy Markdown
Contributor

@julienmancuso julienmancuso commented Dec 9, 2025

Overview:

make planner scale DGD Scaling Adapters.

Now that DGDSA have been introduced, planner should use these adapters to scale up/down DGD services

Summary by CodeRabbit

  • Improvements

    • Service scaling now uses an improved scaling mechanism with automatic fallback support.
  • Tests

    • Expanded test coverage for service scaling scenarios, fallback behavior, and error handling.

✏️ Tip: You can customize this high-level summary in your review settings.

Signed-off-by: Julien Mancuso <jmancuso@nvidia.com>
@julienmancuso julienmancuso requested review from a team as code owners December 9, 2025 18:10
@github-actions github-actions bot added the feat label Dec 9, 2025
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Dec 9, 2025

Walkthrough

The changes introduce a new update_service_replicas() method that scales services via DGDSA Scale subresource with a fallback to direct DynamoGraphDeployment patching on 404 errors. The existing update_graph_replicas() now delegates to this new method for backward compatibility. Comprehensive unit tests cover the primary DGDSA path, fallback behavior, error propagation, and API delegation.

Changes

Cohort / File(s) Summary
DGDSA-based scaling implementation
components/src/dynamo/planner/kube.py
Added update_service_replicas() that scales via DGDSA Scale subresource with fallback to _update_dgd_replicas() on 404. Updated update_graph_replicas() to delegate to the new method. Added docstrings and logging.
Test coverage expansion
tests/planner/unit/kube.py
Replaced single test with five new test functions: test_update_service_replicas_uses_dgdsa_scale, test_update_service_replicas_fallback_to_dgd, test_update_service_replicas_propagates_other_errors, test_update_graph_replicas_calls_update_service_replicas, and test_update_dgd_replicas_directly.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Fallback logic verification: Ensure the 404 detection and fallback to DGD patching behaves correctly under edge cases
  • Adapter name construction: Verify the lowercase adapter name format (graph_deployment_name-service_name.lower()) matches expected DGDSA naming conventions
  • Test isolation: Confirm mocking of both DGDSA Scale API and DGD patch calls prevents cross-test contamination

Poem

🐰 With DGDSA's might, we scale services bright,
When scales don't exist, to the DGD we fight!
Five tests now dance in this scaling ballet,
Fallbacks and errors handled—hooray, hooray! 🎪

Pre-merge checks

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Description check ⚠️ Warning The description is missing required sections: Details about the changes are vague, no specific files are identified for review, and no related issues are linked. Add a detailed 'Details' section explaining the implementation changes, specify files to review in 'Where should the reviewer start', and link related GitHub issues using action keywords.
✅ Passed checks (2 passed)
Check name Status Explanation
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Title check ✅ Passed The title 'feat: make planner use DGD Scaling Adapters' clearly describes the main change: updating the planner to use DGD Scaling Adapters for scaling services.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
components/src/dynamo/planner/kube.py (1)

135-144: Consider adding a runtime deprecation warning.

While the docstring marks this method as deprecated, consider adding a runtime warning to alert callers:

+import warnings
+
 def update_graph_replicas(
     self, graph_deployment_name: str, component_name: str, replicas: int
 ) -> None:
     """
     Update replicas for a service. Now uses DGDSA when available.

     Deprecated: Use update_service_replicas() instead for clarity.
     This method is kept for backward compatibility.
     """
+    warnings.warn(
+        "update_graph_replicas() is deprecated, use update_service_replicas() instead",
+        DeprecationWarning,
+        stacklevel=2
+    )
     self.update_service_replicas(graph_deployment_name, component_name, replicas)
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a473402 and e2cd6a1.

📒 Files selected for processing (2)
  • components/src/dynamo/planner/kube.py (2 hunks)
  • tests/planner/unit/kube.py (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: julienmancuso
Repo: ai-dynamo/dynamo PR: 1474
File: deploy/cloud/operator/internal/controller/dynamocomponent_controller.go:1308-1312
Timestamp: 2025-06-11T21:29:28.650Z
Learning: User julienmancuso expects replies in English; avoid switching languages unless explicitly requested.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: trtllm (amd64)
  • GitHub Check: sglang (arm64)
  • GitHub Check: trtllm (arm64)
  • GitHub Check: operator (amd64)
  • GitHub Check: sglang (amd64)
  • GitHub Check: vllm (amd64)
  • GitHub Check: vllm (arm64)
  • GitHub Check: operator (arm64)
  • GitHub Check: Build and Test - dynamo
🔇 Additional comments (3)
components/src/dynamo/planner/kube.py (2)

118-133: LGTM!

The fallback implementation correctly patches the DGD's spec.services structure, and the logging provides clear visibility into the fallback path.


81-116: The implementation is correct. The patch_namespaced_custom_object_scale() method is a legitimate Kubernetes Python client API for patching Scale subresources of custom objects. The body format {"spec": {"replicas": replicas}} is the correct standard for Scale subresource updates, and the adapter naming convention with lowercase service names matches the operator's documented DGDSA naming pattern (e.g., sglang-agg-decode). The error handling correctly distinguishes between 404 (fallback to DGD) and other errors (propagate).

tests/planner/unit/kube.py (1)

79-167: Excellent test coverage!

The test suite comprehensively covers:

  • Primary DGDSA Scale API path with correct lowercase adapter naming
  • 404 fallback to DGD patching
  • Non-404 error propagation without fallback
  • Backward compatibility through deprecated method delegation
  • Direct testing of the internal fallback method

The test assertions correctly verify API call parameters, fallback behavior, and error handling logic.

Signed-off-by: Julien Mancuso <jmancuso@nvidia.com>
@julienmancuso julienmancuso requested a review from a team as a code owner December 9, 2025 18:20
@julienmancuso julienmancuso changed the title feat: make planner scale DGD Scaling Adapters feat: make planner use DGD Scaling Adapters Dec 9, 2025
Copy link
Copy Markdown
Contributor

@tmonty12 tmonty12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@julienmancuso julienmancuso merged commit 9573d34 into main Dec 9, 2025
30 of 31 checks passed
@julienmancuso julienmancuso deleted the jsm/dep-685 branch December 9, 2025 21:51
ryanolson added a commit that referenced this pull request Dec 10, 2025
Merged 238 commits from main branch to bring the feature branch up to date.

Key conflicts resolved:
- Removed lib/kvbm-kernels references (deleted in main)
- Kept nova/nova-backend/kvbm workspace members from feature branch
- Maintained v2 module API refactoring from feature branch
- Updated Cargo.lock files to reflect new dependencies

Major updates from main include:
- LoRA support for vLLM (#4810)
- Multimodal documentation (#4510)
- Scaling adapter features (#4699, #4825)
- Tool calling support (#4822, #4722)
- NIXL connect improvements (#4433)

Signed-off-by: Ryan Olson <rolson@nvidia.com>
zxue2 pushed a commit to zxue2/dynamo that referenced this pull request Dec 11, 2025
Signed-off-by: Julien Mancuso <jmancuso@nvidia.com>
smatta-star pushed a commit to smatta-star/dynamo that referenced this pull request Dec 19, 2025
Signed-off-by: Julien Mancuso <jmancuso@nvidia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants