feat: support SGLang in pre-deployment sweeping #2360
Walkthrough
Adds SGLang backend support across profiler tooling and deployment configs: introduces backend-specific worker names, per-deployment naming, dynamic log paths, an SGLang config modifier, a CLI backend flag, shell command wrappers, image updates, and a Grove-disabling annotation in deployment creation.
Sequence Diagram(s)
sequenceDiagram
participant CLI as profile_sla.py
participant Config as SGLangConfigModifier
participant Planner as WORKER_COMPONENT_NAMES
participant Deploy as DynamoDeploymentClient
participant K8s as Kubernetes
CLI->>Config: convert_config(target=prefill/decode)
Config-->>CLI: transformed spec (sglang)
CLI->>Deploy: create_deployment(deployment_name, spec, backend=sglang)
Deploy->>K8s: Apply CR with annotations (disable Grove)
K8s-->>Deploy: Deployment ready
CLI->>Planner: resolve decode worker K8s name
Planner-->>CLI: SGLangDecodeWorker
CLI->>K8s: Read logs for KV cache (per-deployment path)
CLI-->>CLI: Run SLA profiling with prefill/decode clients
CLI->>Deploy: cleanup
Estimated code review effort
🎯 3 (Moderate) | ⏱️ ~20 minutes
Actionable comments posted: 5
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
benchmarks/profiler/deploy/profile_sla_job.yaml (1 hunks)
benchmarks/profiler/profile_sla.py (8 hunks)
benchmarks/profiler/utils/config.py (3 hunks)
benchmarks/profiler/utils/dynamo_deployment.py (1 hunks)
components/backends/sglang/deploy/agg.yaml (1 hunks)
components/backends/sglang/deploy/agg_router.yaml (1 hunks)
components/backends/sglang/deploy/disagg.yaml (3 hunks)
components/planner/src/dynamo/planner/defaults.py (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-07-28T17:00:07.968Z
Learnt from: biswapanda
PR: ai-dynamo/dynamo#2137
File: components/backends/sglang/deploy/agg_router.yaml:0-0
Timestamp: 2025-07-28T17:00:07.968Z
Learning: In components/backends/sglang/deploy/agg_router.yaml, the clear_namespace command is intentionally designed to block the router from starting if it fails (using &&). This is a deliberate design decision where namespace clearing is a critical prerequisite and the router should not start with an uncleared namespace.
Applied to files:
components/backends/sglang/deploy/agg_router.yaml
🔇 Additional comments (12)
components/backends/sglang/deploy/agg_router.yaml (1)
92-94: Consistent shell wrapper pattern applied
The addition of the shell wrapper for SGLangDecodeWorker follows the same pattern as the Frontend container, ensuring consistency across components.
components/backends/sglang/deploy/agg.yaml (1)
92-94: Shell wrapper consistently applied
The shell wrapper addition maintains consistency with the Frontend container and other SGLang deployment configurations.
components/planner/src/dynamo/planner/defaults.py (1)
85-92: Verify component naming asymmetry is intentional
The SGLangComponentName defines prefill_worker_component_name = "worker" and decode_worker_component_name = "decode". This differs from the VllmComponentName pattern, where prefill uses "prefill" and decode uses "backend".
Please confirm this asymmetric naming (worker/decode) aligns with SGLang's architectural design and component naming conventions.
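To make the asymmetry concrete, here is a minimal sketch that puts the two naming schemes side by side. Only the four string values come from the review comment above; the ComponentNames dataclass, the COMPONENT_NAMES dict, and the component_for helper are hypothetical and are not the actual contents of defaults.py.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComponentNames:
    """Hypothetical container; the real classes in defaults.py may differ."""

    prefill_worker_component_name: str
    decode_worker_component_name: str


# String values as quoted in the review comment.
COMPONENT_NAMES = {
    "vllm": ComponentNames(
        prefill_worker_component_name="prefill",
        decode_worker_component_name="backend",
    ),
    "sglang": ComponentNames(
        prefill_worker_component_name="worker",
        decode_worker_component_name="decode",
    ),
}


def component_for(backend: str, role: str) -> str:
    """Return the component name for a role ('prefill' or 'decode')."""
    names = COMPONENT_NAMES[backend]
    if role == "prefill":
        return names.prefill_worker_component_name
    if role == "decode":
        return names.decode_worker_component_name
    raise ValueError(f"unknown role: {role!r}")
```

Looking names up by role rather than hardcoding strings means a caller never has to know that the SGLang prefill component is called "worker" while the vLLM one is called "prefill".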
benchmarks/profiler/deploy/profile_sla_job.yaml (1)
37-38: LGTM! Clean backend specification for SGLang.
The hardcoded --backend sglang parameter correctly enables SGLang backend selection for this specific job configuration. The placement maintains logical argument ordering.
benchmarks/profiler/profile_sla.py (8)
24-24: LGTM! Essential import for backend-specific worker name resolution.
The import of WORKER_COMPONENT_NAMES enables dynamic log path construction based on the selected backend, which is crucial for SGLang support.
145-145: LGTM! Per-deployment naming for prefill configuration.
Adding the deployment_name parameter enables proper per-deployment resource tracking and cleanup, derived appropriately from the config metadata.
253-253: LGTM! Consistent per-deployment naming for decode configuration.
The deployment_name parameter maintains consistency with the prefill configuration approach.
268-268: Excellent dynamic log path construction for backend compatibility.
The dynamic path construction using WORKER_COMPONENT_NAMES[args.backend].decode_worker_k8s_name.lower() elegantly handles backend-specific worker naming differences between vLLM and SGLang. This replaces hardcoded paths with flexible, backend-aware resolution.
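The lookup described in this comment can be sketched roughly as follows. The expression WORKER_COMPONENT_NAMES[backend].decode_worker_k8s_name.lower() and the SGLangDecodeWorker name appear in this PR; the vLLM worker name, the stand-in registry built from SimpleNamespace, the decode_log_path helper, and the directory layout ending in 0.log are assumptions made purely for illustration.

```python
from pathlib import PurePosixPath
from types import SimpleNamespace

# Stand-in for the planner's registry. Only the SGLang value is taken from
# this PR's sequence diagram; the vLLM value is an assumed placeholder.
WORKER_COMPONENT_NAMES = {
    "sglang": SimpleNamespace(decode_worker_k8s_name="SGLangDecodeWorker"),
    "vllm": SimpleNamespace(decode_worker_k8s_name="VllmDecodeWorker"),
}


def decode_log_path(work_dir: str, backend: str) -> str:
    """Build a backend-aware log path (the layout is hypothetical)."""
    # Resolve the worker name for the selected backend, then lowercase it,
    # as the reviewed expression does.
    worker = WORKER_COMPONENT_NAMES[backend].decode_worker_k8s_name.lower()
    return str(PurePosixPath(work_dir) / worker / "0.log")
```

With this shape, supporting another backend only requires a new registry entry; the path-building code itself stays unchanged.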
403-403: LGTM! Per-deployment naming for selected prefill interpolation.
Consistent application of the deployment_name pattern for interpolation profiling.
495-498: Clean parameter organization for decode interpolation.
The parameter ordering maintains consistency with other DynamoDeploymentClient initializations while adding the required deployment_name.
513-513: LGTM! Consistent dynamic log path for interpolation phase.
The dynamic log path construction maintains consistency with the earlier decode profiling phase, ensuring proper backend-specific worker log access.
598-599: LGTM! CLI backend support expansion.
The addition of "sglang" to the choices list with updated help text properly extends the CLI interface to support the new backend option.
384e3c1 to b8ae707
b8ae707 to f41fdae
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Signed-off-by: Hongkuan Zhou <[email protected]>
Summary by CodeRabbit