[BUG]: SGLang Planner Cannot Detect Workers Due to Component Name Mismatch #3106

@GavinZhu-GMI

Describe the Bug

Problem Description

The SGLang planner fails to detect and manage SGLang workers during autoscaling operations, causing division by zero errors and preventing proper scaling functionality.

Root Cause

There's a mismatch between the component names that SGLang workers register under and what the planner expects to find:

SGLang workers register as:

  • Default workers: instances/dynamo/backend/generate
  • Prefill workers: instances/dynamo/prefill/generate (when using --disaggregation-mode prefill)

But the SGLang planner expects to find:

  • Prefill workers: instances/dynamo/worker/generate
  • Decode workers: instances/dynamo/decode/generate

This is defined in components/planner/src/dynamo/planner/defaults.py:

class SGLangComponentName:
    prefill_worker_component_name = "worker"  # ← Should be "prefill"
    decode_worker_component_name = "decode"   # ← Should be "backend"
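The mismatch can be checked mechanically. The following is a minimal sketch, not planner code: `endpoint_key` and `missing_components` are hypothetical helpers, the path format is copied from the registration paths quoted in this issue, and the "proposed" names come from the inline comments above rather than from a confirmed upstream fix.

```python
# Paths the SGLang workers register under, as reported in this issue.
REGISTERED = {
    "instances/dynamo/backend/generate",
    "instances/dynamo/prefill/generate",
}


def endpoint_key(component, namespace="dynamo", endpoint="generate"):
    """Hypothetical helper: build a path in the format shown in the logs."""
    return f"instances/{namespace}/{component}/{endpoint}"


def missing_components(prefill_name, decode_name):
    """Return the component names whose expected path is not registered."""
    names = [prefill_name, decode_name]
    return sorted(n for n in names if endpoint_key(n) not in REGISTERED)


# Names currently in SGLangComponentName: neither path is registered.
current = missing_components("worker", "decode")
# Names suggested by the inline comments: both paths are registered.
proposed = missing_components("prefill", "backend")

print(current)   # ['decode', 'worker']
print(proposed)  # []
```

With the current names the planner has nothing to find, which is consistent with the zero worker counts in the logs below.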

Comparison with vLLM

vLLM works correctly because its component names align with worker registration:

class VllmComponentName:
    prefill_worker_component_name = "prefill"  # Matches registration
    decode_worker_component_name = "backend"   # Matches registration

Logs

[INFO] planner_core.observe_metrics: Number of prefill workers: 0, number of decode workers: 0
[ERROR] planner_core.make_adjustments: Failed to correct prediction factors: float division by zero

Meanwhile, SGLang worker logs show successful registration:
[DEBUG] Starting endpoint: instances/dynamo/backend/generate:40239932a4066516
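The division by zero appears to follow directly from the worker counts of 0. The planner's actual correction-factor code is not shown in this issue, so the sketch below only illustrates the kind of guard that would turn the crash into an explicit "no workers found" result; `per_worker_load` and its parameters are hypothetical names.

```python
from typing import Optional


def per_worker_load(total_load: float, num_workers: int) -> Optional[float]:
    """Hypothetical helper: return None instead of dividing by zero."""
    if num_workers <= 0:
        # No workers detected; the caller should log this and skip the adjustment.
        return None
    return total_load / num_workers


no_workers = per_worker_load(120.0, 0)
four_workers = per_worker_load(120.0, 4)
print(no_workers)    # None
print(four_workers)  # 30.0
```

A guard like this would only make the failure easier to read. It does not fix the detection problem itself, which comes from the component names.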

Environment

  • Dynamo version: 0.4.1
  • Backend: SGLang
  • Deployment: Kubernetes with DynamoGraphDeployment
  • Planner: SLA-based autoscaling

Workaround

The current workaround is to manually specify --endpoint arguments in the SGLang worker configurations so that they override the default component names.

Steps to Reproduce

  1. Build the 0.4.1 image using the canonical SGLang Dockerfile.
  2. Deploy with a custom disagg_planner.yaml so that the planner runs alongside the workers.
  3. Observe that the planner does not detect any workers.

Expected Behavior

The SGLang planner should detect workers automatically, just as the vLLM planner does, by looking them up under the names they actually register with.

SGLang workers register as:

  • Default workers: instances/dynamo/backend/generate
  • Prefill workers: instances/dynamo/prefill/generate (when using --disaggregation-mode prefill)

Actual Behavior

  1. SGLang workers start and register successfully
  2. Planner starts and looks for workers at wrong endpoints
  3. Planner reports "Number of prefill workers: 0, number of decode workers: 0"
  4. Scaling calculations fail with "Failed to correct prediction factors: float division by zero"
  5. No autoscaling occurs
