-
Notifications
You must be signed in to change notification settings - Fork 689
feat: dynamo namespace isolation for backend component + vllm-fix #2490
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: dynamo namespace isolation for backend component + vllm-fix #2490
Conversation
…-for-backend-component--vllm-fixed
|
/ok to test c7b3957 |
WalkthroughIntroduces environment-driven namespaces across components. Adds DYN_NAMESPACE defaults, exposes namespace utilities, removes explicit namespace flags in deploys, updates vLLM arg parsing to fixed endpoint mapping, adds frontend namespace-scoped discovery, implements Rust-side namespace module and watcher filtering, and updates tests. Adds AMD64 precompiled vLLM wheel install path. Changes
Sequence Diagram(s)sequenceDiagram
participant Env as Environment
participant FE as Frontend
participant RT as Runtime
participant ETCD as Discovery (KV)
Env->>FE: DYN_NAMESPACE (optional)
FE->>FE: Parse args (--namespace or env or global)
FE->>RT: Build endpoint_id "<ns>.frontend.http"
RT->>ETCD: Watch models under namespace scope
ETCD-->>RT: Model events (scoped)
RT-->>FE: Available models/endpoints
FE->>Client: Serve HTTP using discovered models
sequenceDiagram
participant Watcher as ModelWatcher
participant ETCD as KV Stream
participant MM as ModelManager
Watcher->>Watcher: target_namespace = None or Some(ns)
ETCD-->>Watcher: Put/Delete events
Watcher->>Watcher: Parse ModelEntry
alt Put event
opt target_namespace=Some(ns) and not global
Watcher->>Watcher: if entry.ns != ns skip
end
Watcher->>MM: Add/Update model (with namespace)
else Delete event
Watcher->>MM: Remove model
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Poem
Tip 🔌 Remote MCP (Model Context Protocol) integration is now available!Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
Status, Documentation and Community
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🔭 Outside diff range comments (2)
components/frontend/src/dynamo/frontend/main.py (1)
175-181: Bug: trimmed metrics prefix is computed but not used.You compute
prefix = flags.metrics_prefix.strip()but set the env var to the untrimmed value. Useprefix.Apply this diff:
if flags.metrics_prefix is not None: prefix = flags.metrics_prefix.strip() if prefix: - os.environ["DYN_METRICS_PREFIX"] = flags.metrics_prefix + os.environ["DYN_METRICS_PREFIX"] = prefixcomponents/backends/sglang/deploy/disagg-multinode.yaml (1)
63-69: Prefill and decode are in different namespaces; this likely breaks disaggregation coordination.
prefillis configured withdynamoNamespace: sglang-disaggwhileFrontendanddecodeusesglang-disagg-multinode. Disaggregation workers typically must share the same namespace for discovery and routing.Proposed fix (update
prefillto match the others):# In prefill service dynamoNamespace: sglang-disagg-multinodeIf this divergence is intentional, add a brief comment explaining the cross-namespace design to avoid regressions later.
♻️ Duplicate comments (1)
components/backends/sglang/deploy/agg.yaml (1)
27-27: Same verification as disagg: ensure DYN_NAMESPACE is injected.The observation and verification steps in components/backends/sglang/deploy/disagg.yaml apply here as well.
Run the same script from the disagg.yaml comment to confirm env injection and fallback behavior.
🧹 Nitpick comments (19)
container/deps/vllm/install_vllm.sh (1)
151-156: Add integrity checks and robust cleanup for precompiled wheel; quote paths.
- No checksum verification for the downloaded wheel; add an optional SHA256 gate to prevent tampering/partial downloads.
- Use
trapto ensure the wheel is cleaned up even on installation failures, instead of best-effort cleanup at the end.- Quote paths to avoid word-splitting issues.
If you can provide a trusted sha256 for each wheel in S3, the script can enforce it.
Apply this diff:
- VLLM_TEMP_DIR=/tmp/vllm/wheel/$VLLM_REF - mkdir -p $VLLM_TEMP_DIR - REMOTE_WHEEL_URL=https://vllm-wheels.s3.us-west-2.amazonaws.com/${VLLM_REF}/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl - export VLLM_PRECOMPILED_WHEEL_LOCATION=$VLLM_TEMP_DIR/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl - rm -rf $VLLM_PRECOMPILED_WHEEL_LOCATION || true - curl -fS --retry 3 -L "$REMOTE_WHEEL_URL" -o "$VLLM_PRECOMPILED_WHEEL_LOCATION" + VLLM_TEMP_DIR="/tmp/vllm/wheel/$VLLM_REF" + mkdir -p "$VLLM_TEMP_DIR" + REMOTE_WHEEL_URL="https://vllm-wheels.s3.us-west-2.amazonaws.com/${VLLM_REF}/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl" + export VLLM_PRECOMPILED_WHEEL_LOCATION="$VLLM_TEMP_DIR/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl" + rm -f "$VLLM_PRECOMPILED_WHEEL_LOCATION" || true + curl -fS --retry 3 -L "$REMOTE_WHEEL_URL" -o "$VLLM_PRECOMPILED_WHEEL_LOCATION" + # Basic integrity checks: non-empty file and optional SHA256 verification if provided + test -s "$VLLM_PRECOMPILED_WHEEL_LOCATION" + if [ -n "${VLLM_WHEEL_SHA256:-}" ]; then + echo "$VLLM_WHEEL_SHA256 $VLLM_PRECOMPILED_WHEEL_LOCATION" | sha256sum -c - + fi + # Ensure cleanup on success/failure + trap 'rm -f "$VLLM_PRECOMPILED_WHEEL_LOCATION"' EXIT @@ - rm -rf $VLLM_PRECOMPILED_WHEEL_LOCATION + # Cleanup handled by trap aboveVerification to consider:
- Ensure you have or can generate per-ref SHA256 sums for the S3 wheels and wire them via
VLLM_WHEEL_SHA256.Also applies to: 162-162
components/backends/sglang/src/dynamo/sglang/common/sgl_utils.py (1)
106-107: Normalize DYN_NAMESPACE; treat blank/whitespace as global and add type hintsCurrent implementation returns an empty string if DYN_NAMESPACE is set but blank, which may propagate invalid/empty namespaces into runtime.namespace(...) or dyn:// endpoint construction elsewhere. Align with Rust is_global_namespace and avoid surprises by normalizing.
Apply this diff:
-def get_dynamo_namespace(): - return os.environ.get("DYN_NAMESPACE", "global") +def get_dynamo_namespace() -> str: + """ + Resolve the Dynamo namespace from environment. + - Uses the DYN_NAMESPACE env var if set and non-empty after trimming. + - Falls back to 'global' otherwise. + """ + ns = os.environ.get("DYN_NAMESPACE", "").strip() + return ns or "global"components/backends/sglang/src/dynamo/sglang/utils/clear_namespace.py (1)
43-50: Make CLI fallback robust and user-friendly; accept blank as 'global'
- The current flow asserts a non-empty namespace. Elsewhere (Rust) empty string maps to global, and other Python call sites may pass blanks inadvertently. Normalize env/CLI input and avoid a hard failure when omitted or blank.
- Also add a helpful help= message to --namespace.
Apply this diff:
- parser.add_argument("--namespace", type=str, required=False, default=None) + parser.add_argument( + "--namespace", + type=str, + required=False, + default=None, + help="Dynamo namespace (overrides $DYN_NAMESPACE). Defaults to 'global' if neither is set." + ) @@ - if not args.namespace: - args.namespace = os.environ.get("DYN_NAMESPACE") - assert ( - args.namespace - ), "Missing namespace, either pass --namespace or set DYN_NAMESPACE" + # Normalize: CLI wins; else env; treat blank/whitespace as global. + if not args.namespace: + args.namespace = os.environ.get("DYN_NAMESPACE", "") + args.namespace = (args.namespace or "").strip() or "global"lib/llm/src/namespace.rs (1)
7-9: Trim whitespace to avoid treating " " as a distinct namespaceNormalize the input to reduce surprises from env/config with stray whitespace. This keeps behavior consistent with Python-side normalization we’re adopting.
Apply this diff:
-pub fn is_global_namespace(namespace: &str) -> bool { - namespace == GLOBAL_NAMESPACE || namespace.is_empty() -} +pub fn is_global_namespace(namespace: &str) -> bool { + let ns = namespace.trim(); + ns.is_empty() || ns == GLOBAL_NAMESPACE +}components/backends/mocker/src/dynamo/mocker/main.py (1)
18-20: Normalize DYN_NAMESPACE to avoid dyn://.backend.generate when env is blankIf DYN_NAMESPACE is set to an empty or whitespace-only string, DEFAULT_ENDPOINT becomes invalid (dyn://.backend.generate). Normalize to default to global.
Apply this diff:
-DYN_NAMESPACE = os.environ.get("DYN_NAMESPACE", "global") -DEFAULT_ENDPOINT = f"dyn://{DYN_NAMESPACE}.backend.generate" +DYN_NAMESPACE = (os.environ.get("DYN_NAMESPACE", "").strip() or "global") +DEFAULT_ENDPOINT = f"dyn://{DYN_NAMESPACE}.backend.generate"Optional (for better testability): compute the default inside cmd_line_args to respect env changes at runtime rather than module-import time.
# Inside cmd_line_args(), before adding --endpoint: ns = (os.environ.get("DYN_NAMESPACE", "").strip() or "global") default_endpoint = f"dyn://{ns}.backend.generate" # Then use: parser.add_argument("--endpoint", type=str, default=default_endpoint, help=f"Dynamo endpoint string (default: {default_endpoint})")components/backends/vllm/src/dynamo/vllm/args.py (3)
108-110: Normalize DYN_NAMESPACE and keep endpoint derivation deterministic
- Treat blank/whitespace env values as global to avoid empty namespace propagation into ETCD contexts and discovery.
- Component/endpoint mapping looks good.
Apply this diff:
- config.namespace = os.environ.get("DYN_NAMESPACE", "global") + config.namespace = os.environ.get("DYN_NAMESPACE", "").strip() or "global" config.component = "prefill" if args.is_prefill_worker else "backend" config.endpoint = "generate"
64-67: Help text is outdated; there is no "provided namespace" CLI option anymoreThe message suggests a CLI-provided namespace, but we now derive namespace from DYN_NAMESPACE (default: global). Update the help to avoid confusion.
Apply this diff:
- parser.add_argument( - "--is-prefill-worker", - action="store_true", - help="Enable prefill functionality for this worker. Uses the provided namespace to construct dyn://namespace.prefill.generate", - ) + parser.add_argument( + "--is-prefill-worker", + action="store_true", + help="Enable prefill functionality for this worker. Namespace is read from $DYN_NAMESPACE (default: global). Endpoint will be dyn://{namespace}.prefill.generate", + )
185-190: Fix assert message to reference side_channel_port (not kv_port)The assertion checks side_channel_port but the message mentions kv_port. Minor clarity fix.
Apply this diff:
- assert ( - config.side_channel_port is not None - ), "Must set the kv_port, use configure_ports_with_etcd" + assert ( + config.side_channel_port is not None + ), "Must set side_channel_port; call configure_ports_with_etcd first"components/frontend/src/dynamo/frontend/main.py (3)
41-45: Keep namespace constants aligned across languages.
GLOBAL_NAMESPACE = "global"duplicates the Rust-side constant exposed via lib/llm/src/namespace.rs. This is fine for now, but it’s easy to drift. Consider centralizing the Python definition (e.g., a small shared Python module) or adding a brief comment referencing the Rust constant to reduce future mismatches.
125-130: Validate namespace input and document env-var fallback in help.
- Add validation to prevent accidental invalid values (whitespace, dots) that would form malformed endpoint_ids.
- Update the help string to explicitly mention the DYN_NAMESPACE env var fallback.
Apply this diff to improve the CLI arg:
parser.add_argument( "--namespace", - type=str, + type=validate_namespace, default=None, - help="Dynamo namespace for model discovery scoping. If specified, models will only be discovered from this namespace. If not specified, discovers models from all namespaces (global discovery).", + help="Dynamo namespace for model discovery scoping. If specified, models are discovered only from this namespace. If not specified, falls back to DYN_NAMESPACE env var; if unset or blank, uses 'global' (global discovery). Allowed characters: [A-Za-z0-9_-].", )Add this helper outside the selected lines (e.g., near other validators):
def validate_namespace(value: str) -> str: value = (value or "").strip() if not value: raise argparse.ArgumentTypeError("namespace must be non-empty if provided") # Disallow dots/whitespace; allow hyphens/underscores to align with examples like 'sglang-disagg' if not re.match(r"^[A-Za-z0-9][A-Za-z0-9_-]*$", value): raise argparse.ArgumentTypeError( f"namespace contains invalid characters: {value!r}. Allowed pattern: ^[A-Za-z0-9][A-Za-z0-9_-]*$" ) return value
165-167: Treat blank env var as unset and strip whitespace.If DYN_NAMESPACE is set to a blank or whitespace-only value, current code will pass that through and later default to GLOBAL_NAMESPACE, but you end up carrying around a misleading flags.namespace value. Strip and ignore blank here for clarity.
Apply this diff:
- if not flags.namespace: - flags.namespace = os.environ.get(DYNAMO_NAMESPACE_ENV_VAR) + if not flags.namespace: + env_ns = os.environ.get(DYNAMO_NAMESPACE_ENV_VAR) + if env_ns is not None: + env_ns = env_ns.strip() + if env_ns: + flags.namespace = env_nscomponents/backends/sglang/src/dynamo/sglang/decode_worker/main.py (1)
74-76: Make namespace resolution robust and observable.Treat blank env values as unset, and log the resolved namespace for easier debugging.
Apply this diff:
- component = runtime.namespace(get_dynamo_namespace()).component("decode") - await component.create_service() + ns = (get_dynamo_namespace() or "").strip() or "global" + logging.info(f"Using namespace: '{ns}' for decode worker") + component = runtime.namespace(ns).component("decode") + await component.create_service()Optionally, harden the source utility outside this file by trimming and defaulting there:
# components/backends/sglang/src/dynamo/sglang/common/sgl_utils.py def get_dynamo_namespace(): return (os.environ.get("DYN_NAMESPACE") or "").strip() or "global"lib/llm/tests/http_namespace_integration.rs (3)
34-101: Filtering logic mirrors intended isolation semantics; minor duplication nit.The “global includes all; specific ns filters strictly to that ns” behavior is explicit. Consider extracting the filtering predicate into a small helper to reduce repetition across tests.
Example refactor (outside this hunk):
fn visible<'a>(models: &'a [ModelEntry], ns: &str) -> Vec<&'a ModelEntry> { let is_global = is_global_namespace(ns); models.iter().filter(|m| is_global || m.endpoint.namespace == ns).collect() }
104-153: Good coverage of endpoint parsing and namespace classification; add one invalid parse case.Parsing only valid triples misses negative cases (e.g., missing segments). Adding one assertion for an invalid endpoint ID will harden the contract and protect against regressions.
Suggested addition (outside this hunk):
#[test] fn test_endpoint_id_invalid_formats() { let invalids = ["no-dots", "too.many.dots.here", ".leading.dot", "trailing.dot.", "ns..name"]; for s in invalids { assert!(s.parse::<Endpoint>().is_err(), "Expected parse error for '{s}'"); } }
155-222: Scoping scenarios reflect strict namespace isolation; consider asserting absence of “global” under scoped frontend.You already verify the count is 2; adding an assertion that the “global” entry is not present can catch accidental inclusion if logic changes.
For example:
assert!(visible_models.iter().all(|m| m.endpoint.namespace != GLOBAL_NAMESPACE));lib/llm/src/entrypoint/input/common.rs (1)
78-83: Confirm intended scope: dynamic-input path now watches all namespaces.Unlike the HTTP path (which filters by namespace), this path explicitly opts out of filtering. Please confirm this is intentional for the dynamic input flow; if not, consider passing the local model’s namespace similar to HTTP.
components/backends/sglang/src/dynamo/sglang/worker/main.py (1)
336-339: Sanitize namespace from env to avoid accidental whitespace/empty values.If
DYN_NAMESPACEis set to a blank/whitespace value, you’ll end up registering under that literal namespace (not “global”), which is likely unintended and inconsistent with Rust-sideis_global_namespacesemantics. Strip and default here.Apply this diff:
- dynamo_namespace = get_dynamo_namespace() + # Normalize: treat empty/whitespace as "global" + dynamo_namespace = (get_dynamo_namespace() or "global").strip() or "global"lib/llm/src/discovery/watcher.rs (1)
91-94: Avoid relying on Option::is_none_or for broader MSRV compatibility.
Option::is_none_ormay not be available on older toolchains. Usemap_or(true, ...)to keep behavior identical without bumping MSRV.Apply this diff:
- let global_namespace = target_namespace.is_none_or(is_global_namespace); + let global_namespace = target_namespace.map_or(true, is_global_namespace);lib/llm/src/entrypoint/input/http.rs (1)
176-186: WatcherConfig packaging is clean; consider clarifying doc re: “empty namespace”.Docs mention “empty or equals GLOBAL_NAMESPACE”, but in this call site you only ever pass
None(for global) orSome(non-global). Consider updating the comment to “If target_namespace is None or equals GLOBAL_NAMESPACE/empty on the watcher side, discover all namespaces” to avoid confusion.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (23)
components/backends/mocker/src/dynamo/mocker/main.py(2 hunks)components/backends/sglang/deploy/agg.yaml(1 hunks)components/backends/sglang/deploy/agg_router.yaml(1 hunks)components/backends/sglang/deploy/disagg-multinode.yaml(1 hunks)components/backends/sglang/deploy/disagg.yaml(1 hunks)components/backends/sglang/deploy/disagg_planner.yaml(1 hunks)components/backends/sglang/src/dynamo/sglang/common/__init__.py(2 hunks)components/backends/sglang/src/dynamo/sglang/common/sgl_utils.py(2 hunks)components/backends/sglang/src/dynamo/sglang/decode_worker/main.py(2 hunks)components/backends/sglang/src/dynamo/sglang/utils/clear_namespace.py(2 hunks)components/backends/sglang/src/dynamo/sglang/worker/main.py(3 hunks)components/backends/vllm/README.md(0 hunks)components/backends/vllm/deploy/README.md(0 hunks)components/backends/vllm/src/dynamo/vllm/args.py(1 hunks)components/frontend/src/dynamo/frontend/main.py(6 hunks)container/deps/vllm/install_vllm.sh(1 hunks)lib/llm/src/discovery/watcher.rs(4 hunks)lib/llm/src/entrypoint/input/common.rs(1 hunks)lib/llm/src/entrypoint/input/http.rs(5 hunks)lib/llm/src/lib.rs(1 hunks)lib/llm/src/namespace.rs(1 hunks)lib/llm/tests/http_namespace_integration.rs(1 hunks)lib/llm/tests/namespace.rs(1 hunks)
💤 Files with no reviewable changes (2)
- components/backends/vllm/README.md
- components/backends/vllm/deploy/README.md
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-07-28T17:00:07.968Z
Learnt from: biswapanda
PR: ai-dynamo/dynamo#2137
File: components/backends/sglang/deploy/agg_router.yaml:0-0
Timestamp: 2025-07-28T17:00:07.968Z
Learning: In components/backends/sglang/deploy/agg_router.yaml, the clear_namespace command is intentionally designed to block the router from starting if it fails (using &&). This is a deliberate design decision where namespace clearing is a critical prerequisite and the router should not start with an uncleared namespace.
Applied to files:
components/backends/sglang/deploy/disagg_planner.yamlcomponents/backends/sglang/deploy/disagg-multinode.yamlcomponents/backends/sglang/deploy/agg_router.yamlcomponents/backends/sglang/deploy/agg.yamlcomponents/backends/sglang/deploy/disagg.yaml
📚 Learning: 2025-07-21T00:10:56.947Z
Learnt from: zaristei
PR: ai-dynamo/dynamo#2020
File: container/deps/vllm/install_vllm.sh:115-118
Timestamp: 2025-07-21T00:10:56.947Z
Learning: Graceful fallback for PyTorch wheel installation is broken on ARM architecture, so immediate exit on pinned version failure is preferred over fallback mechanisms in container/deps/vllm/install_vllm.sh for ARM64.
Applied to files:
container/deps/vllm/install_vllm.sh
📚 Learning: 2025-07-22T10:22:28.972Z
Learnt from: ptarasiewiczNV
PR: ai-dynamo/dynamo#2027
File: container/deps/vllm/install_vllm.sh:0-0
Timestamp: 2025-07-22T10:22:28.972Z
Learning: The `--torch-backend=auto` flag works with vLLM installations via uv pip install, even though it's not a standard pip option. This flag is processed by vLLM's build system during installation to automatically match PyTorch distribution with container CUDA versions.
Applied to files:
container/deps/vllm/install_vllm.sh
🧬 Code Graph Analysis (10)
lib/llm/src/lib.rs (5)
lib/runtime/src/distributed.rs (1)
namespace(185-187)lib/runtime/src/component.rs (2)
namespace(218-220)namespace(602-609)lib/bindings/python/rust/lib.rs (1)
namespace(286-291)lib/bindings/python/src/dynamo/_core.pyi (1)
namespace(37-41)lib/llm/src/kv_router/publisher.rs (1)
namespace(988-989)
components/backends/sglang/src/dynamo/sglang/common/__init__.py (1)
components/backends/sglang/src/dynamo/sglang/common/sgl_utils.py (1)
get_dynamo_namespace(106-107)
lib/llm/tests/http_namespace_integration.rs (1)
lib/llm/src/namespace.rs (1)
is_global_namespace(7-9)
components/backends/sglang/src/dynamo/sglang/utils/clear_namespace.py (1)
components/backends/sglang/src/dynamo/sglang/utils/sgl_http_server.py (1)
parse_args(179-194)
lib/llm/src/discovery/watcher.rs (1)
lib/llm/src/namespace.rs (1)
is_global_namespace(7-9)
lib/llm/tests/namespace.rs (2)
lib/llm/src/namespace.rs (1)
is_global_namespace(7-9)lib/llm/src/discovery/watcher.rs (2)
serde_json(98-98)serde_json(433-433)
components/backends/sglang/src/dynamo/sglang/decode_worker/main.py (1)
components/backends/sglang/src/dynamo/sglang/common/sgl_utils.py (1)
get_dynamo_namespace(106-107)
lib/llm/src/entrypoint/input/http.rs (2)
lib/llm/src/namespace.rs (1)
is_global_namespace(7-9)lib/llm/src/discovery/watcher.rs (1)
new(60-74)
components/frontend/src/dynamo/frontend/main.py (3)
lib/bindings/python/rust/lib.rs (2)
namespace(286-291)get(363-379)lib/bindings/python/src/dynamo/_core.pyi (2)
namespace(37-41)get(127-140)lib/runtime/src/component/client.rs (1)
is_static(129-131)
components/backends/sglang/src/dynamo/sglang/worker/main.py (3)
components/backends/sglang/src/dynamo/sglang/common/sgl_utils.py (1)
get_dynamo_namespace(106-107)lib/bindings/python/rust/lib.rs (2)
component(543-549)namespace(286-291)lib/bindings/python/src/dynamo/_core.pyi (2)
component(187-191)namespace(37-41)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Build and Test - dynamo
- GitHub Check: pre-merge-rust (lib/runtime/examples)
- GitHub Check: pre-merge-rust (.)
- GitHub Check: pre-merge-rust (lib/bindings/python)
🔇 Additional comments (29)
container/deps/vllm/install_vllm.sh (1)
153-155: Confirm Python ABI compatibility with the precompiled wheel tag.The wheel name is tagged
cp38-abi3. That requires the runtime to be CPython >= 3.8 and compatible with the stable ABI. If our base images run Python < 3.8 or PyPy, installation will fail at runtime. Please confirm our build images use CPython 3.8+.If helpful, I can add a guard to the script to assert Python version >= 3.8 before proceeding with the precompiled path and fall back to source build otherwise. Do you want me to push that change?
components/backends/sglang/src/dynamo/sglang/common/sgl_utils.py (1)
7-7: Import looks fineAdding os is necessary for environment-driven namespace. No issues here.
lib/llm/src/namespace.rs (1)
5-5: Constant is clear and consistentGLOBAL_NAMESPACE = "global" provides a single source of truth in Rust code.
components/frontend/src/dynamo/frontend/main.py (3)
22-22: LGTM: logging import is appropriate.Importing logging is necessary for the new startup messages.
206-215: LGTM: dynamic endpoint_id construction is consistent with namespacing.
- Scoped discovery when a namespace is provided: ".frontend.http"
- Defaults to ".frontend.http" when absent
229-241: LGTM: helpful startup logs.The messages clearly indicate static vs dynamic and the scope used, which will aid troubleshooting.
lib/llm/src/lib.rs (1)
30-30: LGTM: public namespace module export.Exposing
pub mod namespace;is straightforward and aligns with tests and watcher usage.components/backends/sglang/deploy/disagg.yaml (1)
27-27: DYN_NAMESPACE injection and fallback verified
- The operator’s
component_common.go(lines 63–68) injectsDYN_NAMESPACEfromdynamoNamespace.clear_namespace(clear_namespace.py, lines 43–50) falls back toos.environ["DYN_NAMESPACE"]when--namespaceis omitted.- The frontend CLI (
main.py, namespace flag) similarly defaults to theDYN_NAMESPACEenv var if--namespaceisn’t provided.All tests in
component_planner_test.goandgraph_test.gocover the operator injection. No further changes needed.components/backends/sglang/src/dynamo/sglang/decode_worker/main.py (1)
20-20: LGTM: dynamic namespace import.Importing
get_dynamo_namespacealigns this worker with the new environment-driven namespace model.components/backends/sglang/deploy/disagg-multinode.yaml (2)
29-29: Namespace flag removal looks good; keep the startup gate via &&.Dropping the explicit --namespace aligns with env-driven discovery. The
&&ensures the frontend won’t start if namespace clearing fails, which matches the intended behavior.Please confirm that the operator injects
DYN_NAMESPACEfor this pod (derived fromdynamoNamespace: sglang-disagg-multinode). If not injected,clear_namespacemay default to the global namespace and clear the wrong scope.
20-22: Verify componentType usage for Frontend (“main” vs “frontend”).Other manifests use
componentType: frontend. Usingmainhere may alter service naming or discovery labels depending on the CRD’s conventions.If the CRD expects
frontendfor the HTTP entrypoint, consider aligning it for consistency. Otherwise, please confirmmainis the intended role for this graph and won’t affect discovery or service exposure.components/backends/sglang/deploy/agg_router.yaml (1)
27-27: LGTM: env-driven clear_namespace and startup gating preserved.Removing
--namespaceis fine ifDYN_NAMESPACEis provided by the operator fromdynamoNamespace. The&&ensures failure to clear blocks startup, which matches prior design.Please confirm
DYN_NAMESPACEinjection for this Frontend pod; otherwise,clear_namespacewould fall back to the global namespace.components/backends/sglang/deploy/disagg_planner.yaml (2)
34-34: LGTM: clear_namespace now relies on env; startup still gated.The change aligns with the env-driven namespace approach, and the
&&keeps the critical “do not start without clearing” behavior.
12-16: Environment variable name may be inconsistent with code (DYNAMO_NAMESPACEvsDYN_NAMESPACE).Runtime code and helpers read
DYN_NAMESPACE(per this PR). This manifest setsDYNAMO_NAMESPACE, which may be ignored unless separately consumed.
- If the operator injects
DYN_NAMESPACEfromdynamoNamespace, you can dropDYNAMO_NAMESPACEhere to reduce confusion.- If you intend to override via env, add
DYN_NAMESPACE: "dynamo"explicitly or renameDYNAMO_NAMESPACEtoDYN_NAMESPACE.I can generate a search script to find remaining uses of
DYNAMO_NAMESPACEacross the repo if helpful.components/backends/sglang/src/dynamo/sglang/common/__init__.py (2)
16-23: Expose get_dynamo_namespace from common: good addition.Importing
get_dynamo_namespacehere provides a stable, central import path for callers (from dynamo.sglang.common import get_dynamo_namespace).
25-40: all updated correctly to export get_dynamo_namespace.Public API surface now includes the namespace helper alongside existing utilities and types. No issues spotted.
lib/llm/tests/http_namespace_integration.rs (2)
15-32: Helper to construct ModelEntry is clear and reusable.The helper keeps tests concise and standardizes Endpoint construction across scenarios.
224-275: Boundary tests are solid; case sensitivity check is valuable.Treating empty as global and enforcing lower-case “global” matches the current
is_global_namespacecontract.lib/llm/src/entrypoint/input/common.rs (1)
80-81: Updated watcher call aligns with new API (OK).Passing
Nonefor the optional namespace parameter correctly preserves the previous behavior (no filtering). Looks good.lib/llm/tests/namespace.rs (5)
11-47: Namespace helpers are well-covered.Good coverage for
GLOBAL_NAMESPACE, empty string, whitespace, and case sensitivity. This aligns with the currentis_global_namespaceimplementation.
69-101: ModelEntry construction tests are clear and correct.These validate
Endpointfields andModelTyperound-trip well. Solid.
102-165: Filtering logic mirrors runtime watcher behavior (OK).Using
is_global_namespaceto decide whether to include all is consistent with the watcher’s semantics. Nice.
167-194: Serialization round-trip test is valuable.Ensures etcd payload compatibility. No issues.
195-213: Endpoint parsing tests are helpful.Covering both dot and slash separators is great for robustness. Assuming
Endpoint::fromhandles both forms, this should pass.components/backends/sglang/src/dynamo/sglang/worker/main.py (2)
36-40: Importingget_dynamo_namespace(OK).This keeps the resolution centralized. Good.
346-352: Decode client discovery uses the same namespace (OK).Consistent with worker component creation; this avoids cross-namespace leakage.
lib/llm/src/discovery/watcher.rs (1)
141-158: Structured logs with namespace context (nice).Including
namespacein add/update logs improves observability across namespaces.lib/llm/src/entrypoint/input/http.rs (2)
41-48: Namespace scoping logic is correct and non-intrusive.Using
is_global_namespaceto decide between Some(namespace) vs None aligns with the watcher’s behavior. Passingas_deref()later is a clean handoff.
188-227: run_watcher task orchestration looks good.
- Using a channel to toggle HTTP endpoints on model type updates is solid.
- The watcher task passes
target_namespace.as_deref()correctly.No issues.
| config.namespace = os.environ.get("DYN_NAMESPACE", "global") | ||
| config.component = "prefill" if args.is_prefill_worker else "backend" | ||
| config.endpoint = "generate" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
🧩 Analysis chain
Repository-wide sanity check for lingering defaults and env var name drift
Given the migration to DYN_NAMESPACE and deprecation of hard-coded dyn://dynamo.* endpoints, verify there are no remaining references to:
- dyn://dynamo.*
- DYNAMO_NAMESPACE (old env var)
- DEFAULT_ENDPOINT values that may capture stale env at import-time
Run this script:
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Searching for hard-coded dyn://dynamo.* endpoints..."
rg -nP -C2 'dyn://dynamo\.' -g '!**/node_modules/**'
echo
echo "Searching for old env var name DYNAMO_NAMESPACE..."
rg -nP -C3 '\bDYNAMO_NAMESPACE\b' -g '!**/node_modules/**'
echo
echo "Searching for module-level DEFAULT_ENDPOINT definitions (may capture env too early)..."
rg -nP -C2 '^\s*DEFAULT_ENDPOINT\s*=\s*f?[\'"]dyn://' --type=python
echo
echo "Review env namespace retrievals; ensure blank/whitespace are normalized to 'global'..."
rg -nP -C2 'os\.environ\.get\(\s*["\']DYN_NAMESPACE["\']\s*(?:,\s*["\'][^"\']*["\']\s*)?\)' --type=pythonLength of output: 10567
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Searching for module-level DEFAULT_ENDPOINT definitions (may capture env too early)..."
rg -nP -C2 --type=python '^\s*DEFAULT_ENDPOINT\s*=\s*f?["\'].*dyn://' -g '!**/node_modules/**'
echo
echo "Searching for os.environ.get usages of DYN_NAMESPACE..."
rg -nP -C2 --type=python "os\.environ\.get\(\s*['\"]DYN_NAMESPACE['\"]" -g '!**/node_modules/**'Length of output: 319
Remove hard-coded dyn://dynamo. endpoints & deprecated DYNAMO_NAMESPACE*
We ran a repo-wide search and found numerous lingering defaults. Please update or remove all hard-coded dyn://dynamo.* URLs and switch fully to the new environment variable, and deprecate the old DYNAMO_NAMESPACE. Also ensure that DYN_NAMESPACE values that are empty or whitespace normalize back to "global".
Locations requiring attention:
- Examples & docs (remove/update default endpoints):
• lib/bindings/python/examples/hello_world/* –DEFAULT_ENDPOINT = "dyn://dynamo.backend.generate"
• docs/guides/dynamo_run.md – multiplein=/out=dyn://dynamo.*usages
• examples/multimodal_v1/… –DEFAULT_ENDPOINT,DEFAULT_DOWNSTREAM_ENDPOINT, worker default endpoints - Backends:
• components/backends/llama_cpp/src/dynamo/llama_cpp/main.py –DEFAULT_ENDPOINT = "dyn://dynamo.backend.generate"
• components/backends/trtllm/src/dynamo/trtllm/utils/trtllm_utils.py – severalDEFAULT_ENDPOINT = "dyn://dynamo.*" - Deployment configs & defaults (remove old env var):
• components/backends/vllm/deploy/disagg_planner.yaml –DYNAMO_NAMESPACE
• components/backends/sglang/deploy/disagg_planner.yaml –DYNAMO_NAMESPACE
• components/planner/src/dynamo/planner/defaults.py –namespace = os.environ.get("DYNAMO_NAMESPACE", …)
Suggested fixes:
- Replace all hard-coded
dyn://dynamo.*with a parameterized endpoint (e.g. derived from a newDYN_ENDPOINTor passed via CLI). - Remove references to
DYNAMO_NAMESPACE; use onlyos.environ.get("DYN_NAMESPACE", "global")with.strip()and a fallback when the result is empty.
| fi | ||
| else | ||
| echo "Installing vllm for AMD64 architecture" | ||
| uv pip install torch==2.7.1+cu128 torchaudio==2.7.1 torchvision==0.22.1 --index-url https://download.pytorch.org/whl/cu128 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Respect TORCH_BACKEND when installing PyTorch (avoid hardcoded cu128).
The AMD64 path hardcodes the cu128 channel, ignoring the --torch-backend argument and TORCH_BACKEND env var. This will silently install CUDA 12.8 wheels even if a different backend (e.g., cpu) is requested, leading to mismatches. Make the index URL and torch spec conditional on TORCH_BACKEND.
Apply this diff:
- uv pip install torch==2.7.1+cu128 torchaudio==2.7.1 torchvision==0.22.1 --index-url https://download.pytorch.org/whl/cu128
+ # Respect --torch-backend for PyTorch channel and version tag
+ if [ "$TORCH_BACKEND" = "cpu" ]; then
+ PYTORCH_INDEX_URL="https://download.pytorch.org/whl/cpu"
+ TORCH_SPEC="torch==2.7.1"
+ else
+ PYTORCH_INDEX_URL="https://download.pytorch.org/whl/${TORCH_BACKEND}"
+ TORCH_SPEC="torch==2.7.1+${TORCH_BACKEND}"
+ fi
+ uv pip install "$TORCH_SPEC" torchaudio==2.7.1 torchvision==0.22.1 --index-url "$PYTORCH_INDEX_URL"Note: Given past learnings, defaulting --torch-backend=auto during vLLM install also works well via uv; consider using auto as the default unless you have a specific need to pin.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| uv pip install torch==2.7.1+cu128 torchaudio==2.7.1 torchvision==0.22.1 --index-url https://download.pytorch.org/whl/cu128 | |
| # Respect --torch-backend for PyTorch channel and version tag | |
| if [ "$TORCH_BACKEND" = "cpu" ]; then | |
| PYTORCH_INDEX_URL="https://download.pytorch.org/whl/cpu" | |
| TORCH_SPEC="torch==2.7.1" | |
| else | |
| PYTORCH_INDEX_URL="https://download.pytorch.org/whl/${TORCH_BACKEND}" | |
| TORCH_SPEC="torch==2.7.1+${TORCH_BACKEND}" | |
| fi | |
| uv pip install "$TORCH_SPEC" torchaudio==2.7.1 torchvision==0.22.1 --index-url "$PYTORCH_INDEX_URL" |
🤖 Prompt for AI Agents
In container/deps/vllm/install_vllm.sh around line 150, the pip install line
hardcodes the cu128 CUDA wheel; change it to read the --torch-backend argument
or the TORCH_BACKEND env var (default to "auto"), and set the torch package spec
and --index-url conditionally based on that value (e.g., use +cu128 and the
cu128 index for cu128, +cpu and the cpu index for cpu, and for "auto" choose a
safe default that doesn't force cu128). Replace the literal pip args with
variables computed from TORCH_BACKEND so the uv pip install command uses the
correct wheel tag and index URL instead of the hardcoded cu128.
| // Filter by namespace if target_namespace is specified | ||
| if let Some(target_ns) = target_namespace { | ||
| if !global_namespace && model_entry.endpoint.namespace != target_ns { | ||
| tracing::debug!( | ||
| model_namespace = model_entry.endpoint.namespace, | ||
| target_namespace = target_ns, | ||
| model_name = model_entry.name, | ||
| "Skipping model from different namespace" | ||
| ); | ||
| continue; | ||
| } | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Asymmetric filtering causes noisy errors on Delete events.
You skip saving entries for non-target namespaces on Put events, but still process Delete events for those keys. This leads to “Missing ModelEntry for {key}” errors on deletes for namespaces you never tracked. Prefer to silently ignore deletes for unknown keys (or filter deletes similarly).
Two possible fixes:
- Minimal: Treat missing entry on delete as a no-op with a debug log.
- Broader: Only process deletes for keys you have tracked (requires checking manager state before calling
handle_delete).
Minimal fix (change outside the edited hunk; proposed patch):
@@
async fn handle_delete(&self, kv: &KeyValue) -> anyhow::Result<Option<String>> {
let key = kv.key_str()?;
- let model_entry = match self.manager.remove_model_entry(key) {
- Some(entry) => entry,
- None => {
- anyhow::bail!("Missing ModelEntry for {key}");
- }
- };
+ let Some(model_entry) = self.manager.remove_model_entry(key) else {
+ // When watching a specific namespace we may never have tracked this key.
+ tracing::debug!("Delete for unknown model key {key}, ignoring");
+ return Ok(None);
+ };Also applies to: 170-181
🤖 Prompt for AI Agents
In lib/llm/src/discovery/watcher.rs around lines 113-125 (and similarly
170-181), Delete events for keys in namespaces you never tracked produce noisy
"Missing ModelEntry" errors because Put events skipped non-target namespaces but
Delete events are still processed; update the Delete handling to treat a missing
entry as a no-op: before erroring when a ModelEntry lookup fails on Delete,
check if the key is unknown and simply emit a debug log and return early (or
guard deletes by first checking manager state for the tracked key) so deletes
for untracked namespaces are ignored silently.
Overview:
Details:
Where should the reviewer start?
Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
Summary by CodeRabbit
New Features
Refactor
Documentation
Chores
Tests