Skip to content

Conversation

@tzulingk
Copy link
Contributor

@tzulingk tzulingk commented Aug 9, 2025

Overview:

Add a "model" label to Component metrics.

Details:

This pull request introduces model-specific metrics by adding a model field to the Component struct and updating the metrics labeling logic.

Changes

  • Component struct: The Component struct has been updated to include an optional model field.
  • create_metrics() function: This function now checks if the metrics registry contains a model name. If present, the model name is added as a label to the generated metrics.
  • Related code: All affected code paths have been modified to handle the new model field.

Vllm metrics before this PR

# HELP dynamo_component_response_bytes_total Total number of bytes sent in responses by work handler
# TYPE dynamo_component_response_bytes_total counter
dynamo_component_response_bytes_total{dynamo_component="backend",dynamo_endpoint="clear_kv_blocks",dynamo_namespace="dynamo"} 0
dynamo_component_response_bytes_total{dynamo_component="backend",dynamo_endpoint="generate",dynamo_namespace="dynamo"} 0
dynamo_component_response_bytes_total{dynamo_component="backend",dynamo_endpoint="load_metrics",dynamo_namespace="dynamo"} 0

Vllm metrics after this PR
Note that we have model="Qwen/Qwen3-0.6B" label now.

# HELP dynamo_component_response_bytes_total Total number of bytes sent in responses by work handler
# TYPE dynamo_component_response_bytes_total counter
dynamo_component_response_bytes_total{dynamo_component="backend",dynamo_endpoint="clear_kv_blocks",dynamo_namespace="dynamo",model="Qwen/Qwen3-0.6B"} 0
dynamo_component_response_bytes_total{dynamo_component="backend",dynamo_endpoint="generate",dynamo_namespace="dynamo",model="Qwen/Qwen3-0.6B"} 0
dynamo_component_response_bytes_total{dynamo_component="backend",dynamo_endpoint="load_metrics",dynamo_namespace="dynamo",model="Qwen/Qwen3-0.6B"} 0

Where should the reviewer start?

lib/runtime/src/component.rs: model is added to Component/Endpoint.
lib/runtime/src/metrics.rs: how model label is added to the metrics.

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

DIS-360 Add a "model" label to Component metrics

Summary by CodeRabbit

  • New Features

    • Added support for specifying an optional model name when selecting or configuring components across the platform.
    • Model names are now included as labels in metrics for enhanced observability.
  • Bug Fixes

    • Ensured components not associated with a model are explicitly handled as such.
  • Documentation

    • Added clarifying comments regarding model association for components in tests and code.
  • Refactor

    • Updated configuration and initialization flows to propagate model name information where applicable.
    • Adjusted method and struct signatures to support optional model name parameters.

@copy-pr-bot
Copy link

copy-pr-bot bot commented Aug 9, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Aug 9, 2025

Walkthrough

This change set standardizes the handling of model names across multiple components by updating the component method signatures and their invocations to accept an optional model name parameter. Structs, configuration objects, and trait implementations are adjusted accordingly, and the model name is now propagated through component creation, metrics labeling, and engine instantiation across the codebase.

Changes

Cohort / File(s) Change Summary
Component Method Model Parameter
lib/runtime/src/component.rs, lib/bindings/python/rust/lib.rs
The component method in Namespace is updated to accept an optional model name parameter. The Component struct and related builder, getter, and metrics methods are updated to store and expose this model name. Python bindings are updated to match the new method signature.
Metrics and Auto-labeling
lib/runtime/src/metrics.rs
The MetricsRegistry trait gains an optional model() method. The create_metric function now auto-labels metrics with the model name if present, after sanitizing it.
Component Usage Updates
components/backends/vllm/src/dynamo/vllm/main.py, components/metrics/src/bin/mock_worker.rs, components/metrics/src/main.rs, components/router/src/main.rs, lib/bindings/c/src/lib.rs, lib/llm/src/discovery/watcher.rs, lib/llm/src/entrypoint/input/common.rs, lib/llm/src/entrypoint/input/endpoint.rs, lib/llm/src/entrypoint/input/http.rs
All invocations of the component method are updated to supply the new model name argument, either as Some(model_name) or None, depending on context. Comments are added where components are intentionally model-agnostic.
Configuration and CLI
components/metrics/src/lib.rs, components/metrics/src/main.rs
The metrics worker and CLI argument structs are extended to include a model name field, which is propagated through configuration and used when resolving components.
Mocker Engine Model Handling
lib/llm/src/mocker/engine.rs, launch/dynamo-run/src/lib.rs, lib/bindings/python/rust/llm/entrypoint.rs
The mocker engine and related functions are updated to accept and propagate an optional model name. Integration tests and engine instantiation pass the model name as needed.
Integration Test Updates
lib/llm/src/kv_router/prefill_counter.rs, lib/llm/src/kv_router/sequence.rs
Integration tests are updated to explicitly pass None for model name in component creation where appropriate, with clarifying comments added.

Sequence Diagram(s)

sequenceDiagram
    participant CLI/User
    participant Config
    participant Namespace
    participant Component
    participant Metrics

    CLI/User->>Config: Provide component and model name
    Config->>Namespace: component(name, Some(model_name))
    Namespace->>Component: Create with name, model_name
    Component-->>Namespace: Return Component instance
    Namespace-->>Config: Return Component
    Component->>Metrics: model()
    Metrics->>Metrics: Auto-label with model if present
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐇
A model hops from place to place,
Now every component knows its face.
With names passed down the code so neat,
Metrics and engines skip a beat.
From CLI to test, the change is clear—
Components and models, always near!
Hoppity hop, the code’s in gear!

Note

🔌 MCP (Model Context Protocol) integration is now available in Early Access!

Pro users can now connect to remote MCP servers under the Integrations page to get reviews and chat conversations that understand additional development context.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🔭 Outside diff range comments (4)
components/metrics/src/lib.rs (3)

350-399: Add "model" to all metric label schemas (gauge/counter vecs)

To expose the model dimension, extend the label arrays by one position. This change is breaking (cardinality increase) and will create new time series; plan dashboards/alerts accordingly.

-                &["component", "endpoint", "worker_id"]
+                &["component", "endpoint", "worker_id", "model"]
@@
-                &["component", "endpoint", "worker_id"]
+                &["component", "endpoint", "worker_id", "model"]
@@
-                &["component", "endpoint", "worker_id"]
+                &["component", "endpoint", "worker_id", "model"]
@@
-                &["component", "endpoint", "worker_id"]
+                &["component", "endpoint", "worker_id", "model"]
@@
-                &["component", "endpoint"]
+                &["component", "endpoint", "model"]
@@
-                &["component", "endpoint"]
+                &["component", "endpoint", "model"]
@@
-                &["component", "endpoint", "worker_id"]
+                &["component", "endpoint", "worker_id", "model"]
@@
-                &["component", "endpoint", "worker_id"]
+                &["component", "endpoint", "worker_id", "model"]
@@
-                &["component", "endpoint", "worker_id"]
+                &["component", "endpoint", "worker_id", "model"]

403-413: Pass the model label value at call sites

Use config.model_name when setting or incrementing metrics. Default to "unknown" when None.

 fn set_worker_gauge(
@@
-        gauge
-            .with_label_values(&[&config.component_name, &config.endpoint_name, worker_id])
+        let model = config.model_name.as_deref().unwrap_or("unknown");
+        gauge
+            .with_label_values(&[
+                &config.component_name,
+                &config.endpoint_name,
+                worker_id,
+                model,
+            ])
             .set(value);
@@
 fn increment_worker_counter(
@@
-        counter
-            .with_label_values(&[&config.component_name, &config.endpoint_name, worker_id])
+        let model = config.model_name.as_deref().unwrap_or("unknown");
+        counter
+            .with_label_values(&[
+                &config.component_name,
+                &config.endpoint_name,
+                worker_id,
+                model,
+            ])
             .inc_by(value);
@@
 fn set_endpoint_gauge(
@@
-        gauge
-            .with_label_values(&[&config.component_name, &config.endpoint_name])
+        let model = config.model_name.as_deref().unwrap_or("unknown");
+        gauge
+            .with_label_values(&[&config.component_name, &config.endpoint_name, model])
             .set(value);

Also applies to: 416-426, 428-438


516-536: Include "model" when reading cumulative counters

CounterVec::with_label_values must match the updated label schema.

-        let cumulative_isl = self
+        let model = config.model_name.as_deref().unwrap_or("unknown");
+        let cumulative_isl = self
             .kv_hit_rate_isl_blocks
             .with_label_values(&[
                 &config.component_name,
                 &config.endpoint_name,
                 &worker_id_str,
+                model,
             ])
             .get();
@@
-        let cumulative_overlap = self
+        let cumulative_overlap = self
             .kv_hit_rate_overlap_blocks
             .with_label_values(&[
                 &config.component_name,
                 &config.endpoint_name,
                 &worker_id_str,
+                model,
             ])
             .get();

Also applies to: 525-533

launch/dynamo-run/src/lib.rs (1)

137-143: Fix rustfmt failure by wrapping long call

CI reports cargo fmt differences (Lines 137-143). The long make_mocker_engine call should be line-broken per rustfmt.

-            let engine =
-                dynamo_llm::mocker::engine::make_mocker_engine(drt, endpoint, Some(local_model.card().slug().to_string()), args).await?;
+            let engine = dynamo_llm::mocker::engine::make_mocker_engine(
+                drt,
+                endpoint,
+                Some(local_model.card().slug().to_string()),
+                args,
+            )
+            .await?;

Also run cargo fmt locally to ensure no other drift.

🧹 Nitpick comments (7)
components/metrics/src/bin/mock_worker.rs (1)

149-149: Standardize naming: use kebab-case for component name and model slug for the label

Per prior team preference, use kebab-case in .component() calls. Also, align model label semantics with other call sites that pass card.slug().

Apply this minimal change for consistency:

-        .component("MyComponent", Some("MyModel".to_string()))?
+        .component("my-component", Some("my-model".to_string()))?

If this mock should reflect real model identity, consider wiring the actual card.slug() instead of a hardcoded string.

lib/llm/src/entrypoint/input/endpoint.rs (2)

34-39: Propagate model label via slug: good; fix formatting to address cargo fmt failure

The extraction of model_name via card().slug() looks correct and consistent. The pre-merge check flagged formatting on these lines. Please run cargo fmt.

If you prefer a minimal manual tweak, this formatting matches rustfmt output:

-    let model_name = match &engine_config {
-        EngineConfig::StaticFull { model, .. } | EngineConfig::StaticCore { model, .. } => {
-            Some(model.card().slug().to_string())
-        }
-        EngineConfig::StaticRemote(model) | EngineConfig::Dynamic(model) => Some(model.card().slug().to_string()),
-    };
+    let model_name = match &engine_config {
+        EngineConfig::StaticFull { model, .. } | EngineConfig::StaticCore { model, .. } => {
+            Some(model.card().slug().to_string())
+        }
+        EngineConfig::StaticRemote(model) | EngineConfig::Dynamic(model) => {
+            Some(model.card().slug().to_string())
+        }
+    };

50-104: Optional: factor out model_name extraction to reduce duplication

The card().slug() extraction appears in multiple places across the codebase. Consider a small helper to centralize this and enforce consistency.

Happy to draft a helper if you want.

lib/bindings/python/rust/lib.rs (1)

538-541: Python API stays backward compatible; consider adding text_signature for introspection

The added optional model parameter with a default preserves backward compatibility and correctly forwards to self.inner.component(name, model). To improve Python help/introspection, add a text signature.

 impl Namespace {
-    #[pyo3(signature = (name, model = None))]
+    #[pyo3(signature = (name, model = None))]
+    #[pyo3(text_signature = "(name, model=None)")]
     fn component(&self, name: String, model: Option<String>) -> PyResult<Component> {
         let inner = self.inner.component(name, model).map_err(to_pyerr)?;
         Ok(Component {
             inner,
             event_loop: self.event_loop.clone(),
         })
     }
 }
lib/runtime/src/component.rs (1)

128-131: Clarify whether model participates in identity (Hash/Eq); document or include accordingly.

Component's Hash/Eq implementations ignore model. If two Components have the same namespace/name but different model, they will compare equal and collide in hash-based structures. If this is intentional (model is just a label for metrics), please document that explicitly. If not, include model in Hash/Eq.

Suggested doc tweak to make intent explicit:

-    /// Name of the model
+    /// Optional model name used for metrics labeling only (not part of Component identity/Hash/Eq)
lib/llm/src/mocker/engine.rs (2)

51-52: Optional: consider kebab-case for label values to match broader naming conventions.

"mocker_model" could be "mocker-model" for consistency across labels (component names prefer kebab-case; aligning model labels can aid uniform dashboards). Not required; only if your Prometheus/UX conventions prefer it.


515-522: Break long line and run rustfmt to satisfy CI.

Reformat the constructor call; also run cargo fmt.

-    let annotated_engine =
-        AnnotatedMockEngine::new(MockVllmEngine::new(args), distributed_runtime, endpoint, model_name);
+    let annotated_engine = AnnotatedMockEngine::new(
+        MockVllmEngine::new(args),
+        distributed_runtime,
+        endpoint,
+        model_name,
+    );
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fa4a7f1 and 7ff3c75.

📒 Files selected for processing (18)
  • components/backends/vllm/src/dynamo/vllm/main.py (2 hunks)
  • components/metrics/src/bin/mock_worker.rs (1 hunks)
  • components/metrics/src/lib.rs (1 hunks)
  • components/metrics/src/main.rs (4 hunks)
  • components/router/src/main.rs (1 hunks)
  • launch/dynamo-run/src/lib.rs (1 hunks)
  • lib/bindings/c/src/lib.rs (1 hunks)
  • lib/bindings/python/rust/lib.rs (1 hunks)
  • lib/bindings/python/rust/llm/entrypoint.rs (1 hunks)
  • lib/llm/src/discovery/watcher.rs (1 hunks)
  • lib/llm/src/entrypoint/input/common.rs (1 hunks)
  • lib/llm/src/entrypoint/input/endpoint.rs (1 hunks)
  • lib/llm/src/entrypoint/input/http.rs (1 hunks)
  • lib/llm/src/kv_router/prefill_counter.rs (2 hunks)
  • lib/llm/src/kv_router/sequence.rs (2 hunks)
  • lib/llm/src/mocker/engine.rs (6 hunks)
  • lib/runtime/src/component.rs (5 hunks)
  • lib/runtime/src/metrics.rs (2 hunks)
🧰 Additional context used
🧠 Learnings (10)
📓 Common learnings
Learnt from: ryanolson
PR: ai-dynamo/dynamo#1788
File: components/metrics/src/bin/mock_worker.rs:135-135
Timestamp: 2025-07-07T16:17:26.408Z
Learning: The codebase should use kebab-case for component names in .component() calls for uniformity, as indicated by the user's preference for kebab-case or snake_case uniformity, and kebab-case already has existing usage in the codebase.
📚 Learning: 2025-07-14T21:25:56.930Z
Learnt from: ryanolson
PR: ai-dynamo/dynamo#1919
File: lib/runtime/src/engine.rs:168-168
Timestamp: 2025-07-14T21:25:56.930Z
Learning: The AsyncEngineContextProvider trait in lib/runtime/src/engine.rs was intentionally changed from `Send + Sync + Debug` to `Send + Debug` because the Sync bound was overly constraining. The trait should only require Send + Debug as designed.

Applied to files:

  • components/router/src/main.rs
  • lib/llm/src/entrypoint/input/common.rs
  • lib/bindings/c/src/lib.rs
  • lib/llm/src/kv_router/sequence.rs
  • launch/dynamo-run/src/lib.rs
  • lib/llm/src/entrypoint/input/http.rs
  • lib/bindings/python/rust/llm/entrypoint.rs
  • lib/llm/src/entrypoint/input/endpoint.rs
  • lib/llm/src/mocker/engine.rs
📚 Learning: 2025-06-05T01:04:24.775Z
Learnt from: PeaBrane
PR: ai-dynamo/dynamo#1392
File: launch/dynamo-run/src/subprocess/vllm_v1_inc.py:71-71
Timestamp: 2025-06-05T01:04:24.775Z
Learning: The `create_endpoint` method in `WorkerMetricsPublisher` has backward compatibility maintained through pyo3 signature annotation `#[pyo3(signature = (component, dp_rank = None))]`, making the `dp_rank` parameter optional with a default value of `None`.

Applied to files:

  • components/router/src/main.rs
  • lib/llm/src/entrypoint/input/common.rs
  • components/metrics/src/lib.rs
  • lib/bindings/c/src/lib.rs
  • lib/llm/src/kv_router/prefill_counter.rs
  • components/metrics/src/bin/mock_worker.rs
  • lib/bindings/python/rust/lib.rs
  • lib/llm/src/entrypoint/input/http.rs
📚 Learning: 2025-07-16T12:41:12.543Z
Learnt from: grahamking
PR: ai-dynamo/dynamo#1962
File: lib/runtime/src/component/client.rs:270-273
Timestamp: 2025-07-16T12:41:12.543Z
Learning: In lib/runtime/src/component/client.rs, the current mutex usage in get_or_create_dynamic_instance_source is temporary while evaluating whether the mutex can be dropped entirely. The code currently has a race condition between try_lock and lock().await, but this is acknowledged as an interim state during the performance optimization process.

Applied to files:

  • lib/llm/src/entrypoint/input/common.rs
  • lib/llm/src/kv_router/sequence.rs
  • lib/llm/src/entrypoint/input/http.rs
  • lib/llm/src/mocker/engine.rs
📚 Learning: 2025-05-30T06:38:09.630Z
Learnt from: PeaBrane
PR: ai-dynamo/dynamo#1285
File: lib/llm/src/kv_router/scoring.rs:58-63
Timestamp: 2025-05-30T06:38:09.630Z
Learning: In lib/llm/src/kv_router/scoring.rs, the user prefers to keep the panic behavior when calculating load_avg and variance with empty endpoints rather than adding guards for division by zero. They want the code to fail fast on this error condition.

Applied to files:

  • lib/llm/src/entrypoint/input/common.rs
  • lib/llm/src/kv_router/sequence.rs
  • lib/llm/src/kv_router/prefill_counter.rs
  • lib/llm/src/entrypoint/input/http.rs
📚 Learning: 2025-05-29T00:02:35.018Z
Learnt from: alec-flowers
PR: ai-dynamo/dynamo#1181
File: lib/llm/src/kv_router/publisher.rs:379-425
Timestamp: 2025-05-29T00:02:35.018Z
Learning: In lib/llm/src/kv_router/publisher.rs, the functions `create_stored_blocks` and `create_stored_block_from_parts` are correctly implemented and not problematic duplications of existing functionality elsewhere in the codebase.

Applied to files:

  • lib/bindings/c/src/lib.rs
  • lib/llm/src/kv_router/sequence.rs
  • lib/llm/src/kv_router/prefill_counter.rs
📚 Learning: 2025-06-02T19:37:27.666Z
Learnt from: oandreeva-nv
PR: ai-dynamo/dynamo#1195
File: lib/llm/tests/block_manager.rs:150-152
Timestamp: 2025-06-02T19:37:27.666Z
Learning: In Rust/Tokio applications, when background tasks use channels for communication, dropping the sender automatically signals task termination when the receiver gets `None`. The `start_batching_publisher` function in `lib/llm/tests/block_manager.rs` demonstrates this pattern: when the `KVBMDynamoRuntimeComponent` is dropped, its `batch_tx` sender is dropped, causing `rx.recv()` to return `None`, which triggers cleanup and task termination.

Applied to files:

  • lib/bindings/c/src/lib.rs
📚 Learning: 2025-05-29T06:20:12.901Z
Learnt from: ryanolson
PR: ai-dynamo/dynamo#1093
File: lib/llm/src/block_manager/block/registry.rs:98-122
Timestamp: 2025-05-29T06:20:12.901Z
Learning: In lib/llm/src/block_manager/block/registry.rs, the background task spawned for handling unregister notifications uses detached concurrency by design. The JoinHandle is intentionally not stored as this represents a reasonable architectural tradeoff for a long-running cleanup task.

Applied to files:

  • lib/bindings/c/src/lib.rs
  • lib/llm/src/kv_router/sequence.rs
📚 Learning: 2025-06-05T01:02:15.318Z
Learnt from: PeaBrane
PR: ai-dynamo/dynamo#1392
File: lib/llm/src/kv_router/scoring.rs:35-46
Timestamp: 2025-06-05T01:02:15.318Z
Learning: In lib/llm/src/kv_router/scoring.rs, PeaBrane prefers panic-based early failure over Result-based error handling for the worker_id() method to catch invalid data early during development.

Applied to files:

  • lib/llm/src/kv_router/sequence.rs
  • lib/llm/src/kv_router/prefill_counter.rs
📚 Learning: 2025-06-08T03:12:03.985Z
Learnt from: jthomson04
PR: ai-dynamo/dynamo#1429
File: lib/runtime/src/utils/leader_worker_barrier.rs:69-72
Timestamp: 2025-06-08T03:12:03.985Z
Learning: In the leader-worker barrier implementation in lib/runtime/src/utils/leader_worker_barrier.rs, the `wait_for_key_count` function correctly uses exact equality (`==`) instead of greater-than-or-equal (`>=`) because worker IDs must be unique (enforced by etcd create-only operations), ensuring exactly the expected number of workers can register.

Applied to files:

  • lib/llm/src/kv_router/prefill_counter.rs
🪛 GitHub Actions: Pre Merge Validation of (ai-dynamo/dynamo/refs/pull/2389/merge) by tzulingk.
lib/bindings/c/src/lib.rs

[error] 147-147: pre-commit hook 'trailing-whitespace' failed. Trailing whitespace was fixed in this file. Run 'pre-commit run --all-files' locally to reproduce.

🪛 GitHub Actions: Rust pre-merge checks
lib/bindings/c/src/lib.rs

[error] 147-148: cargo fmt check failed due to formatting differences. Please run 'cargo fmt' to fix code style issues.

launch/dynamo-run/src/lib.rs

[error] 137-143: cargo fmt check failed due to formatting differences. Please run 'cargo fmt' to fix code style issues.

lib/llm/src/entrypoint/input/endpoint.rs

[error] 35-39: cargo fmt check failed due to formatting differences. Please run 'cargo fmt' to fix code style issues.

components/metrics/src/main.rs

[error] 138-138: cargo fmt check failed due to formatting differences. Please run 'cargo fmt' to fix code style issues.

lib/llm/src/mocker/engine.rs

[error] 457-462: cargo fmt check failed due to formatting differences. Please run 'cargo fmt' to fix code style issues.


[error] 517-525: cargo fmt check failed due to formatting differences. Please run 'cargo fmt' to fix code style issues.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build and Test - dynamo
🔇 Additional comments (21)
components/router/src/main.rs (1)

63-68: Correct: explicitly pass model None for a model-agnostic component

This matches the new API and the component’s model-agnostic nature. LGTM.

lib/llm/src/kv_router/sequence.rs (2)

801-806: Tests updated to new API: passing None is appropriate

The router tests aren’t tied to a model; this change is correct.


915-920: Ditto for second test site

Consistent with the new signature; looks good.

lib/llm/src/entrypoint/input/common.rs (1)

109-114: Good: scope the component by model slug

Using Some(card.slug().to_string()) threads model context through runtime resolution and enables model-labeled metrics. Ensure slug remains stable to avoid metric cardinality churn on renames.

lib/llm/src/kv_router/prefill_counter.rs (2)

369-372: LGTM: explicit None for model name clarifies intent

Explicitly passing None matches the new signature and documents the counters component is model-agnostic.


435-437: LGTM: consistent model-agnostic component creation

The second thread mirrors the first with .component("counters", None), keeping the test symmetric.

lib/llm/src/entrypoint/input/endpoint.rs (1)

41-44: Remaining single-arg .component(...) calls must be updated to the new two-argument signature

Our scan uncovered the following call sites still using the old single-arg form. Please update each to pass a second argument (e.g. None or a model slug):

• lib/llm/src/http/service/clear_kv_blocks.rs:112
• lib/llm/src/entrypoint/input/endpoint.rs:206
• lib/runtime/tests/namespace_etcd_path.rs:59, 101, 102
• lib/runtime/tests/soak.rs:133, 157
• lib/runtime/src/component/service.rs:108
• lib/runtime/src/metrics.rs:671, 873
• lib/runtime/src/component/component.rs:93
• lib/runtime/examples/system_metrics/tests/integration_test.rs:71
• lib/runtime/examples/system_metrics/src/lib.rs:92
• lib/runtime/examples/system_metrics/src/bin/system_client.rs:34
• lib/runtime/examples/hello_world/src/bin/server.rs:71
• lib/runtime/examples/hello_world/src/bin/client.rs:33
• lib/runtime/examples/service_metrics/src/bin/service_server.rs:73
• lib/runtime/examples/service_metrics/src/bin/service_client.rs:34

Example fix:

- namespace.component("backend")?
+ namespace.component("backend", None)?
⛔ Skipped due to learnings
Learnt from: ryanolson
PR: ai-dynamo/dynamo#1788
File: components/metrics/src/bin/mock_worker.rs:135-135
Timestamp: 2025-07-07T16:17:26.408Z
Learning: The codebase should use kebab-case for component names in .component() calls for uniformity, as indicated by the user's preference for kebab-case or snake_case uniformity, and kebab-case already has existing usage in the codebase.
lib/llm/src/entrypoint/input/http.rs (1)

66-71: LGTM: passes slug as model label consistently

Using card.slug() here aligns with other call sites and keeps metrics labeling consistent.

lib/llm/src/discovery/watcher.rs (1)

172-173: Correctly propagates model context into component resolution

Passing Some(model_entry.name.clone()) into namespace.component(...) aligns with the new API and PR goal. Looks good.

components/backends/vllm/src/dynamo/vllm/main.py (1)

135-137: Correctly passes model into component() calls

Both call sites now pass config.model, matching the updated binding/runtime API. The unchanged component("prefill") call remains valid due to the default model=None.

Given team preference for kebab-case component names, please confirm config.component conforms to kebab-case across CLI/config inputs.

Also applies to: 170-172

lib/runtime/src/metrics.rs (1)

397-401: Trait extension looks good; ensure implementors wire model through

Adding fn model(&self) -> Option<String> with a default None is clean and backward compatible.

Please confirm Component/Endpoint registries override model() appropriately (e.g., Component returns its model; Endpoint forwards its parent’s) so the label appears where intended per DIS-360.

lib/runtime/src/component.rs (4)

183-186: MetricsRegistry::model forwarding looks correct.

Returns a cloned Option as expected; aligns with metrics auto-labeling.


226-229: Getter mirrors stored model as intended.

No concerns.


337-340: Endpoint correctly forwards model() to its Component.

This ensures endpoint-level metrics inherit the component model label.


505-513: Fix doc comment grammar

-    /// Create a [`Component`] in the namespace who's endpoints can be discovered with etcd
+    /// Create a [`Component`] in the namespace whose endpoints can be discovered with etcd

I’ve verified that this PR doesn’t introduce any new .component() call sites using snake_case literals—no further changes are needed here.

components/metrics/src/main.rs (3)

128-131: Explicitly passing None for "count" is appropriate.

Aggregator component is model-agnostic; this call matches intent and keeps semantics clear.


141-142: Target component resolution now forwards the model; LGTM.

Passing config.model_name.clone() maintains flexibility across model/non-model targets.


63-66: Fix formatting: cargo fmt is failing in this file.

CI reports rustfmt differences. After applying the above diffs, please run:

  • cargo fmt

This will resolve the style check failure.

Also applies to: 116-117, 128-131, 141-142

lib/llm/src/mocker/engine.rs (3)

445-446: Constructor gains model_name parameter — interface change looks consistent.

New parameter cleanly threads model through to component resolution.


460-464: Passing model_name into namespace.component is correct.

Closure safely clones the Option<String> per retry. All good.


557-558: Tests updated to pass model through component lookup — LGTM.

Using MOCKER_MODEL ensures coverage for model-aware component resolution.

Also applies to: 626-629

Copy link
Contributor

@keivenchang keivenchang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making the idea of labels applicable for other use cases, while preserving the idea that Components are flexible (e.g. they can be serving models, or not serving models at all).

A quick question for you @ryanolson. Do you think that labels should be inherited by the children? For example, if a Component has a label called {model="llama-3.4.5"}, should Endpoint(s) also have such a label? Intuitively, I think that would make sense, but I'm wondering if you think otherwise.

@keivenchang
Copy link
Contributor

+1 LGTM from me. Regarding the topic of children inheriting a parent's label-- that can be done on a separate PR/Linear.

@tzulingk
Copy link
Contributor Author

tzulingk commented Aug 13, 2025

@grahamking please take a look at my newest commit.

per discussion in https://nvidia.slack.com/archives/C08UQCG3RNV/p1755105938302299?thread_ts=1755095101.510609&cid=C08UQCG3RNV
Letting each layer has its own label and we return the labels from different hierarchies when calling stored_labels()?
That is if you call stored_labels()

  • Namespace::stored_labels(): returns Namespace labels.
  • Component::stored_labels(): returns Namespace + Component labels.
  • Endpoint::stored_labels(): returns Namespace + Component + Endpoint labels.

With this we would have the debugability that you mentioned, at the same time address @nnshah1's comment on the consistency.

@tzulingk tzulingk merged commit 3a3f5bf into main Aug 14, 2025
10 checks passed
@tzulingk tzulingk deleted the tzulingk/model_label_dis_360_clean branch August 14, 2025 18:48
let component = distributed_runtime
.namespace(&endpoint_id.namespace)?
.component(&endpoint_id.component)?;
.component(&endpoint_id.component)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question:

@grahamking is this for the in process engines?

@keivenchang do we have these hooked up into the frontend metrics?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed this in #2292

.namespace(&endpoint_id.namespace)?
.component(&endpoint_id.component)?;
.component(&endpoint_id.component)
.and_then(|c| c.add_labels(&[("model", card.slug().to_string().as_str())]))?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@keivenchang , @grahamking , @tzulingk - similar question here - is this relevant on a client?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed this in #2292

.namespace(&endpoint_id.namespace)?
.component(&endpoint_id.component)?;
.component(&endpoint_id.component)
.and_then(|c| c.add_labels(&[("model", &model_entry.name)]))?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same question about client here- will take offline.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed this in: #2292

@keivenchang
Copy link
Contributor

keivenchang commented Aug 15, 2025

Sorry about the confusions I caused, I'll try to explain the situation better:

  1. The worker code (e.g. "backend") in lib/runtime/src/* are fine, these are what generates the labels (for the model)
  2. As for the frontend lib/llm/src/* code, since the FE is already generating the model label, and there are some lingering questions about having the model concept in Components, I decided to remove these in feat(metrics): add NATS client metrics to prometheus_metrics_fmt #2292 yesterday.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants