feat: Add a "model" label to Component metrics #2389

tzulingk · 2025-08-09T05:13:42Z

Overview:

Add a "model" label to Component metrics.

Details:

This pull request introduces model-specific metrics by adding a model field to the Component struct and updating the metrics labeling logic.

Changes

Component struct: The Component struct has been updated to include an optional model field.
create_metrics() function: This function now checks if the metrics registry contains a model name. If present, the model name is added as a label to the generated metrics.
Related code: All affected code paths have been modified to handle the new model field.

Vllm metrics before this PR

# HELP dynamo_component_response_bytes_total Total number of bytes sent in responses by work handler
# TYPE dynamo_component_response_bytes_total counter
dynamo_component_response_bytes_total{dynamo_component="backend",dynamo_endpoint="clear_kv_blocks",dynamo_namespace="dynamo"} 0
dynamo_component_response_bytes_total{dynamo_component="backend",dynamo_endpoint="generate",dynamo_namespace="dynamo"} 0
dynamo_component_response_bytes_total{dynamo_component="backend",dynamo_endpoint="load_metrics",dynamo_namespace="dynamo"} 0

Vllm metrics after this PR
Note that we have model="Qwen/Qwen3-0.6B" label now.

# HELP dynamo_component_response_bytes_total Total number of bytes sent in responses by work handler
# TYPE dynamo_component_response_bytes_total counter
dynamo_component_response_bytes_total{dynamo_component="backend",dynamo_endpoint="clear_kv_blocks",dynamo_namespace="dynamo",model="Qwen/Qwen3-0.6B"} 0
dynamo_component_response_bytes_total{dynamo_component="backend",dynamo_endpoint="generate",dynamo_namespace="dynamo",model="Qwen/Qwen3-0.6B"} 0
dynamo_component_response_bytes_total{dynamo_component="backend",dynamo_endpoint="load_metrics",dynamo_namespace="dynamo",model="Qwen/Qwen3-0.6B"} 0

Where should the reviewer start?

lib/runtime/src/component.rs: model is added to Component/Endpoint.
lib/runtime/src/metrics.rs: how model label is added to the metrics.

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

DIS-360 Add a "model" label to Component metrics

Summary by CodeRabbit

New Features
- Added support for specifying an optional model name when selecting or configuring components across the platform.
- Model names are now included as labels in metrics for enhanced observability.
Bug Fixes
- Ensured components not associated with a model are explicitly handled as such.
Documentation
- Added clarifying comments regarding model association for components in tests and code.
Refactor
- Updated configuration and initialization flows to propagate model name information where applicable.
- Adjusted method and struct signatures to support optional model name parameters.

…rvice_name has been removed in PR#2349

copy-pr-bot · 2025-08-09T05:13:47Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

coderabbitai · 2025-08-09T05:21:16Z

Walkthrough

This change set standardizes the handling of model names across multiple components by updating the component method signatures and their invocations to accept an optional model name parameter. Structs, configuration objects, and trait implementations are adjusted accordingly, and the model name is now propagated through component creation, metrics labeling, and engine instantiation across the codebase.

Changes

Cohort / File(s)	Change Summary
Component Method Model Parameter `lib/runtime/src/component.rs`, `lib/bindings/python/rust/lib.rs`	The `component` method in `Namespace` is updated to accept an optional model name parameter. The `Component` struct and related builder, getter, and metrics methods are updated to store and expose this model name. Python bindings are updated to match the new method signature.
Metrics and Auto-labeling `lib/runtime/src/metrics.rs`	The `MetricsRegistry` trait gains an optional `model()` method. The `create_metric` function now auto-labels metrics with the model name if present, after sanitizing it.
Component Usage Updates `components/backends/vllm/src/dynamo/vllm/main.py`, `components/metrics/src/bin/mock_worker.rs`, `components/metrics/src/main.rs`, `components/router/src/main.rs`, `lib/bindings/c/src/lib.rs`, `lib/llm/src/discovery/watcher.rs`, `lib/llm/src/entrypoint/input/common.rs`, `lib/llm/src/entrypoint/input/endpoint.rs`, `lib/llm/src/entrypoint/input/http.rs`	All invocations of the `component` method are updated to supply the new model name argument, either as `Some(model_name)` or `None`, depending on context. Comments are added where components are intentionally model-agnostic.
Configuration and CLI `components/metrics/src/lib.rs`, `components/metrics/src/main.rs`	The metrics worker and CLI argument structs are extended to include a model name field, which is propagated through configuration and used when resolving components.
Mocker Engine Model Handling `lib/llm/src/mocker/engine.rs`, `launch/dynamo-run/src/lib.rs`, `lib/bindings/python/rust/llm/entrypoint.rs`	The mocker engine and related functions are updated to accept and propagate an optional model name. Integration tests and engine instantiation pass the model name as needed.
Integration Test Updates `lib/llm/src/kv_router/prefill_counter.rs`, `lib/llm/src/kv_router/sequence.rs`	Integration tests are updated to explicitly pass `None` for model name in component creation where appropriate, with clarifying comments added.

Sequence Diagram(s)

sequenceDiagram
    participant CLI/User
    participant Config
    participant Namespace
    participant Component
    participant Metrics

    CLI/User->>Config: Provide component and model name
    Config->>Namespace: component(name, Some(model_name))
    Namespace->>Component: Create with name, model_name
    Component-->>Namespace: Return Component instance
    Namespace-->>Config: Return Component
    Component->>Metrics: model()
    Metrics->>Metrics: Auto-label with model if present

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐇
A model hops from place to place,
Now every component knows its face.
With names passed down the code so neat,
Metrics and engines skip a beat.
From CLI to test, the change is clear—
Components and models, always near!
Hoppity hop, the code’s in gear!

Note

🔌 MCP (Model Context Protocol) integration is now available in Early Access!

Pro users can now connect to remote MCP servers under the Integrations page to get reviews and chat conversations that understand additional development context.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Explain this complex logic.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai explain this code block.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and explain its main purpose.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 6

🔭 Outside diff range comments (4)

components/metrics/src/lib.rs (3)

350-399: Add "model" to all metric label schemas (gauge/counter vecs)

To expose the model dimension, extend the label arrays by one position. This change is breaking (cardinality increase) and will create new time series; plan dashboards/alerts accordingly.

-                &["component", "endpoint", "worker_id"]
+                &["component", "endpoint", "worker_id", "model"]
@@
-                &["component", "endpoint", "worker_id"]
+                &["component", "endpoint", "worker_id", "model"]
@@
-                &["component", "endpoint", "worker_id"]
+                &["component", "endpoint", "worker_id", "model"]
@@
-                &["component", "endpoint", "worker_id"]
+                &["component", "endpoint", "worker_id", "model"]
@@
-                &["component", "endpoint"]
+                &["component", "endpoint", "model"]
@@
-                &["component", "endpoint"]
+                &["component", "endpoint", "model"]
@@
-                &["component", "endpoint", "worker_id"]
+                &["component", "endpoint", "worker_id", "model"]
@@
-                &["component", "endpoint", "worker_id"]
+                &["component", "endpoint", "worker_id", "model"]
@@
-                &["component", "endpoint", "worker_id"]
+                &["component", "endpoint", "worker_id", "model"]

403-413: Pass the model label value at call sites

Use config.model_name when setting or incrementing metrics. Default to "unknown" when None.

 fn set_worker_gauge(
@@
-        gauge
-            .with_label_values(&[&config.component_name, &config.endpoint_name, worker_id])
+        let model = config.model_name.as_deref().unwrap_or("unknown");
+        gauge
+            .with_label_values(&[
+                &config.component_name,
+                &config.endpoint_name,
+                worker_id,
+                model,
+            ])
             .set(value);
@@
 fn increment_worker_counter(
@@
-        counter
-            .with_label_values(&[&config.component_name, &config.endpoint_name, worker_id])
+        let model = config.model_name.as_deref().unwrap_or("unknown");
+        counter
+            .with_label_values(&[
+                &config.component_name,
+                &config.endpoint_name,
+                worker_id,
+                model,
+            ])
             .inc_by(value);
@@
 fn set_endpoint_gauge(
@@
-        gauge
-            .with_label_values(&[&config.component_name, &config.endpoint_name])
+        let model = config.model_name.as_deref().unwrap_or("unknown");
+        gauge
+            .with_label_values(&[&config.component_name, &config.endpoint_name, model])
             .set(value);

Also applies to: 416-426, 428-438

516-536: Include "model" when reading cumulative counters

CounterVec::with_label_values must match the updated label schema.

-        let cumulative_isl = self
+        let model = config.model_name.as_deref().unwrap_or("unknown");
+        let cumulative_isl = self
             .kv_hit_rate_isl_blocks
             .with_label_values(&[
                 &config.component_name,
                 &config.endpoint_name,
                 &worker_id_str,
+                model,
             ])
             .get();
@@
-        let cumulative_overlap = self
+        let cumulative_overlap = self
             .kv_hit_rate_overlap_blocks
             .with_label_values(&[
                 &config.component_name,
                 &config.endpoint_name,
                 &worker_id_str,
+                model,
             ])
             .get();

Also applies to: 525-533

launch/dynamo-run/src/lib.rs (1)

137-143: Fix rustfmt failure by wrapping long call

CI reports cargo fmt differences (Lines 137-143). The long make_mocker_engine call should be line-broken per rustfmt.

-            let engine =
-                dynamo_llm::mocker::engine::make_mocker_engine(drt, endpoint, Some(local_model.card().slug().to_string()), args).await?;
+            let engine = dynamo_llm::mocker::engine::make_mocker_engine(
+                drt,
+                endpoint,
+                Some(local_model.card().slug().to_string()),
+                args,
+            )
+            .await?;

Also run cargo fmt locally to ensure no other drift.

🧹 Nitpick comments (7)

components/metrics/src/bin/mock_worker.rs (1)
149-149: Standardize naming: use kebab-case for component name and model slug for the label

Per prior team preference, use kebab-case in .component() calls. Also, align model label semantics with other call sites that pass card.slug().

Apply this minimal change for consistency:
-        .component("MyComponent", Some("MyModel".to_string()))?
+        .component("my-component", Some("my-model".to_string()))?
If this mock should reflect real model identity, consider wiring the actual card.slug() instead of a hardcoded string.
lib/llm/src/entrypoint/input/endpoint.rs (2)
34-39: Propagate model label via slug: good; fix formatting to address cargo fmt failure

The extraction of model_name via card().slug() looks correct and consistent. The pre-merge check flagged formatting on these lines. Please run cargo fmt.

If you prefer a minimal manual tweak, this formatting matches rustfmt output:
-    let model_name = match &engine_config {
-        EngineConfig::StaticFull { model, .. } | EngineConfig::StaticCore { model, .. } => {
-            Some(model.card().slug().to_string())
-        }
-        EngineConfig::StaticRemote(model) | EngineConfig::Dynamic(model) => Some(model.card().slug().to_string()),
-    };
+    let model_name = match &engine_config {
+        EngineConfig::StaticFull { model, .. } | EngineConfig::StaticCore { model, .. } => {
+            Some(model.card().slug().to_string())
+        }
+        EngineConfig::StaticRemote(model) | EngineConfig::Dynamic(model) => {
+            Some(model.card().slug().to_string())
+        }
+    };
50-104: Optional: factor out model_name extraction to reduce duplication

The card().slug() extraction appears in multiple places across the codebase. Consider a small helper to centralize this and enforce consistency.

Happy to draft a helper if you want.
lib/bindings/python/rust/lib.rs (1)
538-541: Python API stays backward compatible; consider adding text_signature for introspection

The added optional model parameter with a default preserves backward compatibility and correctly forwards to self.inner.component(name, model). To improve Python help/introspection, add a text signature.
 impl Namespace {
-    #[pyo3(signature = (name, model = None))]
+    #[pyo3(signature = (name, model = None))]
+    #[pyo3(text_signature = "(name, model=None)")]
     fn component(&self, name: String, model: Option<String>) -> PyResult<Component> {
         let inner = self.inner.component(name, model).map_err(to_pyerr)?;
         Ok(Component {
             inner,
             event_loop: self.event_loop.clone(),
         })
     }
 }
lib/runtime/src/component.rs (1)
128-131: Clarify whether model participates in identity (Hash/Eq); document or include accordingly.

Component's Hash/Eq implementations ignore model. If two Components have the same namespace/name but different model, they will compare equal and collide in hash-based structures. If this is intentional (model is just a label for metrics), please document that explicitly. If not, include model in Hash/Eq.

Suggested doc tweak to make intent explicit:
-    /// Name of the model
+    /// Optional model name used for metrics labeling only (not part of Component identity/Hash/Eq)
lib/llm/src/mocker/engine.rs (2)
51-52: Optional: consider kebab-case for label values to match broader naming conventions.

"mocker_model" could be "mocker-model" for consistency across labels (component names prefer kebab-case; aligning model labels can aid uniform dashboards). Not required; only if your Prometheus/UX conventions prefer it.

515-522: Break long line and run rustfmt to satisfy CI.

Reformat the constructor call; also run cargo fmt.
-    let annotated_engine =
-        AnnotatedMockEngine::new(MockVllmEngine::new(args), distributed_runtime, endpoint, model_name);
+    let annotated_engine = AnnotatedMockEngine::new(
+        MockVllmEngine::new(args),
+        distributed_runtime,
+        endpoint,
+        model_name,
+    );

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fa4a7f1 and 7ff3c75.

📒 Files selected for processing (18)

components/backends/vllm/src/dynamo/vllm/main.py (2 hunks)
components/metrics/src/bin/mock_worker.rs (1 hunks)
components/metrics/src/lib.rs (1 hunks)
components/metrics/src/main.rs (4 hunks)
components/router/src/main.rs (1 hunks)
launch/dynamo-run/src/lib.rs (1 hunks)
lib/bindings/c/src/lib.rs (1 hunks)
lib/bindings/python/rust/lib.rs (1 hunks)
lib/bindings/python/rust/llm/entrypoint.rs (1 hunks)
lib/llm/src/discovery/watcher.rs (1 hunks)
lib/llm/src/entrypoint/input/common.rs (1 hunks)
lib/llm/src/entrypoint/input/endpoint.rs (1 hunks)
lib/llm/src/entrypoint/input/http.rs (1 hunks)
lib/llm/src/kv_router/prefill_counter.rs (2 hunks)
lib/llm/src/kv_router/sequence.rs (2 hunks)
lib/llm/src/mocker/engine.rs (6 hunks)
lib/runtime/src/component.rs (5 hunks)
lib/runtime/src/metrics.rs (2 hunks)

🧰 Additional context used

🧠 Learnings (10)

📓 Common learnings

Learnt from: ryanolson
PR: ai-dynamo/dynamo#1788
File: components/metrics/src/bin/mock_worker.rs:135-135
Timestamp: 2025-07-07T16:17:26.408Z
Learning: The codebase should use kebab-case for component names in .component() calls for uniformity, as indicated by the user's preference for kebab-case or snake_case uniformity, and kebab-case already has existing usage in the codebase.

📚 Learning: 2025-07-14T21:25:56.930Z

Learnt from: ryanolson
PR: ai-dynamo/dynamo#1919
File: lib/runtime/src/engine.rs:168-168
Timestamp: 2025-07-14T21:25:56.930Z
Learning: The AsyncEngineContextProvider trait in lib/runtime/src/engine.rs was intentionally changed from `Send + Sync + Debug` to `Send + Debug` because the Sync bound was overly constraining. The trait should only require Send + Debug as designed.

Applied to files:

components/router/src/main.rs
lib/llm/src/entrypoint/input/common.rs
lib/bindings/c/src/lib.rs
lib/llm/src/kv_router/sequence.rs
launch/dynamo-run/src/lib.rs
lib/llm/src/entrypoint/input/http.rs
lib/bindings/python/rust/llm/entrypoint.rs
lib/llm/src/entrypoint/input/endpoint.rs
lib/llm/src/mocker/engine.rs

📚 Learning: 2025-06-05T01:04:24.775Z

Learnt from: PeaBrane
PR: ai-dynamo/dynamo#1392
File: launch/dynamo-run/src/subprocess/vllm_v1_inc.py:71-71
Timestamp: 2025-06-05T01:04:24.775Z
Learning: The `create_endpoint` method in `WorkerMetricsPublisher` has backward compatibility maintained through pyo3 signature annotation `#[pyo3(signature = (component, dp_rank = None))]`, making the `dp_rank` parameter optional with a default value of `None`.

Applied to files:

components/router/src/main.rs
lib/llm/src/entrypoint/input/common.rs
components/metrics/src/lib.rs
lib/bindings/c/src/lib.rs
lib/llm/src/kv_router/prefill_counter.rs
components/metrics/src/bin/mock_worker.rs
lib/bindings/python/rust/lib.rs
lib/llm/src/entrypoint/input/http.rs

📚 Learning: 2025-07-16T12:41:12.543Z

Learnt from: grahamking
PR: ai-dynamo/dynamo#1962
File: lib/runtime/src/component/client.rs:270-273
Timestamp: 2025-07-16T12:41:12.543Z
Learning: In lib/runtime/src/component/client.rs, the current mutex usage in get_or_create_dynamic_instance_source is temporary while evaluating whether the mutex can be dropped entirely. The code currently has a race condition between try_lock and lock().await, but this is acknowledged as an interim state during the performance optimization process.

Applied to files:

lib/llm/src/entrypoint/input/common.rs
lib/llm/src/kv_router/sequence.rs
lib/llm/src/entrypoint/input/http.rs
lib/llm/src/mocker/engine.rs

📚 Learning: 2025-05-30T06:38:09.630Z

Learnt from: PeaBrane
PR: ai-dynamo/dynamo#1285
File: lib/llm/src/kv_router/scoring.rs:58-63
Timestamp: 2025-05-30T06:38:09.630Z
Learning: In lib/llm/src/kv_router/scoring.rs, the user prefers to keep the panic behavior when calculating load_avg and variance with empty endpoints rather than adding guards for division by zero. They want the code to fail fast on this error condition.

Applied to files:

lib/llm/src/entrypoint/input/common.rs
lib/llm/src/kv_router/sequence.rs
lib/llm/src/kv_router/prefill_counter.rs
lib/llm/src/entrypoint/input/http.rs

📚 Learning: 2025-05-29T00:02:35.018Z

Learnt from: alec-flowers
PR: ai-dynamo/dynamo#1181
File: lib/llm/src/kv_router/publisher.rs:379-425
Timestamp: 2025-05-29T00:02:35.018Z
Learning: In lib/llm/src/kv_router/publisher.rs, the functions `create_stored_blocks` and `create_stored_block_from_parts` are correctly implemented and not problematic duplications of existing functionality elsewhere in the codebase.

Applied to files:

lib/bindings/c/src/lib.rs
lib/llm/src/kv_router/sequence.rs
lib/llm/src/kv_router/prefill_counter.rs

📚 Learning: 2025-06-02T19:37:27.666Z

Learnt from: oandreeva-nv
PR: ai-dynamo/dynamo#1195
File: lib/llm/tests/block_manager.rs:150-152
Timestamp: 2025-06-02T19:37:27.666Z
Learning: In Rust/Tokio applications, when background tasks use channels for communication, dropping the sender automatically signals task termination when the receiver gets `None`. The `start_batching_publisher` function in `lib/llm/tests/block_manager.rs` demonstrates this pattern: when the `KVBMDynamoRuntimeComponent` is dropped, its `batch_tx` sender is dropped, causing `rx.recv()` to return `None`, which triggers cleanup and task termination.

Applied to files:

lib/bindings/c/src/lib.rs

📚 Learning: 2025-05-29T06:20:12.901Z

Learnt from: ryanolson
PR: ai-dynamo/dynamo#1093
File: lib/llm/src/block_manager/block/registry.rs:98-122
Timestamp: 2025-05-29T06:20:12.901Z
Learning: In lib/llm/src/block_manager/block/registry.rs, the background task spawned for handling unregister notifications uses detached concurrency by design. The JoinHandle is intentionally not stored as this represents a reasonable architectural tradeoff for a long-running cleanup task.

Applied to files:

lib/bindings/c/src/lib.rs
lib/llm/src/kv_router/sequence.rs

📚 Learning: 2025-06-05T01:02:15.318Z

Learnt from: PeaBrane
PR: ai-dynamo/dynamo#1392
File: lib/llm/src/kv_router/scoring.rs:35-46
Timestamp: 2025-06-05T01:02:15.318Z
Learning: In lib/llm/src/kv_router/scoring.rs, PeaBrane prefers panic-based early failure over Result-based error handling for the worker_id() method to catch invalid data early during development.

Applied to files:

lib/llm/src/kv_router/sequence.rs
lib/llm/src/kv_router/prefill_counter.rs

📚 Learning: 2025-06-08T03:12:03.985Z

Learnt from: jthomson04
PR: ai-dynamo/dynamo#1429
File: lib/runtime/src/utils/leader_worker_barrier.rs:69-72
Timestamp: 2025-06-08T03:12:03.985Z
Learning: In the leader-worker barrier implementation in lib/runtime/src/utils/leader_worker_barrier.rs, the `wait_for_key_count` function correctly uses exact equality (`==`) instead of greater-than-or-equal (`>=`) because worker IDs must be unique (enforced by etcd create-only operations), ensuring exactly the expected number of workers can register.

Applied to files:

lib/llm/src/kv_router/prefill_counter.rs

🪛 GitHub Actions: Pre Merge Validation of (ai-dynamo/dynamo/refs/pull/2389/merge) by tzulingk.

lib/bindings/c/src/lib.rs

[error] 147-147: pre-commit hook 'trailing-whitespace' failed. Trailing whitespace was fixed in this file. Run 'pre-commit run --all-files' locally to reproduce.

🪛 GitHub Actions: Rust pre-merge checks

lib/bindings/c/src/lib.rs

[error] 147-148: cargo fmt check failed due to formatting differences. Please run 'cargo fmt' to fix code style issues.

launch/dynamo-run/src/lib.rs

[error] 137-143: cargo fmt check failed due to formatting differences. Please run 'cargo fmt' to fix code style issues.

lib/llm/src/entrypoint/input/endpoint.rs

[error] 35-39: cargo fmt check failed due to formatting differences. Please run 'cargo fmt' to fix code style issues.

components/metrics/src/main.rs

[error] 138-138: cargo fmt check failed due to formatting differences. Please run 'cargo fmt' to fix code style issues.

lib/llm/src/mocker/engine.rs

[error] 457-462: cargo fmt check failed due to formatting differences. Please run 'cargo fmt' to fix code style issues.

[error] 517-525: cargo fmt check failed due to formatting differences. Please run 'cargo fmt' to fix code style issues.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Build and Test - dynamo

🔇 Additional comments (21)

components/router/src/main.rs (1)

63-68: Correct: explicitly pass model None for a model-agnostic component

This matches the new API and the component’s model-agnostic nature. LGTM.

lib/llm/src/kv_router/sequence.rs (2)

801-806: Tests updated to new API: passing None is appropriate

The router tests aren’t tied to a model; this change is correct.

915-920: Ditto for second test site

Consistent with the new signature; looks good.

lib/llm/src/entrypoint/input/common.rs (1)

109-114: Good: scope the component by model slug

Using Some(card.slug().to_string()) threads model context through runtime resolution and enables model-labeled metrics. Ensure slug remains stable to avoid metric cardinality churn on renames.

lib/llm/src/kv_router/prefill_counter.rs (2)

369-372: LGTM: explicit None for model name clarifies intent

Explicitly passing None matches the new signature and documents the counters component is model-agnostic.

435-437: LGTM: consistent model-agnostic component creation

The second thread mirrors the first with .component("counters", None), keeping the test symmetric.
lib/llm/src/entrypoint/input/endpoint.rs (1)

41-44: Remaining single-arg .component(...) calls must be updated to the new two-argument signature

Our scan uncovered the following call sites still using the old single-arg form. Please update each to pass a second argument (e.g. None or a model slug):

• lib/llm/src/http/service/clear_kv_blocks.rs:112
• lib/llm/src/entrypoint/input/endpoint.rs:206
• lib/runtime/tests/namespace_etcd_path.rs:59, 101, 102
• lib/runtime/tests/soak.rs:133, 157
• lib/runtime/src/component/service.rs:108
• lib/runtime/src/metrics.rs:671, 873
• lib/runtime/src/component/component.rs:93
• lib/runtime/examples/system_metrics/tests/integration_test.rs:71
• lib/runtime/examples/system_metrics/src/lib.rs:92
• lib/runtime/examples/system_metrics/src/bin/system_client.rs:34
• lib/runtime/examples/hello_world/src/bin/server.rs:71
• lib/runtime/examples/hello_world/src/bin/client.rs:33
• lib/runtime/examples/service_metrics/src/bin/service_server.rs:73
• lib/runtime/examples/service_metrics/src/bin/service_client.rs:34

Example fix:
- namespace.component("backend")?
+ namespace.component("backend", None)?
⛔ Skipped due to learnings
Learnt from: ryanolson
PR: ai-dynamo/dynamo#1788
File: components/metrics/src/bin/mock_worker.rs:135-135
Timestamp: 2025-07-07T16:17:26.408Z
Learning: The codebase should use kebab-case for component names in .component() calls for uniformity, as indicated by the user's preference for kebab-case or snake_case uniformity, and kebab-case already has existing usage in the codebase.
lib/llm/src/entrypoint/input/http.rs (1)

66-71: LGTM: passes slug as model label consistently

Using card.slug() here aligns with other call sites and keeps metrics labeling consistent.

lib/llm/src/discovery/watcher.rs (1)

172-173: Correctly propagates model context into component resolution

Passing Some(model_entry.name.clone()) into namespace.component(...) aligns with the new API and PR goal. Looks good.

components/backends/vllm/src/dynamo/vllm/main.py (1)

135-137: Correctly passes model into component() calls

Both call sites now pass config.model, matching the updated binding/runtime API. The unchanged component("prefill") call remains valid due to the default model=None.

Given team preference for kebab-case component names, please confirm config.component conforms to kebab-case across CLI/config inputs.

Also applies to: 170-172

lib/runtime/src/metrics.rs (1)

397-401: Trait extension looks good; ensure implementors wire model through

Adding fn model(&self) -> Option<String> with a default None is clean and backward compatible.

Please confirm Component/Endpoint registries override model() appropriately (e.g., Component returns its model; Endpoint forwards its parent’s) so the label appears where intended per DIS-360.
lib/runtime/src/component.rs (4)

183-186: MetricsRegistry::model forwarding looks correct.

Returns a cloned Option as expected; aligns with metrics auto-labeling.

226-229: Getter mirrors stored model as intended.

No concerns.

337-340: Endpoint correctly forwards model() to its Component.

This ensures endpoint-level metrics inherit the component model label.

505-513: Fix doc comment grammar
-    /// Create a [`Component`] in the namespace who's endpoints can be discovered with etcd
+    /// Create a [`Component`] in the namespace whose endpoints can be discovered with etcd
I’ve verified that this PR doesn’t introduce any new .component() call sites using snake_case literals—no further changes are needed here.
components/metrics/src/main.rs (3)

128-131: Explicitly passing None for "count" is appropriate.

Aggregator component is model-agnostic; this call matches intent and keeps semantics clear.

141-142: Target component resolution now forwards the model; LGTM.

Passing config.model_name.clone() maintains flexibility across model/non-model targets.

63-66: Fix formatting: cargo fmt is failing in this file.

CI reports rustfmt differences. After applying the above diffs, please run:

cargo fmt

This will resolve the style check failure.

Also applies to: 116-117, 128-131, 141-142

lib/llm/src/mocker/engine.rs (3)

445-446: Constructor gains model_name parameter — interface change looks consistent.

New parameter cleanly threads model through to component resolution.

460-464: Passing model_name into namespace.component is correct.

Closure safely clones the Option<String> per retry. All good.

557-558: Tests updated to pass model through component lookup — LGTM.

Using MOCKER_MODEL ensures coverage for model-aware component resolution.

Also applies to: 626-629

components/metrics/src/lib.rs

components/metrics/src/main.rs

lib/bindings/c/src/lib.rs

lib/bindings/python/rust/llm/entrypoint.rs

lib/runtime/src/metrics.rs

components/backends/vllm/src/dynamo/vllm/main.py

lib/llm/src/entrypoint/input/endpoint.rs

lib/runtime/src/component.rs

lib/runtime/src/lib.rs

keivenchang

Thanks for making the idea of labels applicable for other use cases, while preserving the idea that Components are flexible (e.g. they can be serving models, or not serving models at all).

A quick question for you @ryanolson. Do you think that labels should be inherited by the children? For example, if a Component has a label called {model="llama-3.4.5"}, should Endpoint(s) also have such a label? Intuitively, I think that would make sense, but I'm wondering if you think otherwise.

lib/runtime/src/metrics.rs

keivenchang · 2025-08-12T23:04:17Z

+1 LGTM from me. Regarding the topic of children inheriting a parent's label-- that can be done on a separate PR/Linear.

…lone Component labels to Endpoint.

LGTM

tzulingk · 2025-08-13T19:17:53Z

@grahamking please take a look at my newest commit.

per discussion in https://nvidia.slack.com/archives/C08UQCG3RNV/p1755105938302299?thread_ts=1755095101.510609&cid=C08UQCG3RNV
Letting each layer has its own label and we return the labels from different hierarchies when calling stored_labels()?
That is if you call stored_labels()

Namespace::stored_labels(): returns Namespace labels.
Component::stored_labels(): returns Namespace + Component labels.
Endpoint::stored_labels(): returns Namespace + Component + Endpoint labels.

With this we would have the debugability that you mentioned, at the same time address @nnshah1's comment on the consistency.

lib/llm/src/entrypoint/input/endpoint.rs

lib/runtime/examples/hello_world/src/bin/client.rs

nnshah1 · 2025-08-14T19:08:05Z

lib/llm/src/entrypoint/input/endpoint.rs

    let component = distributed_runtime
        .namespace(&endpoint_id.namespace)?
-        .component(&endpoint_id.component)?;
+        .component(&endpoint_id.component)


question:

@grahamking is this for the in process engines?

@keivenchang do we have these hooked up into the frontend metrics?

I removed this in #2292

nnshah1 · 2025-08-14T19:09:59Z

lib/llm/src/entrypoint/input/common.rs

                .namespace(&endpoint_id.namespace)?
-                .component(&endpoint_id.component)?;
+                .component(&endpoint_id.component)
+                .and_then(|c| c.add_labels(&[("model", card.slug().to_string().as_str())]))?;


@keivenchang , @grahamking , @tzulingk - similar question here - is this relevant on a client?

I removed this in #2292

nnshah1 · 2025-08-14T19:11:14Z

lib/llm/src/discovery/watcher.rs

            .namespace(&endpoint_id.namespace)?
-            .component(&endpoint_id.component)?;
+            .component(&endpoint_id.component)
+            .and_then(|c| c.add_labels(&[("model", &model_entry.name)]))?;


same question about client here- will take offline.

I removed this in: #2292

keivenchang · 2025-08-15T16:37:38Z

Sorry about the confusions I caused, I'll try to explain the situation better:

The worker code (e.g. "backend") in lib/runtime/src/* are fine, these are what generates the labels (for the model)
As for the frontend lib/llm/src/* code, since the FE is already generating the model label, and there are some lingering questions about having the model concept in Components, I decided to remove these in feat(metrics): add NATS client metrics to prometheus_metrics_fmt #2292 yesterday.

Signed-off-by: Hannah Zhang <[email protected]>

tzulingk added 2 commits August 8, 2025 21:59

Add model label to Component

528626e

Use ModelDeploymentCard.slug() for model name. ModelDeploymentCard.se…

7ff3c75

…rvice_name has been removed in PR#2349

tzulingk requested review from a team, GuanLuo, PeaBrane, alec-flowers, biswapanda, grahamking, ishandhanani, jthomson04, kkranen, nnshah1, paulhendricks, piotrm-nvidia, ptarasiewiczNV, rmccorm4, ryanolson, tanmayv25, tedzhouhk and tmonty12 as code owners August 9, 2025 05:13

pull-request-size bot added the size/L label Aug 9, 2025

github-actions bot added the feat label Aug 9, 2025

tzulingk requested a review from keivenchang August 9, 2025 05:14

coderabbitai bot reviewed Aug 9, 2025

View reviewed changes

tzulingk added 3 commits August 8, 2025 22:23

cargo fmt

e654668

Address coderabbit comments.

506a495

Fix compile errors in examples.

84f8d92

tzulingk added 2 commits August 11, 2025 21:44

Remove ',' from compoonent() python binding input signature.

667ebe1

Clean up.

86d1811

tzulingk requested review from grahamking and keivenchang August 12, 2025 05:17

grahamking reviewed Aug 12, 2025

View reviewed changes

components/backends/vllm/src/dynamo/vllm/main.py Show resolved Hide resolved

grahamking reviewed Aug 12, 2025

View reviewed changes

lib/llm/src/entrypoint/input/endpoint.rs Outdated Show resolved Hide resolved

grahamking reviewed Aug 12, 2025

View reviewed changes

lib/runtime/src/component.rs Show resolved Hide resolved

grahamking reviewed Aug 12, 2025

View reviewed changes

lib/runtime/src/lib.rs Show resolved Hide resolved

keivenchang reviewed Aug 12, 2025

View reviewed changes

lib/runtime/src/metrics.rs Show resolved Hide resolved

Use model.service_name() instead of model.card().slug().

e182fd5

tzulingk requested review from grahamking and keivenchang August 13, 2025 01:33

stored_labels return all the labels from higher hierachy and do not c…

52b53f9

…lone Component labels to Endpoint.

keivenchang approved these changes Aug 13, 2025

View reviewed changes

grahamking reviewed Aug 14, 2025

View reviewed changes

lib/llm/src/entrypoint/input/endpoint.rs Show resolved Hide resolved

grahamking approved these changes Aug 14, 2025

View reviewed changes

nnshah1 reviewed Aug 14, 2025

View reviewed changes

lib/runtime/examples/hello_world/src/bin/client.rs Show resolved Hide resolved

tzulingk merged commit 3a3f5bf into main Aug 14, 2025
10 checks passed

tzulingk deleted the tzulingk/model_label_dis_360_clean branch August 14, 2025 18:48

nnshah1 reviewed Aug 14, 2025

View reviewed changes

This was referenced Aug 16, 2025

feat: Add model label for vllm backend metrics #2474

Merged

fix: replace metrics callback with background scraping to prevent tim… #2480

Merged

harryskim mentioned this pull request Aug 22, 2025

[Roadmap]: 0.4.1 - 0.5.0 roadmap and key dates #2649

Open

hhzhang16 pushed a commit that referenced this pull request Aug 27, 2025

feat: Add a "model" label to Component metrics (#2389)

4542409

Signed-off-by: Hannah Zhang <[email protected]>

feat: Add a "model" label to Component metrics #2389

feat: Add a "model" label to Component metrics #2389

Uh oh!

Conversation

tzulingk commented Aug 9, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Overview:

Details:

Where should the reviewer start?

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

Summary by CodeRabbit

Uh oh!

copy-pr-bot bot commented Aug 9, 2025

Uh oh!

coderabbitai bot commented Aug 9, 2025

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Poem

Chat

Support

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

Documentation and Community

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

keivenchang left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

keivenchang commented Aug 12, 2025

Uh oh!

tzulingk commented Aug 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

nnshah1 Aug 14, 2025

Choose a reason for hiding this comment

Uh oh!

keivenchang Aug 15, 2025

Choose a reason for hiding this comment

Uh oh!

nnshah1 Aug 14, 2025

Choose a reason for hiding this comment

Uh oh!

keivenchang Aug 15, 2025

Choose a reason for hiding this comment

Uh oh!

nnshah1 Aug 14, 2025

Choose a reason for hiding this comment

Uh oh!

keivenchang Aug 15, 2025

Choose a reason for hiding this comment

Uh oh!

keivenchang commented Aug 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

tzulingk commented Aug 9, 2025 •

edited

Loading

keivenchang left a comment •

edited

Loading

tzulingk commented Aug 13, 2025 •

edited

Loading

keivenchang commented Aug 15, 2025 •

edited

Loading