-
Notifications
You must be signed in to change notification settings - Fork 690
feat: Add a "model" label to Component metrics #2389
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…rvice_name has been removed in PR#2349
WalkthroughThis change set standardizes the handling of model names across multiple components by updating the Changes
Sequence Diagram(s)sequenceDiagram
participant CLI/User
participant Config
participant Namespace
participant Component
participant Metrics
CLI/User->>Config: Provide component and model name
Config->>Namespace: component(name, Some(model_name))
Namespace->>Component: Create with name, model_name
Component-->>Namespace: Return Component instance
Namespace-->>Config: Return Component
Component->>Metrics: model()
Metrics->>Metrics: Auto-label with model if present
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
Note 🔌 MCP (Model Context Protocol) integration is now available in Early Access!Pro users can now connect to remote MCP servers under the Integrations page to get reviews and chat conversations that understand additional development context. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
🔭 Outside diff range comments (4)
components/metrics/src/lib.rs (3)
350-399: Add "model" to all metric label schemas (gauge/counter vecs)To expose the model dimension, extend the label arrays by one position. This change is breaking (cardinality increase) and will create new time series; plan dashboards/alerts accordingly.
- &["component", "endpoint", "worker_id"] + &["component", "endpoint", "worker_id", "model"] @@ - &["component", "endpoint", "worker_id"] + &["component", "endpoint", "worker_id", "model"] @@ - &["component", "endpoint", "worker_id"] + &["component", "endpoint", "worker_id", "model"] @@ - &["component", "endpoint", "worker_id"] + &["component", "endpoint", "worker_id", "model"] @@ - &["component", "endpoint"] + &["component", "endpoint", "model"] @@ - &["component", "endpoint"] + &["component", "endpoint", "model"] @@ - &["component", "endpoint", "worker_id"] + &["component", "endpoint", "worker_id", "model"] @@ - &["component", "endpoint", "worker_id"] + &["component", "endpoint", "worker_id", "model"] @@ - &["component", "endpoint", "worker_id"] + &["component", "endpoint", "worker_id", "model"]
403-413: Pass the model label value at call sitesUse config.model_name when setting or incrementing metrics. Default to "unknown" when None.
fn set_worker_gauge( @@ - gauge - .with_label_values(&[&config.component_name, &config.endpoint_name, worker_id]) + let model = config.model_name.as_deref().unwrap_or("unknown"); + gauge + .with_label_values(&[ + &config.component_name, + &config.endpoint_name, + worker_id, + model, + ]) .set(value); @@ fn increment_worker_counter( @@ - counter - .with_label_values(&[&config.component_name, &config.endpoint_name, worker_id]) + let model = config.model_name.as_deref().unwrap_or("unknown"); + counter + .with_label_values(&[ + &config.component_name, + &config.endpoint_name, + worker_id, + model, + ]) .inc_by(value); @@ fn set_endpoint_gauge( @@ - gauge - .with_label_values(&[&config.component_name, &config.endpoint_name]) + let model = config.model_name.as_deref().unwrap_or("unknown"); + gauge + .with_label_values(&[&config.component_name, &config.endpoint_name, model]) .set(value);Also applies to: 416-426, 428-438
516-536: Include "model" when reading cumulative countersCounterVec::with_label_values must match the updated label schema.
- let cumulative_isl = self + let model = config.model_name.as_deref().unwrap_or("unknown"); + let cumulative_isl = self .kv_hit_rate_isl_blocks .with_label_values(&[ &config.component_name, &config.endpoint_name, &worker_id_str, + model, ]) .get(); @@ - let cumulative_overlap = self + let cumulative_overlap = self .kv_hit_rate_overlap_blocks .with_label_values(&[ &config.component_name, &config.endpoint_name, &worker_id_str, + model, ]) .get();Also applies to: 525-533
launch/dynamo-run/src/lib.rs (1)
137-143: Fix rustfmt failure by wrapping long callCI reports cargo fmt differences (Lines 137-143). The long
make_mocker_enginecall should be line-broken per rustfmt.- let engine = - dynamo_llm::mocker::engine::make_mocker_engine(drt, endpoint, Some(local_model.card().slug().to_string()), args).await?; + let engine = dynamo_llm::mocker::engine::make_mocker_engine( + drt, + endpoint, + Some(local_model.card().slug().to_string()), + args, + ) + .await?;Also run
cargo fmtlocally to ensure no other drift.
🧹 Nitpick comments (7)
components/metrics/src/bin/mock_worker.rs (1)
149-149: Standardize naming: use kebab-case for component name and model slug for the labelPer prior team preference, use kebab-case in .component() calls. Also, align model label semantics with other call sites that pass card.slug().
Apply this minimal change for consistency:
- .component("MyComponent", Some("MyModel".to_string()))? + .component("my-component", Some("my-model".to_string()))?If this mock should reflect real model identity, consider wiring the actual card.slug() instead of a hardcoded string.
lib/llm/src/entrypoint/input/endpoint.rs (2)
34-39: Propagate model label via slug: good; fix formatting to address cargo fmt failureThe extraction of
model_nameviacard().slug()looks correct and consistent. The pre-merge check flagged formatting on these lines. Please run cargo fmt.If you prefer a minimal manual tweak, this formatting matches rustfmt output:
- let model_name = match &engine_config { - EngineConfig::StaticFull { model, .. } | EngineConfig::StaticCore { model, .. } => { - Some(model.card().slug().to_string()) - } - EngineConfig::StaticRemote(model) | EngineConfig::Dynamic(model) => Some(model.card().slug().to_string()), - }; + let model_name = match &engine_config { + EngineConfig::StaticFull { model, .. } | EngineConfig::StaticCore { model, .. } => { + Some(model.card().slug().to_string()) + } + EngineConfig::StaticRemote(model) | EngineConfig::Dynamic(model) => { + Some(model.card().slug().to_string()) + } + };
50-104: Optional: factor out model_name extraction to reduce duplicationThe
card().slug()extraction appears in multiple places across the codebase. Consider a small helper to centralize this and enforce consistency.Happy to draft a helper if you want.
lib/bindings/python/rust/lib.rs (1)
538-541: Python API stays backward compatible; consider adding text_signature for introspectionThe added optional
modelparameter with a default preserves backward compatibility and correctly forwards toself.inner.component(name, model). To improve Python help/introspection, add a text signature.impl Namespace { - #[pyo3(signature = (name, model = None))] + #[pyo3(signature = (name, model = None))] + #[pyo3(text_signature = "(name, model=None)")] fn component(&self, name: String, model: Option<String>) -> PyResult<Component> { let inner = self.inner.component(name, model).map_err(to_pyerr)?; Ok(Component { inner, event_loop: self.event_loop.clone(), }) } }lib/runtime/src/component.rs (1)
128-131: Clarify whethermodelparticipates in identity (Hash/Eq); document or include accordingly.
Component'sHash/Eqimplementations ignoremodel. If two Components have the same namespace/name but different model, they will compare equal and collide in hash-based structures. If this is intentional (model is just a label for metrics), please document that explicitly. If not, includemodelinHash/Eq.Suggested doc tweak to make intent explicit:
- /// Name of the model + /// Optional model name used for metrics labeling only (not part of Component identity/Hash/Eq)lib/llm/src/mocker/engine.rs (2)
51-52: Optional: consider kebab-case for label values to match broader naming conventions.
"mocker_model"could be"mocker-model"for consistency across labels (component names prefer kebab-case; aligning model labels can aid uniform dashboards). Not required; only if your Prometheus/UX conventions prefer it.
515-522: Break long line and run rustfmt to satisfy CI.Reformat the constructor call; also run cargo fmt.
- let annotated_engine = - AnnotatedMockEngine::new(MockVllmEngine::new(args), distributed_runtime, endpoint, model_name); + let annotated_engine = AnnotatedMockEngine::new( + MockVllmEngine::new(args), + distributed_runtime, + endpoint, + model_name, + );
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (18)
components/backends/vllm/src/dynamo/vllm/main.py(2 hunks)components/metrics/src/bin/mock_worker.rs(1 hunks)components/metrics/src/lib.rs(1 hunks)components/metrics/src/main.rs(4 hunks)components/router/src/main.rs(1 hunks)launch/dynamo-run/src/lib.rs(1 hunks)lib/bindings/c/src/lib.rs(1 hunks)lib/bindings/python/rust/lib.rs(1 hunks)lib/bindings/python/rust/llm/entrypoint.rs(1 hunks)lib/llm/src/discovery/watcher.rs(1 hunks)lib/llm/src/entrypoint/input/common.rs(1 hunks)lib/llm/src/entrypoint/input/endpoint.rs(1 hunks)lib/llm/src/entrypoint/input/http.rs(1 hunks)lib/llm/src/kv_router/prefill_counter.rs(2 hunks)lib/llm/src/kv_router/sequence.rs(2 hunks)lib/llm/src/mocker/engine.rs(6 hunks)lib/runtime/src/component.rs(5 hunks)lib/runtime/src/metrics.rs(2 hunks)
🧰 Additional context used
🧠 Learnings (10)
📓 Common learnings
Learnt from: ryanolson
PR: ai-dynamo/dynamo#1788
File: components/metrics/src/bin/mock_worker.rs:135-135
Timestamp: 2025-07-07T16:17:26.408Z
Learning: The codebase should use kebab-case for component names in .component() calls for uniformity, as indicated by the user's preference for kebab-case or snake_case uniformity, and kebab-case already has existing usage in the codebase.
📚 Learning: 2025-07-14T21:25:56.930Z
Learnt from: ryanolson
PR: ai-dynamo/dynamo#1919
File: lib/runtime/src/engine.rs:168-168
Timestamp: 2025-07-14T21:25:56.930Z
Learning: The AsyncEngineContextProvider trait in lib/runtime/src/engine.rs was intentionally changed from `Send + Sync + Debug` to `Send + Debug` because the Sync bound was overly constraining. The trait should only require Send + Debug as designed.
Applied to files:
components/router/src/main.rslib/llm/src/entrypoint/input/common.rslib/bindings/c/src/lib.rslib/llm/src/kv_router/sequence.rslaunch/dynamo-run/src/lib.rslib/llm/src/entrypoint/input/http.rslib/bindings/python/rust/llm/entrypoint.rslib/llm/src/entrypoint/input/endpoint.rslib/llm/src/mocker/engine.rs
📚 Learning: 2025-06-05T01:04:24.775Z
Learnt from: PeaBrane
PR: ai-dynamo/dynamo#1392
File: launch/dynamo-run/src/subprocess/vllm_v1_inc.py:71-71
Timestamp: 2025-06-05T01:04:24.775Z
Learning: The `create_endpoint` method in `WorkerMetricsPublisher` has backward compatibility maintained through pyo3 signature annotation `#[pyo3(signature = (component, dp_rank = None))]`, making the `dp_rank` parameter optional with a default value of `None`.
Applied to files:
components/router/src/main.rslib/llm/src/entrypoint/input/common.rscomponents/metrics/src/lib.rslib/bindings/c/src/lib.rslib/llm/src/kv_router/prefill_counter.rscomponents/metrics/src/bin/mock_worker.rslib/bindings/python/rust/lib.rslib/llm/src/entrypoint/input/http.rs
📚 Learning: 2025-07-16T12:41:12.543Z
Learnt from: grahamking
PR: ai-dynamo/dynamo#1962
File: lib/runtime/src/component/client.rs:270-273
Timestamp: 2025-07-16T12:41:12.543Z
Learning: In lib/runtime/src/component/client.rs, the current mutex usage in get_or_create_dynamic_instance_source is temporary while evaluating whether the mutex can be dropped entirely. The code currently has a race condition between try_lock and lock().await, but this is acknowledged as an interim state during the performance optimization process.
Applied to files:
lib/llm/src/entrypoint/input/common.rslib/llm/src/kv_router/sequence.rslib/llm/src/entrypoint/input/http.rslib/llm/src/mocker/engine.rs
📚 Learning: 2025-05-30T06:38:09.630Z
Learnt from: PeaBrane
PR: ai-dynamo/dynamo#1285
File: lib/llm/src/kv_router/scoring.rs:58-63
Timestamp: 2025-05-30T06:38:09.630Z
Learning: In lib/llm/src/kv_router/scoring.rs, the user prefers to keep the panic behavior when calculating load_avg and variance with empty endpoints rather than adding guards for division by zero. They want the code to fail fast on this error condition.
Applied to files:
lib/llm/src/entrypoint/input/common.rslib/llm/src/kv_router/sequence.rslib/llm/src/kv_router/prefill_counter.rslib/llm/src/entrypoint/input/http.rs
📚 Learning: 2025-05-29T00:02:35.018Z
Learnt from: alec-flowers
PR: ai-dynamo/dynamo#1181
File: lib/llm/src/kv_router/publisher.rs:379-425
Timestamp: 2025-05-29T00:02:35.018Z
Learning: In lib/llm/src/kv_router/publisher.rs, the functions `create_stored_blocks` and `create_stored_block_from_parts` are correctly implemented and not problematic duplications of existing functionality elsewhere in the codebase.
Applied to files:
lib/bindings/c/src/lib.rslib/llm/src/kv_router/sequence.rslib/llm/src/kv_router/prefill_counter.rs
📚 Learning: 2025-06-02T19:37:27.666Z
Learnt from: oandreeva-nv
PR: ai-dynamo/dynamo#1195
File: lib/llm/tests/block_manager.rs:150-152
Timestamp: 2025-06-02T19:37:27.666Z
Learning: In Rust/Tokio applications, when background tasks use channels for communication, dropping the sender automatically signals task termination when the receiver gets `None`. The `start_batching_publisher` function in `lib/llm/tests/block_manager.rs` demonstrates this pattern: when the `KVBMDynamoRuntimeComponent` is dropped, its `batch_tx` sender is dropped, causing `rx.recv()` to return `None`, which triggers cleanup and task termination.
Applied to files:
lib/bindings/c/src/lib.rs
📚 Learning: 2025-05-29T06:20:12.901Z
Learnt from: ryanolson
PR: ai-dynamo/dynamo#1093
File: lib/llm/src/block_manager/block/registry.rs:98-122
Timestamp: 2025-05-29T06:20:12.901Z
Learning: In lib/llm/src/block_manager/block/registry.rs, the background task spawned for handling unregister notifications uses detached concurrency by design. The JoinHandle is intentionally not stored as this represents a reasonable architectural tradeoff for a long-running cleanup task.
Applied to files:
lib/bindings/c/src/lib.rslib/llm/src/kv_router/sequence.rs
📚 Learning: 2025-06-05T01:02:15.318Z
Learnt from: PeaBrane
PR: ai-dynamo/dynamo#1392
File: lib/llm/src/kv_router/scoring.rs:35-46
Timestamp: 2025-06-05T01:02:15.318Z
Learning: In lib/llm/src/kv_router/scoring.rs, PeaBrane prefers panic-based early failure over Result-based error handling for the worker_id() method to catch invalid data early during development.
Applied to files:
lib/llm/src/kv_router/sequence.rslib/llm/src/kv_router/prefill_counter.rs
📚 Learning: 2025-06-08T03:12:03.985Z
Learnt from: jthomson04
PR: ai-dynamo/dynamo#1429
File: lib/runtime/src/utils/leader_worker_barrier.rs:69-72
Timestamp: 2025-06-08T03:12:03.985Z
Learning: In the leader-worker barrier implementation in lib/runtime/src/utils/leader_worker_barrier.rs, the `wait_for_key_count` function correctly uses exact equality (`==`) instead of greater-than-or-equal (`>=`) because worker IDs must be unique (enforced by etcd create-only operations), ensuring exactly the expected number of workers can register.
Applied to files:
lib/llm/src/kv_router/prefill_counter.rs
🪛 GitHub Actions: Pre Merge Validation of (ai-dynamo/dynamo/refs/pull/2389/merge) by tzulingk.
lib/bindings/c/src/lib.rs
[error] 147-147: pre-commit hook 'trailing-whitespace' failed. Trailing whitespace was fixed in this file. Run 'pre-commit run --all-files' locally to reproduce.
🪛 GitHub Actions: Rust pre-merge checks
lib/bindings/c/src/lib.rs
[error] 147-148: cargo fmt check failed due to formatting differences. Please run 'cargo fmt' to fix code style issues.
launch/dynamo-run/src/lib.rs
[error] 137-143: cargo fmt check failed due to formatting differences. Please run 'cargo fmt' to fix code style issues.
lib/llm/src/entrypoint/input/endpoint.rs
[error] 35-39: cargo fmt check failed due to formatting differences. Please run 'cargo fmt' to fix code style issues.
components/metrics/src/main.rs
[error] 138-138: cargo fmt check failed due to formatting differences. Please run 'cargo fmt' to fix code style issues.
lib/llm/src/mocker/engine.rs
[error] 457-462: cargo fmt check failed due to formatting differences. Please run 'cargo fmt' to fix code style issues.
[error] 517-525: cargo fmt check failed due to formatting differences. Please run 'cargo fmt' to fix code style issues.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Build and Test - dynamo
🔇 Additional comments (21)
components/router/src/main.rs (1)
63-68: Correct: explicitly pass model None for a model-agnostic componentThis matches the new API and the component’s model-agnostic nature. LGTM.
lib/llm/src/kv_router/sequence.rs (2)
801-806: Tests updated to new API: passing None is appropriateThe router tests aren’t tied to a model; this change is correct.
915-920: Ditto for second test siteConsistent with the new signature; looks good.
lib/llm/src/entrypoint/input/common.rs (1)
109-114: Good: scope the component by model slugUsing Some(card.slug().to_string()) threads model context through runtime resolution and enables model-labeled metrics. Ensure slug remains stable to avoid metric cardinality churn on renames.
lib/llm/src/kv_router/prefill_counter.rs (2)
369-372: LGTM: explicit None for model name clarifies intentExplicitly passing None matches the new signature and documents the counters component is model-agnostic.
435-437: LGTM: consistent model-agnostic component creationThe second thread mirrors the first with
.component("counters", None), keeping the test symmetric.lib/llm/src/entrypoint/input/endpoint.rs (1)
41-44: Remaining single-arg.component(...)calls must be updated to the new two-argument signatureOur scan uncovered the following call sites still using the old single-arg form. Please update each to pass a second argument (e.g.
Noneor a model slug):• lib/llm/src/http/service/clear_kv_blocks.rs:112
• lib/llm/src/entrypoint/input/endpoint.rs:206
• lib/runtime/tests/namespace_etcd_path.rs:59, 101, 102
• lib/runtime/tests/soak.rs:133, 157
• lib/runtime/src/component/service.rs:108
• lib/runtime/src/metrics.rs:671, 873
• lib/runtime/src/component/component.rs:93
• lib/runtime/examples/system_metrics/tests/integration_test.rs:71
• lib/runtime/examples/system_metrics/src/lib.rs:92
• lib/runtime/examples/system_metrics/src/bin/system_client.rs:34
• lib/runtime/examples/hello_world/src/bin/server.rs:71
• lib/runtime/examples/hello_world/src/bin/client.rs:33
• lib/runtime/examples/service_metrics/src/bin/service_server.rs:73
• lib/runtime/examples/service_metrics/src/bin/service_client.rs:34Example fix:
- namespace.component("backend")? + namespace.component("backend", None)?⛔ Skipped due to learnings
Learnt from: ryanolson PR: ai-dynamo/dynamo#1788 File: components/metrics/src/bin/mock_worker.rs:135-135 Timestamp: 2025-07-07T16:17:26.408Z Learning: The codebase should use kebab-case for component names in .component() calls for uniformity, as indicated by the user's preference for kebab-case or snake_case uniformity, and kebab-case already has existing usage in the codebase.lib/llm/src/entrypoint/input/http.rs (1)
66-71: LGTM: passes slug as model label consistentlyUsing
card.slug()here aligns with other call sites and keeps metrics labeling consistent.lib/llm/src/discovery/watcher.rs (1)
172-173: Correctly propagates model context into component resolutionPassing
Some(model_entry.name.clone())intonamespace.component(...)aligns with the new API and PR goal. Looks good.components/backends/vllm/src/dynamo/vllm/main.py (1)
135-137: Correctly passes model into component() callsBoth call sites now pass
config.model, matching the updated binding/runtime API. The unchangedcomponent("prefill")call remains valid due to the defaultmodel=None.Given team preference for kebab-case component names, please confirm
config.componentconforms to kebab-case across CLI/config inputs.Also applies to: 170-172
lib/runtime/src/metrics.rs (1)
397-401: Trait extension looks good; ensure implementors wire model throughAdding
fn model(&self) -> Option<String>with a defaultNoneis clean and backward compatible.Please confirm
Component/Endpointregistries overridemodel()appropriately (e.g.,Componentreturns its model;Endpointforwards its parent’s) so the label appears where intended per DIS-360.lib/runtime/src/component.rs (4)
183-186: MetricsRegistry::model forwarding looks correct.Returns a cloned Option as expected; aligns with metrics auto-labeling.
226-229: Getter mirrors storedmodelas intended.No concerns.
337-340: Endpoint correctly forwardsmodel()to its Component.This ensures endpoint-level metrics inherit the component model label.
505-513: Fix doc comment grammar- /// Create a [`Component`] in the namespace who's endpoints can be discovered with etcd + /// Create a [`Component`] in the namespace whose endpoints can be discovered with etcdI’ve verified that this PR doesn’t introduce any new
.component()call sites using snake_case literals—no further changes are needed here.components/metrics/src/main.rs (3)
128-131: Explicitly passingNonefor "count" is appropriate.Aggregator component is model-agnostic; this call matches intent and keeps semantics clear.
141-142: Target component resolution now forwards the model; LGTM.Passing
config.model_name.clone()maintains flexibility across model/non-model targets.
63-66: Fix formatting: cargo fmt is failing in this file.CI reports rustfmt differences. After applying the above diffs, please run:
- cargo fmt
This will resolve the style check failure.
Also applies to: 116-117, 128-131, 141-142
lib/llm/src/mocker/engine.rs (3)
445-446: Constructor gainsmodel_nameparameter — interface change looks consistent.New parameter cleanly threads model through to component resolution.
460-464: Passingmodel_nameintonamespace.componentis correct.Closure safely clones the
Option<String>per retry. All good.
557-558: Tests updated to pass model through component lookup — LGTM.Using
MOCKER_MODELensures coverage for model-aware component resolution.Also applies to: 626-629
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for making the idea of labels applicable for other use cases, while preserving the idea that Components are flexible (e.g. they can be serving models, or not serving models at all).
A quick question for you @ryanolson. Do you think that labels should be inherited by the children? For example, if a Component has a label called {model="llama-3.4.5"}, should Endpoint(s) also have such a label? Intuitively, I think that would make sense, but I'm wondering if you think otherwise.
|
+1 LGTM from me. Regarding the topic of children inheriting a parent's label-- that can be done on a separate PR/Linear. |
…lone Component labels to Endpoint.
|
@grahamking please take a look at my newest commit. per discussion in https://nvidia.slack.com/archives/C08UQCG3RNV/p1755105938302299?thread_ts=1755095101.510609&cid=C08UQCG3RNV
With this we would have the debugability that you mentioned, at the same time address @nnshah1's comment on the consistency. |
| let component = distributed_runtime | ||
| .namespace(&endpoint_id.namespace)? | ||
| .component(&endpoint_id.component)?; | ||
| .component(&endpoint_id.component) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question:
@grahamking is this for the in process engines?
@keivenchang do we have these hooked up into the frontend metrics?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed this in #2292
| .namespace(&endpoint_id.namespace)? | ||
| .component(&endpoint_id.component)?; | ||
| .component(&endpoint_id.component) | ||
| .and_then(|c| c.add_labels(&[("model", card.slug().to_string().as_str())]))?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@keivenchang , @grahamking , @tzulingk - similar question here - is this relevant on a client?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed this in #2292
| .namespace(&endpoint_id.namespace)? | ||
| .component(&endpoint_id.component)?; | ||
| .component(&endpoint_id.component) | ||
| .and_then(|c| c.add_labels(&[("model", &model_entry.name)]))?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same question about client here- will take offline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed this in: #2292
|
Sorry about the confusions I caused, I'll try to explain the situation better:
|
Signed-off-by: Hannah Zhang <[email protected]>
Overview:
Add a "model" label to Component metrics.
Details:
This pull request introduces model-specific metrics by adding a model field to the Component struct and updating the metrics labeling logic.
Changes
Vllm metrics before this PR
Vllm metrics after this PR
Note that we have
model="Qwen/Qwen3-0.6B"label now.Where should the reviewer start?
lib/runtime/src/component.rs:
modelis added to Component/Endpoint.lib/runtime/src/metrics.rs: how model label is added to the metrics.
Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
DIS-360 Add a "model" label to Component metrics
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Refactor