feat: add KVBM host to disk metrics | clean up dashboard #3534

ziqifan617 · 2025-10-09T21:54:06Z

Overview:

Add KVBM host to disk offloading metric
Remove onboard/offload request metrics, which have no direct meaning to customers
Update runbook and grafana dashboard

Summary by CodeRabbit

New Features
- Expanded KVBM metrics, including new offload block counters; metrics are now wired through the block manager and offload paths.
- Revamped Grafana dashboard: 5s refresh, updated panel layout/titles, and new block-focused visualizations.
Refactor
- Metrics set streamlined: legacy request counters removed; new offload/onboard block metrics and names introduced (Prometheus/Grafana users should update alerts/dashboards).
Documentation
- Updated vLLM and TRT-LLM guides to use the Qwen/Qwen3-0.6B model in all commands, examples, and benchmarking scripts.

copy-pr-bot · 2025-10-09T21:54:09Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

ziqifan617 · 2025-10-09T21:54:45Z

/ok to test 292ebb4

coderabbitai · 2025-10-09T21:56:06Z

Walkthrough

Refactors KVBM metrics: removes request counters, adds a block-level offload metric (H2D, host to disk), wires optional KvbmMetrics through builders/configs into OffloadManager, and increments the metric during offload. Adjusts the transfer strategy (Disk→Device uses Write). Updates Prometheus names and the Grafana dashboard accordingly. Documentation switches example models to Qwen/Qwen3-0.6B.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Grafana dashboard updates `deploy/metrics/grafana_dashboards/grafana-kvbm-dashboard.json` | Dashboard id/refresh/version changed; panels re-id/repositioned; offload/onboard panels renamed; onboard request panels removed; new block-level panels added (H2D/D2H, etc.); expressions retargeted to new metrics. |
| Docs: model switch to Qwen `docs/guides/run_kvbm_in_trtllm.md`, `docs/guides/run_kvbm_in_vllm.md` | Replace DeepSeek-R1-Distill-Llama-8B with Qwen/Qwen3-0.6B in commands, payloads, and benchmark examples; minor note additions. |
| Bindings: BlockManagerBuilder metrics support `lib/bindings/python/rust/llm/block_manager.rs` | Add optional kvbm_metrics field and builder method; pass through to KvBlockManagerConfig during build. |
| VLLM/TRTLLM leaders: wire metrics into builder `lib/bindings/python/rust/llm/block_manager/vllm/connector/leader.rs`, `.../leader/recorder.rs`, `.../leader/slot.rs`, `.../trtllm_leader.rs` | Pass cloned KvbmMetrics into BlockManagerBuilder; remove increments of offload_requests/onboard_requests in slot. |
| Runtime metrics schema `lib/runtime/src/metrics/prometheus_names.rs`, `lib/llm/src/block_manager/metrics_kvbm.rs` | Remove OFFLOAD_REQUESTS/ONBOARD_REQUESTS; add OFFLOAD_BLOCKS_H2D; update KvbmMetrics fields and initialization accordingly. |
| Block manager config and state propagation `lib/llm/src/block_manager/config.rs`, `lib/llm/src/block_manager/state.rs` | Add optional kvbm_metrics to KvBlockManagerConfig; propagate into OffloadManagerConfig across state initializations. |
| Offload manager metrics usage `lib/llm/src/block_manager/offload.rs` | Add optional kvbm_metrics to OffloadManagerConfig/OffloadManager; on host offload, increment offload_blocks_h2d before sending request. |
| Transfer strategy adjustment `lib/llm/src/block_manager/block/transfer/strategy.rs` | For DiskStorage→DeviceStorage, use NixlTransfer::Write instead of ::Read. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Leader
  participant Builder as BlockManagerBuilder
  participant BMConfig as KvBlockManagerConfig
  participant State as KvBlockManagerState
  participant OffCfg as OffloadManagerConfig
  participant OffMgr as OffloadManager

  Leader->>Builder: new()
  Leader->>Builder: kvbm_metrics(kvbm_metrics.clone())
  Builder->>BMConfig: build() with kvbm_metrics: Some(...)
  BMConfig-->>Leader: config

  Leader->>State: init(config)
  State->>OffCfg: new(..., kvbm_metrics: config.kvbm_metrics.clone())
  State->>OffMgr: OffloadManager::new(OffCfg)
  OffMgr-->>State: instance ready

sequenceDiagram
  autonumber
  participant Caller as BlockManager
  participant OffMgr as OffloadManager
  participant HostTx as host_offload_tx
  note over Caller,OffMgr: Offload path with metrics

  Caller->>OffMgr: offload(block: Host)
  alt kvbm_metrics is Some
    OffMgr->>OffMgr: offload_blocks_h2d.inc()
  else no metrics
    OffMgr-->>OffMgr: skip
  end
  OffMgr->>HostTx: send(request)
  HostTx-->>OffMgr: ack
  note over Caller,OffMgr: Prior request counters removed in slot.rs
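
The offload path in the diagram above can be sketched in a few lines of Rust. This is a minimal illustration under stated assumptions, not the code from `offload.rs`: the counter is a plain `AtomicU64` standing in for a Prometheus counter, `OffloadRequest` and `offload_host_block` are hypothetical names, and a `std::sync::mpsc` channel stands in for `host_offload_tx`.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::Arc;

/// Stand-in for the real metrics struct; only the new counter is modeled.
#[derive(Clone, Default)]
pub struct KvbmMetrics {
    pub offload_blocks_h2d: Arc<AtomicU64>,
}

/// Hypothetical request type placed on the host offload queue.
pub struct OffloadRequest {
    pub block_id: u64,
}

pub struct OffloadManager {
    host_offload_tx: Sender<OffloadRequest>,
    kvbm_metrics: Option<KvbmMetrics>,
}

impl OffloadManager {
    /// Returns the manager plus the receiving end of the (assumed) host queue.
    pub fn new(kvbm_metrics: Option<KvbmMetrics>) -> (Self, Receiver<OffloadRequest>) {
        let (host_offload_tx, rx) = channel();
        (Self { host_offload_tx, kvbm_metrics }, rx)
    }

    /// Counts the block if metrics are configured, then enqueues the request.
    pub fn offload_host_block(&self, block_id: u64) -> Result<(), String> {
        if let Some(metrics) = &self.kvbm_metrics {
            metrics.offload_blocks_h2d.fetch_add(1, Ordering::Relaxed);
        }
        self.host_offload_tx
            .send(OffloadRequest { block_id })
            .map_err(|e| e.to_string())
    }
}
```

Because the increment happens before `send`, the counter in this sketch reflects blocks enqueued for offload rather than blocks whose transfer has finished.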

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

feat: add initial batch of KVBM metrics on match, offload and onboard #2673 — Also modifies KVBM metrics schema and Grafana dashboard; similar renames and wiring paths.
feat: enable dynamo metrics on KVBM #2626 — Earlier KVBM metrics integration introducing wiring and counters that this PR refactors/removes.

Poem

A hop, a skip, I count each block,
From host to disk—tick-tock, tick-tock.
No “requests” now, just flows I see,
In panels bright as carrot tea. 🥕
I write, not read, from disk I dart—
Metrics nibble at the heart.

Pre-merge checks

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description Check | ⚠️ Warning | The pull request description includes a high-level Overview section but omits the required Details, Where should the reviewer start, and Related Issues sections specified in the repository template, leaving critical information about the specific code changes, review entry points, and issue tracking unaddressed. This lack of structure can hinder reviewers’ ability to understand the scope and locate relevant files. Consequently, the description does not conform to the repository’s expected template and is incomplete. | Please expand the description to include a Details section that outlines the specific file and code changes, a Where should the reviewer start section calling out key files and functions, and a Related Issues section linking to any issue numbers or tickets addressed by this pull request. These additions will align the description with the repository template and provide reviewers with clear guidance. Ensuring all template sections are populated will improve review efficiency and traceability. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. | You can run `@coderabbitai generate docstrings` to improve docstring coverage. |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title Check | ✅ Passed | The title succinctly identifies the primary feature addition—KVBM host-to-disk metrics—and notes the associated dashboard cleanup, which together represent the main changes in this pull request. It is clear, concise, and directly related to the changeset without extraneous details. As such, it effectively communicates the core intent to reviewers scanning the commit history. |


richardhuo-nv

LGTM

coderabbitai · 2025-10-09T21:59:13Z

Walkthrough

Updates Grafana dashboard layout/titles/refresh; replaces model references in docs to Qwen/Qwen3-0.6B; adds kvbm_metrics plumbing through builders/config/state into OffloadManager; increments new offload_blocks_h2d metric on host-to-disk offload; removes request counters; adjusts Prometheus constant names; changes one transfer strategy mapping.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Grafana dashboard `deploy/metrics/grafana_dashboards/grafana-kvbm-dashboard.json` | Reorders/moves panels and ids, updates expressions and titles to offload/onboard block metrics, switches refresh to 5s, updates version, removes request-oriented blocks. |
| Docs: TRT-LLM `docs/guides/run_kvbm_in_trtllm.md` | Replaces model references to Qwen/Qwen3-0.6B in commands, API payloads, metrics sections, and benchmarking snippets. |
| Docs: vLLM `docs/guides/run_kvbm_in_vllm.md` | Same model replacement across serve/benchmark examples; note added about updating launch scripts. |
| Bindings: BlockManagerBuilder + connectors `lib/bindings/python/rust/llm/block_manager.rs`, `.../vllm/connector/leader.rs`, `.../vllm/connector/leader/recorder.rs`, `.../vllm/connector/leader/slot.rs`, `.../vllm/connector/trtllm_leader.rs` | Adds optional kvbm_metrics to builder and passes it from leaders/recorders; removes per-request increments in slot (onboard/offload). |
| Core config and metrics `lib/llm/src/block_manager/config.rs`, `lib/llm/src/block_manager/metrics_kvbm.rs`, `lib/runtime/src/metrics/prometheus_names.rs`, `lib/llm/src/block_manager/state.rs`, `lib/llm/src/block_manager/offload.rs` | Introduces KvbmMetrics option on KvBlockManagerConfig and OffloadManager; propagates via state; removes OFFLOAD_REQUESTS/ONBOARD_REQUESTS counters; adds OFFLOAD_BLOCKS_H2D metric and increments it on host→disk offload. |
| Transfer strategy `lib/llm/src/block_manager/block/transfer/strategy.rs` | Changes DiskStorage→DeviceStorage write mapping to use Nixl Write instead of Read. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Client
  participant Leader as Connector Leader
  participant Bldr as BlockManagerBuilder
  participant Cfg as KvBlockManagerConfig
  participant State as KvBlockManagerState
  participant OffMan as OffloadManager
  note over Client,Leader: Initialization and request handling (metrics wiring)

  Client->>Leader: initialize(...)
  Leader->>Bldr: kvbm_metrics(metrics.clone())
  Bldr->>Cfg: build() with kvbm_metrics
  Cfg-->>State: config(kvbm_metrics)
  State->>OffMan: new(OffloadManagerConfig{ kvbm_metrics })
  note over OffMan: Holds optional KvbmMetrics

  Client->>Leader: offload host block
  Leader->>OffMan: offload(host->disk)
  alt host->disk
    OffMan->>OffMan: kvbm_metrics.offload_blocks_h2d.inc()
  end

sequenceDiagram
  autonumber
  participant Disk as DiskStorage
  participant Dev as DeviceStorage
  participant Strat as WriteToStrategy

  Disk->>Strat: write_to(DeviceStorage)
  note over Strat: Mapping changed
  Strat-->>Disk: TransferStrategy::Nixl(Write)
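
The mapping change in this diagram can be illustrated with a small trait sketch. The type names come from the walkthrough; the trait method name `write_to_strategy` and the reduced enum shapes are assumptions made for illustration and may not match `strategy.rs`.

```rust
/// Direction of a NIXL transfer, as named in the review.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NixlTransfer {
    Read,
    Write,
}

/// Reduced to the single variant this example needs.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TransferStrategy {
    Nixl(NixlTransfer),
}

pub struct DiskStorage;
pub struct DeviceStorage;

/// Chooses how `Self` writes into `Target` (method name is hypothetical).
pub trait WriteToStrategy<Target> {
    fn write_to_strategy() -> TransferStrategy;
}

impl WriteToStrategy<DeviceStorage> for DiskStorage {
    fn write_to_strategy() -> TransferStrategy {
        // Per the change summary this mapping used Read before the PR.
        TransferStrategy::Nixl(NixlTransfer::Write)
    }
}
```

Encoding the choice in a trait implementation keyed on the source and target types means the direction is resolved at compile time for each storage pair.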

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

A hare with graphs and metrics bright,
Hops through configs in the night.
Requests are gone—blocks take the lead,
Offloads counted, swift in speed.
Qwen now served, the charts refresh—
Nixl writes, the paths enmesh.
Thump-thump—shipping done. Refresh!

Pre-merge checks

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description Check | ⚠️ Warning | The PR description only includes the Overview section and omits the required Details, Where should the reviewer start, and Related Issues sections from the repository template. | Please expand the description to include a Details section describing the specific changes, a Where should the reviewer start section listing key files for review, and a Related Issues section linking any relevant GitHub issues. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. | You can run `@coderabbitai generate docstrings` to improve docstring coverage. |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title Check | ✅ Passed | The title concisely summarizes the primary changes by stating the new host-to-disk metric addition and the dashboard cleanup while following a common feature-oriented prefix style. |


coderabbitai

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

lib/runtime/src/metrics/prometheus_names.rs (1)
326-331: Add OFFLOAD_BLOCKS_H2D to Python bindings
The new metric must be mirrored in lib/bindings/python/rust/prometheus_names.rs. Add:
/// The number of offload blocks from host to disk
pub const OFFLOAD_BLOCKS_H2D: &str = "offload_blocks_h2d";

🧹 Nitpick comments (2)

lib/llm/src/block_manager/offload.rs (2)

495-499: Consider adding test coverage for the new metric.

The test suite in this file doesn't appear to verify that offload_blocks_h2d is incremented correctly during host-to-disk offloads. Adding a test case would help ensure the metric works as expected and prevent regressions.
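
A regression test along these lines might look roughly like the sketch below. Everything in it is a simplified stand-in: `FakeOffloadManager`, `Tier`, and the `Cell`-based counter are hypothetical and only model the rule that host-resident blocks are the ones counted as H2D offloads. A real test would exercise the actual `OffloadManager` and its metrics registry.

```rust
use std::cell::Cell;

/// Hypothetical storage tier of the block being offloaded.
#[derive(Clone, Copy)]
pub enum Tier {
    Device,
    Host,
}

/// Simplified stand-in for the manager; the counter is a plain `Cell` here.
#[derive(Default)]
pub struct FakeOffloadManager {
    pub offload_blocks_h2d: Cell<u64>,
}

impl FakeOffloadManager {
    pub fn offload(&self, tier: Tier) {
        // Only host-resident blocks move to disk, so only they count as H2D.
        if let Tier::Host = tier {
            self.offload_blocks_h2d.set(self.offload_blocks_h2d.get() + 1);
        }
    }
}

/// Body of the regression test; in a real suite this would carry `#[test]`.
pub fn h2d_counter_counts_only_host_offloads() {
    let mgr = FakeOffloadManager::default();
    mgr.offload(Tier::Device);
    assert_eq!(mgr.offload_blocks_h2d.get(), 0);
    mgr.offload(Tier::Host);
    mgr.offload(Tier::Host);
    assert_eq!(mgr.offload_blocks_h2d.get(), 2);
}
```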

495-499: Differentiate initiated vs completed offload metrics
The existing offload_blocks_h2d counter (lib/llm/src/block_manager/offload.rs:495–499) fires on enqueue. To capture actual transfer outcomes, add an offload_blocks_completed_total{status="success"|"failure"} counter in the offload worker after transfer, and consider a gauge (e.g. offload_blocks_in_progress) for in-flight operations.
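
One way to picture this suggestion is the sketch below, which separates the enqueue count from completion outcomes and tracks in-flight work. It uses plain atomics rather than a Prometheus client, and the struct, field, and method names are hypothetical; none of them exist in the PR.

```rust
use std::sync::atomic::{AtomicI64, AtomicU64, Ordering};

/// Hypothetical metric set; atomics stand in for Prometheus counters/gauges.
#[derive(Default)]
pub struct OffloadOutcomeMetrics {
    /// Mirrors the existing behavior: counted when a block is enqueued.
    pub enqueued: AtomicU64,
    /// Would map to a completed counter with status="success".
    pub completed_success: AtomicU64,
    /// Would map to a completed counter with status="failure".
    pub completed_failure: AtomicU64,
    /// Gauge of transfers that were enqueued but have not finished.
    pub in_progress: AtomicI64,
}

impl OffloadOutcomeMetrics {
    /// Call when the offload request is handed to the worker queue.
    pub fn on_enqueue(&self) {
        self.enqueued.fetch_add(1, Ordering::Relaxed);
        self.in_progress.fetch_add(1, Ordering::Relaxed);
    }

    /// Call from the worker once the transfer has returned.
    pub fn on_finish<E>(&self, result: &Result<(), E>) {
        self.in_progress.fetch_sub(1, Ordering::Relaxed);
        let counter = if result.is_ok() {
            &self.completed_success
        } else {
            &self.completed_failure
        };
        counter.fetch_add(1, Ordering::Relaxed);
    }
}
```

With this split, the gap between `enqueued` and the two completion counters equals the in-flight gauge, which makes stalled transfers visible on a dashboard.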

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 179f993 and 292ebb4.

📒 Files selected for processing (14)

deploy/metrics/grafana_dashboards/grafana-kvbm-dashboard.json (12 hunks)
docs/guides/run_kvbm_in_trtllm.md (5 hunks)
docs/guides/run_kvbm_in_vllm.md (4 hunks)
lib/bindings/python/rust/llm/block_manager.rs (3 hunks)
lib/bindings/python/rust/llm/block_manager/vllm/connector/leader.rs (1 hunks)
lib/bindings/python/rust/llm/block_manager/vllm/connector/leader/recorder.rs (1 hunks)
lib/bindings/python/rust/llm/block_manager/vllm/connector/leader/slot.rs (0 hunks)
lib/bindings/python/rust/llm/block_manager/vllm/connector/trtllm_leader.rs (1 hunks)
lib/llm/src/block_manager/block/transfer/strategy.rs (1 hunks)
lib/llm/src/block_manager/config.rs (1 hunks)
lib/llm/src/block_manager/metrics_kvbm.rs (5 hunks)
lib/llm/src/block_manager/offload.rs (4 hunks)
lib/llm/src/block_manager/state.rs (2 hunks)
lib/runtime/src/metrics/prometheus_names.rs (1 hunks)

💤 Files with no reviewable changes (1)

lib/bindings/python/rust/llm/block_manager/vllm/connector/leader/slot.rs

🧰 Additional context used

🧬 Code graph analysis (3)

lib/llm/src/block_manager/state.rs (1)

lib/llm/src/block_manager/block/data/logical.rs (1)

resources (69-71)

lib/llm/src/block_manager/metrics_kvbm.rs (1)

lib/bindings/python/src/dynamo/_prometheus_metrics.pyi (1)

IntCounter (126-142)

lib/llm/src/block_manager/offload.rs (1)

lib/bindings/python/rust/llm/block_manager.rs (1)

kvbm_metrics (246-252)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)

GitHub Check: clippy (lib/runtime/examples)
GitHub Check: clippy (launch/dynamo-run)
GitHub Check: clippy (lib/bindings/python)
GitHub Check: clippy (.)
GitHub Check: Build and Test - dynamo

🔇 Additional comments (3)

lib/llm/src/block_manager/state.rs (1)

149-156: LGTM! Clean kvbm_metrics propagation.

The kvbm_metrics field is correctly propagated from resources.config into OffloadManagerConfig in both the logical and local locality initialization paths. Cloning is appropriate for the `Option` type.

Also applies to: 265-272
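
To illustrate why cloning the optional metrics handle is reasonable, here is a minimal sketch that assumes the counters are shared handles (modeled with `Arc<AtomicU64>`), so a clone observes the same underlying value. The struct shapes are reduced to one field each and `offload_config_from` is a hypothetical helper, not a function from `state.rs`.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Stand-in metrics struct: cloning copies the handle, not the count.
#[derive(Clone, Default)]
pub struct KvbmMetrics {
    pub offload_blocks_h2d: Arc<AtomicU64>,
}

/// Reduced to the one field relevant here.
pub struct KvBlockManagerConfig {
    pub kvbm_metrics: Option<KvbmMetrics>,
}

/// Reduced to the one field relevant here.
pub struct OffloadManagerConfig {
    pub kvbm_metrics: Option<KvbmMetrics>,
}

/// Hypothetical helper showing the propagation step.
pub fn offload_config_from(config: &KvBlockManagerConfig) -> OffloadManagerConfig {
    OffloadManagerConfig {
        // `None` stays `None`; `Some` yields another handle to the same counter.
        kvbm_metrics: config.kvbm_metrics.clone(),
    }
}
```

Under that assumption, increments made through the offload manager's copy are visible to whoever holds the original, which is what a metrics endpoint needs.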

lib/bindings/python/rust/llm/block_manager.rs (1)

219-219: LGTM! Proper builder pattern implementation.

The kvbm_metrics integration follows the standard builder pattern correctly:

Optional field in the builder struct

Chainable setter method

Conditional propagation during build

Also applies to: 246-252, 312-316
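
The three points above (optional field, chainable setter, conditional propagation) can be sketched as follows. This is an illustration of the pattern only: the real `BlockManagerBuilder` has more fields, its setter signature may differ, and the `label` field on `KvbmMetrics` is a placeholder added so the example has something to compare.

```rust
/// Placeholder metrics type; the `label` field exists only for this example.
#[derive(Clone, Debug, PartialEq)]
pub struct KvbmMetrics {
    pub label: String,
}

/// Reduced config carrying just the optional metrics.
#[derive(Debug, Default)]
pub struct KvBlockManagerConfig {
    pub kvbm_metrics: Option<KvbmMetrics>,
}

/// Builder with the optional field.
#[derive(Default)]
pub struct BlockManagerBuilder {
    kvbm_metrics: Option<KvbmMetrics>,
}

impl BlockManagerBuilder {
    pub fn new() -> Self {
        Self::default()
    }

    /// Chainable setter: takes and returns the builder by value.
    pub fn kvbm_metrics(mut self, metrics: KvbmMetrics) -> Self {
        self.kvbm_metrics = Some(metrics);
        self
    }

    pub fn build(self) -> KvBlockManagerConfig {
        let mut config = KvBlockManagerConfig::default();
        // Conditional propagation: only set the field when metrics were supplied.
        if let Some(metrics) = self.kvbm_metrics {
            config.kvbm_metrics = Some(metrics);
        }
        config
    }
}
```

A caller that never invokes the setter gets a config with `kvbm_metrics: None`, so existing call sites keep working without metrics.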

lib/llm/src/block_manager/offload.rs (1)

77-78: LGTM! Proper kvbm_metrics plumbing.

The kvbm_metrics field is correctly added to both config and manager structs, with proper initialization and optional handling throughout.

Also applies to: 101-102, 129-129

ziqifan617 · 2025-10-09T22:35:05Z

/ok to test 134588e

coderabbitai

Actionable comments posted: 1

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 179f993 and 134588e.

📒 Files selected for processing (14)

deploy/metrics/grafana_dashboards/grafana-kvbm-dashboard.json (12 hunks)
docs/guides/run_kvbm_in_trtllm.md (5 hunks)
docs/guides/run_kvbm_in_vllm.md (4 hunks)
lib/bindings/python/rust/llm/block_manager.rs (3 hunks)
lib/bindings/python/rust/llm/block_manager/vllm/connector/leader.rs (1 hunks)
lib/bindings/python/rust/llm/block_manager/vllm/connector/leader/recorder.rs (1 hunks)
lib/bindings/python/rust/llm/block_manager/vllm/connector/leader/slot.rs (0 hunks)
lib/bindings/python/rust/llm/block_manager/vllm/connector/trtllm_leader.rs (1 hunks)
lib/llm/src/block_manager/block/transfer/strategy.rs (1 hunks)
lib/llm/src/block_manager/config.rs (1 hunks)
lib/llm/src/block_manager/metrics_kvbm.rs (5 hunks)
lib/llm/src/block_manager/offload.rs (4 hunks)
lib/llm/src/block_manager/state.rs (2 hunks)
lib/runtime/src/metrics/prometheus_names.rs (1 hunks)

💤 Files with no reviewable changes (1)

lib/bindings/python/rust/llm/block_manager/vllm/connector/leader/slot.rs

🧰 Additional context used

🧬 Code graph analysis (2)

lib/llm/src/block_manager/offload.rs (1)

lib/bindings/python/rust/llm/block_manager.rs (1)

kvbm_metrics (246-252)

lib/llm/src/block_manager/metrics_kvbm.rs (1)

lib/bindings/python/src/dynamo/_prometheus_metrics.pyi (1)

IntCounter (126-142)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)

GitHub Check: trtllm (arm64)
GitHub Check: vllm (arm64)
GitHub Check: trtllm (amd64)
GitHub Check: vllm (amd64)
GitHub Check: Build and Test - dynamo
GitHub Check: tests (launch/dynamo-run)
GitHub Check: clippy (.)
GitHub Check: tests (.)
GitHub Check: tests (lib/runtime/examples)
GitHub Check: clippy (lib/runtime/examples)
GitHub Check: tests (lib/bindings/python)
GitHub Check: clippy (launch/dynamo-run)

🔇 Additional comments (18)

lib/llm/src/block_manager/metrics_kvbm.rs (4)

23-24: LGTM! Metric naming and description are clear.

The new offload_blocks_h2d field accurately represents the host-to-disk offload metric with a clear description.

50-56: LGTM! Metric registration is correct.

The metric registration for offload_blocks_h2d is properly implemented with consistent naming and description.

76-84: LGTM! Early return path is complete.

The early return when create_endpoint is false correctly includes all metric fields, including the new offload_blocks_h2d.

130-137: LGTM! Final struct construction is consistent.

The final KvbmMetrics construction properly includes offload_blocks_h2d and excludes the removed request counters.

lib/llm/src/block_manager/block/transfer/strategy.rs (1)

93-98: LGTM! Transfer strategy change aligns with offload semantics.

Changing from NixlTransfer::Read to NixlTransfer::Write for DeviceStorage to DiskStorage is correct. When offloading from device to disk, the operation is initiated from the device (source) side, so using Write is semantically accurate.

Note: The test at line 223-226 correctly asserts this new behavior.

lib/bindings/python/rust/llm/block_manager/vllm/connector/trtllm_leader.rs (1)

103-109: LGTM! Metrics wiring is correct.

The KvbmMetrics is properly wired into the BlockManagerBuilder during initialization, enabling metrics collection for the KvConnectorLeader.

lib/bindings/python/rust/llm/block_manager/vllm/connector/leader.rs (1)

127-134: LGTM! Metrics wiring is consistent.

The KvbmMetrics wiring matches the pattern used in trtllm_leader.rs, ensuring consistent metrics collection across different connector implementations.

lib/bindings/python/rust/llm/block_manager/vllm/connector/leader/recorder.rs (1)

143-150: LGTM! Metrics wiring is properly implemented.

The recorder variant also correctly wires KvbmMetrics into the BlockManagerBuilder, maintaining consistency across all connector implementations.

lib/llm/src/block_manager/config.rs (1)

198-201: LGTM! Config field is well-designed.

The optional kvbm_metrics field allows per-block-manager metrics tracking with good backwards compatibility. The documentation clearly explains its purpose.

lib/llm/src/block_manager/state.rs (2)

149-156: LGTM! Metrics propagation for logical locality.

The kvbm_metrics is correctly cloned and propagated into the OffloadManagerConfig, enabling metrics collection for logical locality operations.

265-272: LGTM! Metrics propagation for local locality.

The metrics propagation pattern is consistent with the logical locality implementation, ensuring uniform metrics collection across different locality types.

docs/guides/run_kvbm_in_trtllm.md (1)

81-82: LGTM! Documentation updated with consistent model references.

All model references have been consistently updated from deepseek-ai/DeepSeek-R1-Distill-Llama-8B to Qwen/Qwen3-0.6B across the guide. The smaller model size is likely better suited for demonstration and testing purposes.

Also applies to: 87-87, 102-102, 116-117, 136-136, 158-158

lib/bindings/python/rust/llm/block_manager.rs (1)

219-219: LGTM! Clean builder pattern implementation.

The kvbm_metrics field integration follows the established builder pattern consistently. The optional field, builder method, and config propagation are all properly implemented.

Also applies to: 246-252, 312-316

lib/runtime/src/metrics/prometheus_names.rs (1)

326-327: LGTM! Metric naming follows conventions.

The new OFFLOAD_BLOCKS_H2D constant aligns with the shift to block-level metrics and follows the established naming pattern. The separation of H2D (host-to-disk) and D2H (device-to-host) metrics provides clearer observability of the offload pipeline.

lib/llm/src/block_manager/offload.rs (1)

77-78: LGTM! Metrics properly wired through config.

The kvbm_metrics field is correctly added to both OffloadManagerConfig and OffloadManager, and properly propagated during initialization. The optional field design allows metrics to be disabled when not needed.

Also applies to: 101-102, 129-129

deploy/metrics/grafana_dashboards/grafana-kvbm-dashboard.json (2)

231-231: LGTM! Dashboard metrics align with code changes.

The Grafana dashboard correctly updates metric expressions and panel titles to reflect the shift from request-based to block-based tracking:

kvbm_offload_blocks_d2h for Device to Host offloads

kvbm_offload_blocks_h2d for Host to Disk offloads

The panel titles clearly communicate the transfer direction, improving observability.

Also applies to: 240-240, 327-327, 336-336

546-546: Good: Explicit refresh rate improves consistency.

Setting the refresh interval to "5s" instead of "auto" ensures consistent dashboard behavior across different Grafana deployments and user preferences.

docs/guides/run_kvbm_in_vllm.md (1)

83-83: Model availability confirmed. Qwen/Qwen3-0.6B is publicly accessible on Hugging Face; no changes needed.

lib/llm/src/block_manager/offload.rs

ziqifan617 · 2025-10-09T23:32:57Z

/ok to test f7047cb

Signed-off-by: Ziqi Fan <[email protected]>

ziqifan617 · 2025-10-10T17:29:12Z

/ok to test b4858c2

Signed-off-by: Ziqi Fan <[email protected]>

ziqifan617 requested review from a team as code owners October 9, 2025 21:54

pull-request-size bot added the size/L label Oct 9, 2025

github-actions bot added the feat label Oct 9, 2025

ziqifan617 requested a review from richardhuo-nv October 9, 2025 21:54

richardhuo-nv approved these changes Oct 9, 2025

coderabbitai bot reviewed Oct 9, 2025

copy-pr-bot bot temporarily deployed to GITLAB October 9, 2025 22:35 Inactive

ziqifan617 requested a review from hhzhang16 October 9, 2025 22:35

copy-pr-bot bot temporarily deployed to GITLAB October 9, 2025 22:35 Inactive

coderabbitai bot reviewed Oct 9, 2025

ziqifan617 enabled auto-merge (squash) October 9, 2025 22:39

copy-pr-bot bot temporarily deployed to GITLAB October 9, 2025 23:33 Inactive

hhzhang16 approved these changes Oct 10, 2025

ziqifan617 requested review from a team as code owners October 10, 2025 17:04

pull-request-size bot added size/XXL and removed size/L labels Oct 10, 2025

ziqifan617 disabled auto-merge October 10, 2025 17:05

ziqifan617 added 4 commits October 10, 2025 10:22

feat: add KVBM host to disk metrics | clean up dashboard

40252cd

Signed-off-by: Ziqi Fan <[email protected]>

cargo fmt

d8decea

Signed-off-by: Ziqi Fan <[email protected]>

fix

0af70d3

Signed-off-by: Ziqi Fan <[email protected]>

feat: add KVBM host to disk metrics | clean up dashboard

b4858c2

Signed-off-by: Ziqi Fan <[email protected]>

ziqifan617 force-pushed the ziqif/kvbm-metrics-update branch from f77d85f to b4858c2 Compare October 10, 2025 17:27

pull-request-size bot added size/L and removed size/XXL labels Oct 10, 2025

copy-pr-bot bot temporarily deployed to GITLAB October 10, 2025 17:29 Inactive

ziqifan617 removed request for a team October 10, 2025 17:30

ziqifan617 enabled auto-merge (squash) October 10, 2025 17:31

ziqifan617 merged commit ca67409 into main Oct 10, 2025
30 of 31 checks passed

ziqifan617 deleted the ziqif/kvbm-metrics-update branch October 10, 2025 18:17

coderabbitai bot mentioned this pull request Oct 10, 2025

feat: rm the old KVBM metrics | update G2 to G3 metrics collection #3561

Merged

ziqifan617 added a commit that referenced this pull request Oct 20, 2025

feat: add KVBM host to disk metrics | clean up dashboard (#3534)

41278e1

Signed-off-by: Ziqi Fan <[email protected]>

nv-tusharma pushed a commit that referenced this pull request Oct 20, 2025

feat: add KVBM host to disk metrics | clean up dashboard (#3534)

878102a

Signed-off-by: Ziqi Fan <[email protected]>

AryanBagade mentioned this pull request Nov 11, 2025

feat: Add output token counter to frontend metrics #4202

Merged

4 participants
