Skip to content

Conversation

@jthomson04
Copy link
Contributor

@jthomson04 jthomson04 commented Aug 19, 2025

Summary by CodeRabbit

  • Documentation
    • Updated setup, build, and run instructions to use vLLM with KVBM enabled via flags.
    • Simplified configuration by removing the need to export a KVBM manager environment variable.
    • Revised serve command to include KV transfer configuration for clearer, end-to-end guidance.
    • Confirmed the example request remains unchanged.
    • Improved step-by-step guidance to ensure smoother execution for users.

Signed-off-by: jthomson04 <[email protected]>
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Aug 19, 2025

Walkthrough

Documentation updated to shift KV-BM usage from a dedicated kvbm framework to vLLM with KV transfer configuration. Build and run commands now target the vLLM framework with an enable flag. Environment variable-based KV manager enablement was removed. The serve command now includes an explicit kv-transfer-config using DynamoConnector.

Changes

Cohort / File(s) Summary
Docs: vLLM + KV-BM guide
docs/guides/run_kvbm_in_vllm.md
Updated build/run to use --framework vllm (with --enable-kvbm on build). Removed DYN_KVBM_MANAGER export. Modified vllm serve to include --kv-transfer-config specifying DynamoConnector, kv_both, and module path. Curl example unchanged.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor U as User
  participant C as Container
  participant V as vLLM Server
  participant D as DynamoConnector
  participant K as KV Backend

  U->>C: ./container/build.sh --framework vllm --enable-kvbm
  U->>C: ./container/run.sh --framework vllm -it --mount-workspace --use-nixl-gds
  U->>V: vllm serve --kv-transfer-config {kv_connector:DynamoConnector, kv_role:kv_both, ...} model
  Note right of V: Env var DYN_KVBM_MANAGER no longer used

  U->>V: HTTP request (inference)
  V->>D: Initialize KV transfer via connector
  D->>K: Read/Write KV as per kv_role=kv_both
  V-->>U: Inference response
Loading

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Possibly related PRs

Poem

In the vLLM fields I hop with glee,
Flags flipped neatly: kvbm set free.
No env charms, just configs clear,
Dynamo whispers, KV draws near.
Requests arrive, results take wing—
Thump-thump! says rabbit, “Serve the thing.” 🐇✨

Tip

🔌 Remote MCP (Model Context Protocol) integration is now available!

Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
docs/guides/run_kvbm_in_vllm.md (3)

20-20: Fix typos and improve clarity in the intro sentence.

Minor wording/grammar issues.

Apply this diff:

-This guide explains how to leverage KVBM (KV Block Manager) to mange KV cache and do KV offloading in vLLM.
+This guide explains how to leverage KVBM (KV Block Manager) to manage the KV cache and perform KV offloading in vLLM.

50-60: Fix typos in the example prompt (user-facing).

There are a couple of misspellings in the example JSON payload.

Apply this diff (only the content string is changed; JSON remains valid):

-        "content": "In the heart of Eldoria, an ancient land of boundless magic and mysterious creatures, lies the long-forgotten city of Aeloria. Once a beacon of knowledge and power, Aeloria was buried beneath the shifting sands of time, lost to the world for centuries. You are an intrepid explorer, known for your unparalleled curiosity and courage, who has stumbled upon an ancient map hinting at ests that Aeloria holds a secret so profound that it has the potential to reshape the very fabric of reality. Your journey will take you through treacherous deserts, enchanted forests, and across perilous mountain ranges. Your Task: Character Background: Develop a detailed background for your character. Describe their motivations for seeking out Aeloria, their skills and weaknesses, and any personal connections to the ancient city or its legends. Are they driven by a quest for knowledge, a search for lost familt clue is hidden."
+        "content": "In the heart of Eldoria, an ancient land of boundless magic and mysterious creatures, lies the long-forgotten city of Aeloria. Once a beacon of knowledge and power, Aeloria was buried beneath the shifting sands of time, lost to the world for centuries. You are an intrepid explorer, known for your unparalleled curiosity and courage, who has stumbled upon an ancient map suggesting that Aeloria holds a secret so profound that it has the potential to reshape the very fabric of reality. Your journey will take you through treacherous deserts, enchanted forests, and across perilous mountain ranges. Your task: Character Background: Develop a detailed background for your character. Describe their motivations for seeking out Aeloria, their skills and weaknesses, and any personal connections to the ancient city or its legends. Are they driven by a quest for knowledge, a search for lost family, or another hidden clue?"

28-31: Update etcd compose path and service name in documentation

The compose file has moved and now defines the service as etcd-server. Please update the snippet in docs/guides/run_kvbm_in_vllm.md accordingly:

• File: docs/guides/run_kvbm_in_vllm.md (lines 28–31)
Replace:

# start up etcd for KVBM leader/worker registration and discovery
docker compose -f deploy/metrics/docker-compose.yml up -d

With:

# start up etcd for KVBM leader/worker registration and discovery
docker compose -f deploy/docker-compose.yml up -d etcd-server
🧹 Nitpick comments (3)
docs/guides/run_kvbm_in_vllm.md (3)

32-37: Note on removed KVBM manager env: add a brief migration note.

Since DYN_KVBM_MANAGER=kvbm is no longer needed, consider adding a one-liner explaining that KVBM is now activated via vLLM’s --kv-transfer-config instead of an env var.

Example addition after Line 37:

+Note: Prior versions used `export DYN_KVBM_MANAGER=kvbm`. This is no longer required—KVBM is enabled via the `--kv-transfer-config` flag in `vllm serve`.

38-45: Clarify units and default behavior for KVBM environment variables

Verified that DYN_KVBM_CPU_CACHE_GB and DYN_KVBM_DISK_CACHE_GB are indeed read by the vLLM integration (compute_num_blocks in leader.rs). To improve clarity, add a note in the guide:

Location: docs/guides/run_kvbm_in_vllm.md, after line 45

 docs/guides/run_kvbm_in_vllm.md
@@ Line 45
+Note: Values are in gigabytes (GB). If unset, KVBM falls back to its built-in defaults. Adjust CPU/disk budgets based on your workload and node capacity.

47-47: Confirmed connector module path; optional explicit port

Verified that dynamo.llm.vllm_integration.connector exists and exports DynamoConnector. The JSON quoting is correct for bash. For clarity, you may optionally specify the port in the vllm serve command:

• Location: docs/guides/run_kvbm_in_vllm.md:47

Suggested diff:

-vllm serve --kv-transfer-config '{"kv_connector":"DynamoConnector","kv_role":"kv_both","kv_connector_module_path":"dynamo.llm.vllm_integration.connector"}' deepseek-ai/DeepSeek-R1-Distill-Llama-8B
+vllm serve --port 8000 --kv-transfer-config '{"kv_connector":"DynamoConnector","kv_role":"kv_both","kv_connector_module_path":"dynamo.llm.vllm_integration.connector"}' deepseek-ai/DeepSeek-R1-Distill-Llama-8B

Leave as-is if you prefer using the default port.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 86a4a58 and 83ede12.

📒 Files selected for processing (1)
  • docs/guides/run_kvbm_in_vllm.md (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build and Test - dynamo
🔇 Additional comments (1)
docs/guides/run_kvbm_in_vllm.md (1)

33-33: Confirmed: build.sh supports --framework vllm and --enable-kvbm

Both flags are explicitly handled in container/build.sh, so the documentation snippet is accurate:

  • container/build.sh defines FRAMEWORKS=(["VLLM"]=…) and uppercases input (lines 52, 305–311), and the help text lists “vllm” as a valid framework (line 370).
  • The --enable-kvbm flag is parsed in the case block (lines 278–280) and shown in usage output (line 386).

Docs can remain as-is.

@jthomson04 jthomson04 merged commit c0eaed4 into main Aug 19, 2025
13 of 15 checks passed
@jthomson04 jthomson04 deleted the jthomson04/fix-kvbm-guide branch August 19, 2025 23:36
hhzhang16 pushed a commit that referenced this pull request Aug 27, 2025
Signed-off-by: jthomson04 <[email protected]>
Signed-off-by: Hannah Zhang <[email protected]>
nv-anants pushed a commit that referenced this pull request Aug 28, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants