feat: disable kv events in vLLM when lora is enabled by biswapanda · Pull Request #4128 · ai-dynamo/dynamo

biswapanda · 2025-11-05T19:47:31Z

Overview:

Disable kv events in vLLM when lora is enabled.

There is a bug in vLLM's KV cache block storage code: it accesses `request.lora_request.id` instead of the correct `request.lora_request.adapter_id` property.

The bug is fixed in vllm-project/vllm#27728, but the fix has not been released yet.
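The failure mode can be illustrated with a minimal sketch. `FakeLoRARequest`, `buggy_lookup`, and `fixed_lookup` are hypothetical stand-ins, not vLLM code; the sketch only assumes, as described above, that the request object exposes `adapter_id` and has no `id` attribute.

```python
from dataclasses import dataclass


@dataclass
class FakeLoRARequest:
    """Hypothetical stand-in for a LoRA request; not vLLM's real class."""
    adapter_id: int


def buggy_lookup(lora_request: FakeLoRARequest) -> int:
    # Mirrors the bug described above: `.id` does not exist on the request.
    return lora_request.id  # type: ignore[attr-defined]


def fixed_lookup(lora_request: FakeLoRARequest) -> int:
    # Mirrors the described fix: read `adapter_id` instead.
    return lora_request.adapter_id


req = FakeLoRARequest(adapter_id=7)

try:
    buggy_lookup(req)
    buggy_raised = False
except AttributeError:
    # This is the kind of crash the workaround in this PR avoids.
    buggy_raised = True

print("buggy lookup raised AttributeError:", buggy_raised)
print("fixed lookup returned:", fixed_lookup(req))
```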

DEP-588

Details:

- Fixed KV events with LoRA: added an upstream bug workaround that disables KV events when LoRA is enabled, preventing crashes until vLLM PR #27728 is released.
- Improved KV publisher setup: added a null check to skip setup when `kv_events_config` is None.
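The two changes can be sketched roughly as follows. This is not the code from the PR: `FakeKVEventsConfig`, `FakeEngineArgs`, the placeholder endpoint, and the return values are assumptions made for illustration, and the sketch also assumes that no config is created when prefix caching is disabled. Only the function names and the None-as-sentinel behavior come from the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeKVEventsConfig:
    """Hypothetical stand-in for a KV events config object."""
    endpoint: str


@dataclass
class FakeEngineArgs:
    """Hypothetical subset of engine arguments, for illustration only."""
    enable_prefix_caching: bool = True
    enable_lora: bool = False
    kv_events_config: Optional[FakeKVEventsConfig] = None


def create_kv_events_config(args: FakeEngineArgs) -> Optional[FakeKVEventsConfig]:
    # Assumption: without prefix caching there is nothing to publish.
    if not args.enable_prefix_caching:
        return None
    # Workaround for the upstream bug: no KV events while LoRA is enabled.
    if args.enable_lora:
        return None
    return FakeKVEventsConfig(endpoint="tcp://*:20080")  # placeholder endpoint


def setup_kv_event_publisher(args: FakeEngineArgs) -> Optional[str]:
    # Early exit: a None config means publishing is disabled.
    if args.kv_events_config is None:
        return None
    return f"publisher bound to {args.kv_events_config.endpoint}"


lora_args = FakeEngineArgs(enable_lora=True)
lora_args.kv_events_config = create_kv_events_config(lora_args)

plain_args = FakeEngineArgs()
plain_args.kv_events_config = create_kv_events_config(plain_args)

print(setup_kv_event_publisher(lora_args))
print(setup_kv_event_publisher(plain_args))
```

Returning None from the config factory keeps the decision in one place, and the publisher setup only has to honor the sentinel.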

Where should the reviewer start?

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

closes GitHub issue: #xxx

Summary by CodeRabbit

Bug Fixes
- Improved KV events publishing configuration handling to prevent incompatible feature combinations.
- KV events publishing is now disabled when prefix caching and LoRA are enabled at the same time.
- Added proper handling for cases where KV events configuration is not explicitly provided.

coderabbitai · 2025-11-05T19:52:37Z

Walkthrough

Two files in the vLLM integration add guard conditions to KV events publishing: one prevents publishing when LoRA is enabled during prefix caching, and the other adds an early exit when no KV events config is provided.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| KV Events Configuration Guards: `components/src/dynamo/vllm/args.py`, `components/src/dynamo/vllm/main.py` | Added guard conditions to disable KV events publishing. The first returns None from `create_kv_events_config` when LoRA is enabled during prefix caching, with explanatory comments. The second adds an early exit in `setup_kv_event_publisher` when `kv_events_config` is None, bypassing publisher setup and associated logging. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

- Both changes are straightforward guard/early-exit patterns with localized scope.
- Check that the LoRA + prefix-caching condition is correct and that the None propagation doesn't mask legitimate configuration paths.
- Verify that the early exit in `setup_kv_event_publisher` doesn't skip necessary initialization steps.
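One way to check the condition is to enumerate every flag combination. The predicate `kv_events_enabled` below is a hypothetical model of the guard, not the PR's code, and it assumes KV events are only relevant when prefix caching is on.

```python
from itertools import product


def kv_events_enabled(enable_prefix_caching: bool, enable_lora: bool) -> bool:
    """Hypothetical predicate modelling the guard under review."""
    return enable_prefix_caching and not enable_lora


# Build the full truth table over both flags.
table = {
    (prefix, lora): kv_events_enabled(prefix, lora)
    for prefix, lora in product([False, True], repeat=2)
}

for (prefix, lora), enabled in table.items():
    print(f"prefix_caching={prefix!s:5} lora={lora!s:5} -> kv_events={enabled}")
```

Under this model, KV events stay on in exactly one case: prefix caching enabled and LoRA disabled.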

Poem

🐰 A hop, skip, and guard we place,
When LORA meets cache's face—
We say "not now!" with early return,
Let KV publishers wait their turn,
With None as sentinel, we skip with grace! 🎯

Pre-merge checks

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title directly matches the main change: adding logic to disable KV events when LoRA is enabled in vLLM, as evidenced by the new guard in `create_kv_events_config` that returns None when LoRA is enabled. |
| Description check | ✅ Passed | The PR description covers the overview, details, and related issues sections but lacks specific guidance on where reviewers should start examining the code. |

coderabbitai

Actionable comments posted: 1

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4765d88 and 5c8a8e0.

📒 Files selected for processing (2)

components/src/dynamo/vllm/args.py (1 hunks)
components/src/dynamo/vllm/main.py (1 hunks)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)

GitHub Check: trtllm (amd64)
GitHub Check: trtllm (arm64)
GitHub Check: sglang (arm64)
GitHub Check: vllm (amd64)
GitHub Check: sglang (amd64)
GitHub Check: vllm (arm64)
GitHub Check: operator (arm64)
GitHub Check: operator (amd64)
GitHub Check: Build and Test - dynamo

🔇 Additional comments (2)

components/src/dynamo/vllm/main.py (1)
153-154: Early exit is correct but insufficient on its own.

The None check correctly prevents accessing .endpoint on a None value at line 170. However, this check alone doesn't protect against the case where a user provides --kv-events-config while LoRA is enabled, since that config would not be None.

As noted in the args.py review, consider adding an explicit LoRA check here as well:
     if config.is_decode_worker:
         logger.info("Skipping KV event publisher setup for decode worker")
         return None
 
+    # Skip KV events when LoRA is enabled due to upstream bug
+    if config.engine_args.enable_lora:
+        logger.info("Skipping KV event publisher setup due to LoRA being enabled")
+        return None
+
     if config.engine_args.kv_events_config is None:
         return None

This provides defense-in-depth and makes the intent clearer at the usage site.

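The gap this suggestion targets can be sketched as below. `FakeEngineArgs`, `none_check_only`, and `with_lora_check` are hypothetical names, and the dict-based config and endpoint are placeholders. The sketch only models the scenario named in the comment, where a user supplies a KV events config while LoRA is enabled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeEngineArgs:
    """Hypothetical subset of engine arguments, for illustration only."""
    enable_lora: bool
    kv_events_config: Optional[dict]


def none_check_only(args: FakeEngineArgs) -> bool:
    """Return True if the publisher would be set up with only the None check."""
    return args.kv_events_config is not None


def with_lora_check(args: FakeEngineArgs) -> bool:
    """Return True if the publisher would be set up with the extra LoRA guard."""
    if args.enable_lora:
        return False
    return args.kv_events_config is not None


# A user-supplied config is not None, so the None check alone lets it through.
user_supplied = FakeEngineArgs(
    enable_lora=True,
    kv_events_config={"endpoint": "tcp://*:20080"},  # placeholder value
)

print("None check only  ->", none_check_only(user_supplied))
print("with LoRA check  ->", with_lora_check(user_supplied))
```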
components/src/dynamo/vllm/args.py (1)

387-391: The review comment conflates unrelated code paths. The LoRA workaround only affects kv_events_config created by create_kv_events_config, not external configs.

The consolidator_config.py and multimodal worker.py files use kv_events_config from independent sources, not from create_kv_events_config. The only code consuming the result of create_kv_events_config is components/src/dynamo/vllm/main.py (line 170), which properly guards access with an explicit None check (line 153-154).

The LoRA workaround (lines 387-391) correctly returns None only for the config it creates, preventing unsafe access in the calling code. Pre-existing unguarded accesses in other modules are not introduced by or related to these changes.

Likely an incorrect or invalid review comment.

components/src/dynamo/vllm/args.py

Signed-off-by: Daiyaan <darfeen@nvidia.com>

biswapanda self-assigned this Nov 5, 2025

biswapanda requested review from a team as code owners November 5, 2025 19:47

pull-request-size bot added the size/XS label Nov 5, 2025

biswapanda changed the title from "[feat] disable kv events in vLLM when lora is enabled" to "feat: disable kv events in vLLM when lora is enabled" Nov 5, 2025

github-actions bot added the feat label Nov 5, 2025

biswapanda requested a review from alec-flowers November 5, 2025 19:51

coderabbitai bot reviewed Nov 5, 2025

pull-request-size bot added size/S and removed size/XS labels Nov 5, 2025

biswapanda enabled auto-merge (squash) November 6, 2025 01:03

mohammedabdulwahhab approved these changes Nov 6, 2025

biswapanda force-pushed the bis/lora-vllm-1 branch from eb709a1 to f68bc7e Compare November 10, 2025 14:37

alec-flowers reviewed Nov 10, 2025

biswapanda added 6 commits November 10, 2025 09:36

disable kv events in vLLM when lora is enabled

356075f

update

4fc59a1

update

499aaa7

update message based on comment

7ced1b4

update message based on comment

f0216cd

update message based on comment

f8f8605

biswapanda force-pushed the bis/lora-vllm-1 branch from d6e8e70 to f8f8605 Compare November 10, 2025 17:36

alec-flowers approved these changes Nov 10, 2025

biswapanda merged commit 7802f96 into main Nov 10, 2025
29 of 39 checks passed

biswapanda deleted the bis/lora-vllm-1 branch November 10, 2025 20:01

daiyaanarfeen pushed a commit that referenced this pull request Nov 14, 2025

feat: disable kv events in vLLM when lora is enabled (#4128)

f4b6cd3

Signed-off-by: Daiyaan <darfeen@nvidia.com>

biswapanda mentioned this pull request Dec 8, 2025

feat: KV aware LoRA request routing for vllm #4810

Merged

biswapanda merged 6 commits into main from bis/lora-vllm-1
