[Fix] Enable mm_processor_cache with vision LoRA#31927

Merged
DarkLight1337 merged 1 commit into vllm-project:main from prashanth058:lora-mm-processor-cache-fix
Jan 8, 2026

Conversation

Contributor

@prashanth058 prashanth058 commented Jan 7, 2026

Purpose

Enables the multi-modal processor cache when enable_tower_connector_lora is active. #26674 prefixes the identifier with the LoRA hash to avoid incorrect encoder cache hits (#26674 (comment)), but the processor cache should be shared across LoRAs.

Solution:

  • Add mm_hash field to MultiModalFeatureSpec to store the base hash (without LoRA prefix)
  • Processor cache uses mm_hash for cache lookups (shared across LoRAs)
  • Encoder cache continues using identifier (LoRA-prefixed, separate per LoRA)

@mergify mergify bot added multi-modality Related to multi-modality (#4194) v1 labels Jan 7, 2026
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request enables the multi-modal processor cache for vision LoRA, which was previously unsupported. The changes introduce a mm_hash field to MultiModalFeatureSpec to allow sharing the processor cache across different LoRAs by using a base hash for the multi-modal data, independent of the LoRA. The encoder cache remains LoRA-specific by using a prefixed identifier. Additionally, the LoRA prefix for the identifier is updated for robustness.

The overall approach is sound and the new test case correctly validates the cache sharing logic. However, I've identified a critical issue with the new method of generating the LoRA prefix for cache identifiers. The use of Python's built-in hash() function is not deterministic across processes, which can lead to cache inconsistencies in a distributed environment. I've provided a suggestion to use a deterministic encoding method instead.
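The reviewer's point about hash() can be sketched as follows. This is not the PR's code; the function names are hypothetical, and the sha256 variant simply illustrates the kind of deterministic encoding the review suggests:

```python
import hashlib


def lora_prefix_nondeterministic(lora_name: str) -> str:
    # Python's built-in hash() on strings is salted per process
    # (PYTHONHASHSEED), so two workers in a distributed setup can
    # compute different prefixes for the same LoRA name.
    return f"{hash(lora_name):x}"


def lora_prefix_deterministic(lora_name: str) -> str:
    # A hashlib digest depends only on the input bytes, so every
    # process derives the same cache-key prefix for a given LoRA.
    return hashlib.sha256(lora_name.encode("utf-8")).hexdigest()[:16]


# Deterministic across processes and runs; safe as a shared cache-key prefix.
prefix = lora_prefix_deterministic("vision-lora")
assert prefix == lora_prefix_deterministic("vision-lora")
assert len(prefix) == 16
```

With the salted hash(), a cache entry written by one worker may be invisible to another worker that computed a different prefix, which is exactly the inconsistency the review flags.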

Signed-off-by: prashanth058 <prashanth.dannamaneni@uipath.com>
@prashanth058 prashanth058 force-pushed the lora-mm-processor-cache-fix branch from 7afd479 to ab8710d Compare January 7, 2026 22:29
Member

@Isotr0py Isotr0py left a comment


LGTM

@Isotr0py Isotr0py added the ready ONLY add when PR is ready to merge/full CI is needed label Jan 8, 2026
@DarkLight1337 DarkLight1337 merged commit d3235cb into vllm-project:main on Jan 8, 2026
52 checks passed
yugong333 pushed a commit to yugong333/vllm that referenced this pull request Jan 9, 2026
Signed-off-by: prashanth058 <prashanth.dannamaneni@uipath.com>
akh64bit pushed a commit to akh64bit/vllm that referenced this pull request Jan 16, 2026
Signed-off-by: prashanth058 <prashanth.dannamaneni@uipath.com>
dsuhinin pushed a commit to dsuhinin/vllm that referenced this pull request Jan 21, 2026
Signed-off-by: prashanth058 <prashanth.dannamaneni@uipath.com>
Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>
ItzDEXX pushed a commit to ItzDEXX/vllm that referenced this pull request Feb 19, 2026
Signed-off-by: prashanth058 <prashanth.dannamaneni@uipath.com>

Labels

multi-modality Related to multi-modality (#4194) ready ONLY add when PR is ready to merge/full CI is needed v1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants