[Multimodal] Simplify ViT CUDA graph interfaces by Isotr0py · Pull Request #41234 · vllm-project/vllm

Isotr0py · 2026-04-29T13:30:00Z

Purpose

Supporting ViT CUDA graphs currently requires implementing about 11 new class methods in each model implementation, which makes the model code messy.
This PR consolidates get_encoder_cudagraph_num_items, get_encoder_cudagraph_per_item_output_tokens and get_encoder_cudagraph_per_item_input_sizes into a single get_encoder_cudagraph_item_specs function.
It also consolidates encoder_cudagraph_forward and encoder_eager_forward into a single encoder_forward function.
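
As a rough illustration of the consolidation (not the PR's actual code), the sketch below shows how one spec-returning method can stand in for the three removed getters. The fields of EncoderItemSpec, the ToyVisionModel class, the "image_grid_thw" key, and the merge factor are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class EncoderItemSpec:
    # Hypothetical fields; the real EncoderItemSpec in the PR may differ.
    input_size: int      # e.g. patch count for one image
    output_tokens: int   # tokens the encoder emits for that image


class ToyVisionModel:
    """Stand-in model for illustration; not a vLLM class."""

    spatial_merge_size = 2  # assumed merge factor

    def get_encoder_cudagraph_item_specs(
        self,
        mm_kwargs: dict[str, Any],
    ) -> list[EncoderItemSpec]:
        specs = []
        # "image_grid_thw" is an assumed key holding (t, h, w) per image.
        for t, h, w in mm_kwargs["image_grid_thw"]:
            patches = t * h * w
            specs.append(
                EncoderItemSpec(
                    input_size=patches,
                    output_tokens=patches // self.spatial_merge_size**2,
                )
            )
        return specs


specs = ToyVisionModel().get_encoder_cudagraph_item_specs(
    {"image_grid_thw": [(1, 4, 4), (1, 8, 6)]}
)
# Everything the three old getters returned can be derived from one list.
num_items = len(specs)
per_item_output_tokens = [s.output_tokens for s in specs]
per_item_input_sizes = [s.input_size for s in specs]
```

With this shape, a caller only needs one model hook and derives the item count and the per-item sizes itself.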

Test Plan

pytest -s -v tests/v1/cudagraph/test_encoder_cudagraph.py

pytest -s -v tests/models/multimodal/generation/test_vit_cudagraph.py

Test Result

All tests should pass.

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan, such as providing test command.
The test results, such as pasting the results comparison before and after, or e2e results
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.

Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>

gemini-code-assist

Code Review

This pull request refactors the encoder CUDA graph interface to simplify model implementations. It replaces several specific methods, such as get_input_modality and get_max_frames_per_video, with a unified get_encoder_cudagraph_item_specs method and a consolidated encoder_forward method. The EncoderCudaGraphManager now auto-detects input keys based on configuration. Feedback suggests using ValueError instead of AssertionError for unreachable code in qwen3_vl.py and improving the specificity of error messages in the CUDA graph manager to aid debugging.


claude

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

Isotr0py · 2026-05-06T09:03:43Z

cc @shen-shanshan @b-mu about ViT CUDA graph cleanup.


shen-shanshan · 2026-05-07T02:34:42Z

        # actual inputs may be smaller. Zero then slice-copy so padded
        # positions are invisible to attention (cu_seqlens masks them out).
-        input_key = self.config.input_key_by_modality[
+        input_key = input_key = self.config.input_key_by_modality[


Was this input_key = input_key = ... introduced by mistake?

Ooops, good catch!
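
The "zero then slice-copy" step described in the code comment above can be sketched as follows. This is a minimal illustration that uses plain Python lists in place of GPU tensors; fill_static_buffer is a hypothetical helper and is not part of the PR.

```python
def fill_static_buffer(buffer: list[float], actual: list[float]) -> int:
    """Zero a fixed-size buffer, then copy the real input into its prefix."""
    if len(actual) > len(buffer):
        raise ValueError(
            f"input of length {len(actual)} exceeds buffer capacity {len(buffer)}"
        )
    # Step 1: zero everything so stale values from a previous replay are gone.
    for i in range(len(buffer)):
        buffer[i] = 0.0
    # Step 2: slice-copy in place. The buffer object is never replaced,
    # mirroring how a captured CUDA graph keeps reading the same memory.
    buffer[: len(actual)] = actual
    return len(actual)


static_buf = [9.0] * 6  # capacity 6, holding stale data
valid = fill_static_buffer(static_buf, [1.0, 2.0, 3.0])
# Cumulative sequence lengths: only positions [0, valid) count as real tokens.
cu_seqlens = [0, valid]
```

The padded tail stays zeroed, and the cumulative lengths tell attention where the real input ends.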

shen-shanshan · 2026-05-07T02:39:00Z

+    def get_encoder_cudagraph_item_specs(
        self,
        mm_kwargs: dict[str, Any],
-    ) -> int:
-        """Return the number of items (e.g. images) in the batch."""
-        ...
-
-    def get_encoder_cudagraph_per_item_output_tokens(
-        self,
-        mm_kwargs: dict[str, Any],
-    ) -> list[int]:
-        """Return output token count for each item.
-
-        Used for greedy packing and DP load balancing.
-        """
-        ...
-
-    def get_encoder_cudagraph_per_item_input_sizes(
-        self,
-        mm_kwargs: dict[str, Any],
-    ) -> list[int]:
-        """Return input size (e.g. patch count) for each item.
+    ) -> list["EncoderItemSpec"]:


Since #40830 has been merged, maybe we should also adapt Qwen2.5-VL to these new interfaces.


draft

696f7df

Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>

mergify Bot added the qwen (Related to Qwen models), nvidia, and v1 labels Apr 29, 2026

github-project-automation Bot added this to NVIDIA Apr 29, 2026

gemini-code-assist Bot reviewed Apr 29, 2026


Comment thread vllm/model_executor/models/qwen3_vl.py

Comment thread vllm/v1/worker/encoder_cudagraph.py Outdated

Isotr0py added 6 commits April 29, 2026 22:11

revert

01966b0

Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>

revert

10c3627

Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>

revert and clean

6dbd8e0

Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>

revert and clean

72ee3ff

Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>

clean

545dac2

Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>

fix tests

a74c7a0

Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>

Isotr0py marked this pull request as ready for review May 6, 2026 09:01

Isotr0py requested review from njhill, sighingnow and vadiklyutiy as code owners May 6, 2026 09:01

claude Bot reviewed May 6, 2026


Merge branch 'main' into refactor-vit-cg

e7c0f62

remove unnecessary create_new_process_for_each_test

48b3e75

Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>

Isotr0py requested review from DarkLight1337 and ywang96 as code owners May 6, 2026 09:12

mergify Bot added the multi-modality (Related to multi-modality, #4194) label May 6, 2026

shen-shanshan mentioned this pull request May 7, 2026

[RFC]: Support ViT Full CUDA Graph (Tracker) #38175

Open

20 tasks

shen-shanshan reviewed May 7, 2026


Isotr0py added 2 commits May 7, 2026 10:42

oops

963fd2b

Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>

update qwen2.5-vl

3b8648f

Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>

Isotr0py wants to merge 11 commits into vllm-project:main from Isotr0py:refactor-vit-cg


2 participants
