[MM][CG] Support ViT CG for Qwen2-VL #41736
johncalesp wants to merge 5 commits into vllm-project:main
Signed-off-by: John Calderon <jcalderon@nvidia.com>
|
Documentation preview: https://vllm--41736.org.readthedocs.build/en/41736/ |
|
Hi @johncalesp, the pre-commit checks have failed. Please run:
uv pip install "pre-commit>=4.5.1"
pre-commit install
pre-commit run --all-files
Then, commit the changes and push to your branch.
|
Code Review
This pull request enables CUDA Graph support for the Qwen2-VL model by implementing the SupportsEncoderCudaGraph protocol and adding the necessary metadata preparation logic. It also updates the documentation and includes a new test configuration for the model. A potential IndexError was identified in the prepare_encoder_metadata method when handling empty inputs in multi-GPU environments, which can be resolved by ensuring the input array is correctly reshaped.
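The empty-input failure mentioned in the review can be illustrated with a minimal NumPy sketch. It is not the code from this PR: the helper names `to_grid_array` and `patches_per_item` are hypothetical, and it assumes the metadata step receives a list of (t, h, w) grid triples that may be empty on a rank that was assigned no images.

```python
import numpy as np


def to_grid_array(grid_thw_list):
    """Hypothetical helper: turn a list of (t, h, w) triples into an (N, 3) array."""
    # np.array([]) has shape (0,), so column indexing such as arr[:, 0]
    # raises IndexError. reshape(-1, 3) yields shape (0, 3) for an empty
    # list and (N, 3) otherwise, so later indexing is always valid.
    return np.array(grid_thw_list, dtype=np.int64).reshape(-1, 3)


def patches_per_item(grid_thw_list):
    """Hypothetical helper: number of patches (t * h * w) for each item."""
    grid = to_grid_array(grid_thw_list)
    return grid[:, 0] * grid[:, 1] * grid[:, 2]


if __name__ == "__main__":
    print(patches_per_item([[1, 4, 6], [2, 2, 2]]))
    print(patches_per_item([]).shape)
```

Without the reshape, the empty case would fail at the first column index, which matches the kind of IndexError the review describes.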
|
@b-mu, could you review this PR when you get a chance? Thanks! |
|
LGTM. |
|
LGTM |
|
@shen-shanshan can we set the ready tag to run the CI? |
I don't have the authority to add labels... |
Purpose
Enable CUDA Graph for the ViT in Qwen2-VL, following the precedent from #35963.
Test Plan
Added an entry to the file tests/models/multimodal/generation/test_vit_cudagraph.py
Test Result
E2E
Test on H100
Engine command
Benchmark command
Result no CG:
Result CG: