[BugFix] Fix 3D rope in transformers backend by zucchini-nlp · Pull Request #35097 · vllm-project/vllm

zucchini-nlp · 2026-02-23T12:01:13Z

Once huggingface/transformers#43972 is merged, the Qwen-VL model family will require mm_token_type_ids to prepare 3D position ids. This PR makes sure that the transformers backend keeps functioning and is both forward and backward compatible. Tested with tests/models/multimodal/generation/test_common.py::test_single_image_models[qwen2_5_vl-transformers-test_case53] to confirm that the args are passed correctly and the rope index can be computed.

It also fixes the glmv model with video input to be consistent with transformers: video timestamps are usually kept as floats to preserve fine-grained information about each frame. This will fix the currently failing Glm-OCR processing test in vLLM.

Signed-off-by: raushan <raushan@huggingface.co>

gemini-code-assist

Code Review

The pull request addresses a bug in the GLM-OCR processing test in vLLM by ensuring that video timestamps are kept as floats, consistent with transformers. It also makes changes to support mm_token_type_ids for 3D position IDs in the Qwen-VL model family, ensuring forward/backward compatibility. The changes primarily involve modifying how video timestamps are handled and updating multimodal processing logic to incorporate mm_token_type_ids.

hmellor · 2026-02-25T08:07:38Z

vllm/model_executor/models/glm4_1v.py

cc @Isotr0py for this change

hmellor · 2026-02-25T08:10:35Z

vllm/model_executor/models/transformers/multimodal.py

@@ -472,10 +468,16 @@ def get_mrope_input_positions(
            video_grid_thw
        )

+        # In v4 this utility didn't accept any `kwargs`, thus we filter


I don't understand this comment.

Will mm_token_type_ids only exist in v5 and we keep kwargs empty otherwise because get_rope_index would error in v4 if we explicitly passed the None value?
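For readers following the question above, here is a minimal sketch of one way to pass an optional keyword argument only when the callee can accept it. It is not the code from this PR: `filter_supported_kwargs`, `get_rope_index_old`, and `get_rope_index_new` are hypothetical names, and the two stand-in functions only imitate a utility that gained an `mm_token_type_ids` parameter between library versions.

```python
import inspect


def filter_supported_kwargs(func, **candidates):
    """Keep only the non-None candidates that `func` accepts by name."""
    params = inspect.signature(func).parameters
    accepts_var_kwargs = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    return {
        name: value
        for name, value in candidates.items()
        if value is not None and (accepts_var_kwargs or name in params)
    }


# Hypothetical stand-in for an older utility without the new parameter.
def get_rope_index_old(input_ids, image_grid_thw=None, video_grid_thw=None):
    return "old"


# Hypothetical stand-in for a newer utility that accepts the new parameter.
def get_rope_index_new(
    input_ids, image_grid_thw=None, video_grid_thw=None, mm_token_type_ids=None
):
    return "new" if mm_token_type_ids is not None else "new-without-types"


token_types = [0, 1, 1, 0]
old_kwargs = filter_supported_kwargs(get_rope_index_old, mm_token_type_ids=token_types)
new_kwargs = filter_supported_kwargs(get_rope_index_new, mm_token_type_ids=token_types)

# The old signature never sees the unknown keyword, so the call cannot fail on it.
old_result = get_rope_index_old([1, 2, 3, 4], **old_kwargs)
new_result = get_rope_index_new([1, 2, 3, 4], **new_kwargs)
```

Inspecting the signature avoids hard-coding a version check, at the cost of relying on the parameter name staying stable.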

hmellor · 2026-02-25T08:40:06Z

https://buildkite.com/vllm/ci/builds/52883/steps/canvas?sid=019c8f74-ce1c-49ab-b0d0-31e37cfc8519&tab=output seems relevant

Isotr0py · 2026-02-25T08:52:49Z

vllm/model_executor/models/glm4_1v.py

@@ -1011,7 +990,7 @@ def _get_video_second_idx_glm4v(
            uniq.append(uniq[-1])
        frame_indices = uniq

-        full_second_idxs = [int(idx / video_fps) for idx in frame_indices]
+        full_second_idxs = [idx / video_fps for idx in frame_indices]


In transformers we use the same timestamp format for all GLM models, so I am relying on it. Do you want to check in with the GLM authors? I can ask in Slack.

cc @zRzRzRzRzRzRzR
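To illustrate what the one-line change in the hunk above does, here is a small sketch comparing the truncated and fractional forms of the per-frame second index. The helper `second_indices` and the sample frame indices and frame rate are assumptions chosen for illustration; only the two list comprehensions mirror the diff.

```python
def second_indices(frame_indices, video_fps, keep_fraction=True):
    """Convert frame indices to timestamps in seconds."""
    if keep_fraction:
        # New behaviour in the diff: keep the fractional part.
        return [idx / video_fps for idx in frame_indices]
    # Old behaviour in the diff: truncate to whole seconds.
    return [int(idx / video_fps) for idx in frame_indices]


frames = [0, 12, 24, 36]  # assumed sample frame indices
fps = 24.0                # assumed frame rate

truncated = second_indices(frames, fps, keep_fraction=False)
fractional = second_indices(frames, fps)

print(truncated)   # [0, 0, 1, 1]
print(fractional)  # [0.0, 0.5, 1.0, 1.5]
```

With truncation, frames 0 and 12 collapse onto the same second, which is the loss of per-frame detail that the PR description refers to.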

Signed-off-by: raushan <raushan@huggingface.co> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Signed-off-by: Sergey Zinchenko <sergey.zinchenko.rnd@gmail.com>

Signed-off-by: raushan <raushan@huggingface.co> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Signed-off-by: EanWang211123 <wangyiheng@sangfor.com.cn>

AndreasKaratzas · 2026-03-02T06:30:12Z

This PR introduced a regression for this test:

pytest -s -v tests/models/multimodal/generation/test_common.py::test_single_image_models[qwen2_5_vl-transformers-test_case53]

The test is part of Multi-Modal Models (Extended) 2. I have already put up a fix for that (#35711).

hmellor · 2026-03-02T13:26:43Z

Thanks for letting us know, I'll look into your fix

Signed-off-by: raushan <raushan@huggingface.co> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

zucchini-nlp added 2 commits February 23, 2026 12:56

mrope-index

dce95b8

Signed-off-by: raushan <raushan@huggingface.co>

timestampls

e75a62b

Signed-off-by: raushan <raushan@huggingface.co>

zucchini-nlp requested a review from hmellor as a code owner February 23, 2026 12:01

github-project-automation bot added this to Transformers backend Feb 23, 2026

github-project-automation bot moved this to Todo in Transformers backend Feb 23, 2026

mergify bot added the bug Something isn't working label Feb 23, 2026

gemini-code-assist bot reviewed Feb 23, 2026

zucchini-nlp added 2 commits February 24, 2026 11:49

fix

3836c06

Signed-off-by: raushan <raushan@huggingface.co>

bc for v4!

c852324

Signed-off-by: raushan <raushan@huggingface.co>

hmellor added the ready ONLY add when PR is ready to merge/full CI is needed label Feb 24, 2026

hmellor mentioned this pull request Feb 24, 2026

Update to transformers v5 #30566

Open

hmellor reviewed Feb 25, 2026

hmellor reviewed Feb 25, 2026

Isotr0py reviewed Feb 25, 2026

zucchini-nlp and others added 5 commits February 25, 2026 10:21

update comment

cc0d738

Signed-off-by: raushan <raushan@huggingface.co>

Merge branch 'main' into qwen2-vl

c6f5eae

revert ig?

4d1efbf

Signed-off-by: raushan <raushan@huggingface.co>

Merge branch 'main' into qwen2-vl

e7596be

revert this time

108074e

Signed-off-by: raushan <raushan@huggingface.co>

hmellor enabled auto-merge (squash) February 27, 2026 15:41

Isotr0py approved these changes Feb 27, 2026

hmellor merged commit fd6de37 into vllm-project:main Feb 27, 2026
59 checks passed

github-project-automation bot moved this from Todo to Done in Transformers backend Feb 27, 2026

AndreasKaratzas mentioned this pull request Mar 2, 2026

[Bugfix] Guard mm_token_type_ids kwarg in get_mrope_input_positions #35711

Merged

hmellor merged 9 commits into vllm-project:main from zucchini-nlp:qwen2-vl
