Skip to content

Conversation

@DarkLight1337
Copy link
Member

@DarkLight1337 DarkLight1337 commented Nov 10, 2025

Purpose

Allow each model to extract their own arguments from mm_features to avoid bloating the argument list. Also this enables us to use mm_positions (PlaceholderRange) to build the M-RoPE positions instead of having to search through input_ids again (to be done in future PRs).

I've also removed the hf_config argument as it's accessible from the model instance already.

Test Plan

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: DarkLight1337 <[email protected]>
Signed-off-by: DarkLight1337 <[email protected]>
@DarkLight1337 DarkLight1337 marked this pull request as ready for review November 10, 2025 16:41
@DarkLight1337 DarkLight1337 moved this from Blocked to In Progress in Multi-modality Core Nov 10, 2025
Comment on lines +892 to +896
model = self.get_model()
assert supports_mrope(model), "M-RoPE support is not implemented."

req_state.mrope_positions, req_state.mrope_position_delta = (
self.model.get_mrope_input_positions(
model.get_mrope_input_positions(
Copy link
Contributor

@lgeiger lgeiger Nov 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just out of interest, is there a reason for not keeping self.model here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It helps type checker to infer the type of get_mrope_input_positions which enables autocompletion as well.

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@DarkLight1337
Copy link
Member Author

/gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the get_mrope_input_positions method across multiple models to accept mm_features directly. This is a great change that simplifies the method signature and centralizes argument extraction, improving code clarity and maintainability. The changes are applied consistently, and the removal of the hf_config argument is also a good cleanup.

However, I've identified a couple of issues with the implementation. There's a recurring logic error in ernie45_vl.py, glm4_1v.py, and glm4v.py where image_grid_thw and video_grid_thw are not correctly converted from tensors to lists. Additionally, there's a critical bug in qwen3_omni_moe_thinker.py that incorrectly handles the second_per_grid_ts parameter, potentially leading to incorrect M-RoPE position calculations. I've provided suggestions to fix these issues.

Signed-off-by: DarkLight1337 <[email protected]>
Signed-off-by: DarkLight1337 <[email protected]>
@DarkLight1337
Copy link
Member Author

/gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the get_mrope_input_positions method across multiple models to accept a single mm_features argument. This is a positive change that simplifies the method signatures and centralizes feature extraction logic, improving code maintainability and extensibility. The removal of the redundant hf_config argument is also a good cleanup. My review includes one high-severity comment to restore an explicit error check that was removed, which could prevent silent failures.

Signed-off-by: DarkLight1337 <[email protected]>
@DarkLight1337
Copy link
Member Author

/gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the get_mrope_input_positions method across multiple models to accept mm_features directly, simplifying the method signature and centralizing argument extraction. This is a good improvement for code clarity and maintainability. However, I've found a critical issue in several models where the data type of image_grid_thw and video_grid_thw has changed from a list of lists to a list of tensors, which will cause runtime errors. I've provided suggestions to fix this by converting the tensors to lists.

Signed-off-by: DarkLight1337 <[email protected]>
@DarkLight1337
Copy link
Member Author

/gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the get_mrope_input_positions method across multiple models to simplify its signature. Instead of passing numerous individual arguments related to multi-modal features, it now accepts a single mm_features list. Each model is now responsible for extracting the necessary information from this list, which is a cleaner design. A new helper method MultiModalFeatureSpec.gather_kwargs has been introduced to centralize the logic for extracting these features. Additionally, the redundant hf_config argument has been removed, as it's already accessible via self.config in the model instances. The changes are consistently applied and improve code clarity and maintainability. I have reviewed the changes and found no issues.

Signed-off-by: DarkLight1337 <[email protected]>
Signed-off-by: DarkLight1337 <[email protected]>
Signed-off-by: DarkLight1337 <[email protected]>
@DarkLight1337
Copy link
Member Author

/gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the get_mrope_input_positions method across multiple multimodal models. The change simplifies the method signature by passing a list of MultiModalFeatureSpec objects instead of a long list of individual arguments. The models are updated to extract the necessary information from this new mm_features parameter. This is a good refactoring that improves code clarity and maintainability by moving data extraction logic closer to where it's used. The implementation is consistent and correct across all affected models. I have no major concerns.

@DarkLight1337
Copy link
Member Author

Should be good now

@DarkLight1337 DarkLight1337 added the ready ONLY add when PR is ready to merge/full CI is needed label Nov 10, 2025
@DarkLight1337
Copy link
Member Author

Unblocking extended MM tests just to be sure

Signed-off-by: DarkLight1337 <[email protected]>
Copy link
Member

@Isotr0py Isotr0py left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@DarkLight1337 DarkLight1337 merged commit afffd3c into vllm-project:main Nov 11, 2025
55 checks passed
@DarkLight1337 DarkLight1337 deleted the mrope-mm-features branch November 11, 2025 13:14
@github-project-automation github-project-automation bot moved this from In Progress to Done in Multi-modality Core Nov 11, 2025
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Nov 13, 2025
wangxiyuan added a commit to vllm-project/vllm-ascend that referenced this pull request Nov 26, 2025
Bump vLLM version to v0.11.2

What's broken and changed by vLLM:
1. structured_output is broken by
vllm-project/vllm#26866
2. get_mrope_input_positions is broken by
vllm-project/vllm#28399
3. graph mode is broken by
vllm-project/vllm#25110 we'll upgrade torch to
2.8 to fix the problem later
4. embedding is broken by
vllm-project/vllm#27583
5. `get_attn_backend_cls` and attention backend is broken are broken by
vllm-project/vllm#28534
6. spec decode is broken by
vllm-project/vllm#28771
7. sp feature is broken by
vllm-project/vllm#27126
8. mtp is broken by vllm-project/vllm#27922
9. lora is broken by vllm-project/vllm#21068
10. execute_model is broken by
vllm-project/vllm#26866
11. `VLLM_DISABLE_SHARED_EXPERTS_STREAM` env is broken by
vllm-project/vllm#28159
12. kv cahe is broken by vllm-project/vllm#27753
13. dp is broken by vllm-project/vllm#25110

 
What's broken and changed by ourself:
1. qwen vl is broken by vllm-project/vllm#28455
We'll remove model files in the future to avoid this kind of error
2. Engine core is broken by
vllm-project/vllm#23691 We'll remove the patch
file in the future.
3. Ascend scheduler is broken by
vllm-project/vllm#28733 We'll remove ascend
scheudler later.
4. qwen3-next is broken by
vllm-project/vllm#28083 We'll remove model files
in the future to avoid this kind of error
5. qwen vl is broken by vllm-project/vllm#27764.
We'll remove model files in the future

Known issue:
1. ray doesn't work 
2. the accuracy of qwen3-next is not correct
3. qwen3-vl is broken
4. prefix cache+ ascend scheduler + deepseek v2 lite is broken.

Co-authored-by: MengqingCao <[email protected]>
Co-authored-by: hfadzxy <[email protected]>
Co-authored-by: leo-pony <[email protected]>
Co-authored-by: 22dimensions <[email protected]>
Co-authored-by: shen-shanshan <[email protected]>


- vLLM version: v0.11.2

---------

Signed-off-by: wangxiyuan <[email protected]>
Signed-off-by: MengqingCao <[email protected]>
Signed-off-by: hfadzxy <[email protected]>
Signed-off-by: leo-pony <[email protected]>
Co-authored-by: MengqingCao <[email protected]>
Co-authored-by: hfadzxy <[email protected]>
Co-authored-by: leo-pony <[email protected]>
Kurumi5210 pushed a commit to lidenghui1110/vllm-ascend that referenced this pull request Nov 26, 2025
Bump vLLM version to v0.11.2

What's broken and changed by vLLM:
1. structured_output is broken by
vllm-project/vllm#26866
2. get_mrope_input_positions is broken by
vllm-project/vllm#28399
3. graph mode is broken by
vllm-project/vllm#25110 we'll upgrade torch to
2.8 to fix the problem later
4. embedding is broken by
vllm-project/vllm#27583
5. `get_attn_backend_cls` and attention backend is broken are broken by
vllm-project/vllm#28534
6. spec decode is broken by
vllm-project/vllm#28771
7. sp feature is broken by
vllm-project/vllm#27126
8. mtp is broken by vllm-project/vllm#27922
9. lora is broken by vllm-project/vllm#21068
10. execute_model is broken by
vllm-project/vllm#26866
11. `VLLM_DISABLE_SHARED_EXPERTS_STREAM` env is broken by
vllm-project/vllm#28159
12. kv cahe is broken by vllm-project/vllm#27753
13. dp is broken by vllm-project/vllm#25110

What's broken and changed by ourself:
1. qwen vl is broken by vllm-project/vllm#28455
We'll remove model files in the future to avoid this kind of error
2. Engine core is broken by
vllm-project/vllm#23691 We'll remove the patch
file in the future.
3. Ascend scheduler is broken by
vllm-project/vllm#28733 We'll remove ascend
scheudler later.
4. qwen3-next is broken by
vllm-project/vllm#28083 We'll remove model files
in the future to avoid this kind of error
5. qwen vl is broken by vllm-project/vllm#27764.
We'll remove model files in the future

Known issue:
1. ray doesn't work
2. the accuracy of qwen3-next is not correct
3. qwen3-vl is broken
4. prefix cache+ ascend scheduler + deepseek v2 lite is broken.

Co-authored-by: MengqingCao <[email protected]>
Co-authored-by: hfadzxy <[email protected]>
Co-authored-by: leo-pony <[email protected]>
Co-authored-by: 22dimensions <[email protected]>
Co-authored-by: shen-shanshan <[email protected]>

- vLLM version: v0.11.2

---------

Signed-off-by: wangxiyuan <[email protected]>
Signed-off-by: MengqingCao <[email protected]>
Signed-off-by: hfadzxy <[email protected]>
Signed-off-by: leo-pony <[email protected]>
Co-authored-by: MengqingCao <[email protected]>
Co-authored-by: hfadzxy <[email protected]>
Co-authored-by: leo-pony <[email protected]>
Signed-off-by: Kurumi5210 <[email protected]>
devpatelio pushed a commit to SumanthRH/vllm that referenced this pull request Nov 29, 2025
845473182 pushed a commit to 845473182/vllm-ascend that referenced this pull request Nov 29, 2025
Bump vLLM version to v0.11.2

What's broken and changed by vLLM:
1. structured_output is broken by
vllm-project/vllm#26866
2. get_mrope_input_positions is broken by
vllm-project/vllm#28399
3. graph mode is broken by
vllm-project/vllm#25110 we'll upgrade torch to
2.8 to fix the problem later
4. embedding is broken by
vllm-project/vllm#27583
5. `get_attn_backend_cls` and attention backend is broken are broken by
vllm-project/vllm#28534
6. spec decode is broken by
vllm-project/vllm#28771
7. sp feature is broken by
vllm-project/vllm#27126
8. mtp is broken by vllm-project/vllm#27922
9. lora is broken by vllm-project/vllm#21068
10. execute_model is broken by
vllm-project/vllm#26866
11. `VLLM_DISABLE_SHARED_EXPERTS_STREAM` env is broken by
vllm-project/vllm#28159
12. kv cahe is broken by vllm-project/vllm#27753
13. dp is broken by vllm-project/vllm#25110

 
What's broken and changed by ourself:
1. qwen vl is broken by vllm-project/vllm#28455
We'll remove model files in the future to avoid this kind of error
2. Engine core is broken by
vllm-project/vllm#23691 We'll remove the patch
file in the future.
3. Ascend scheduler is broken by
vllm-project/vllm#28733 We'll remove ascend
scheudler later.
4. qwen3-next is broken by
vllm-project/vllm#28083 We'll remove model files
in the future to avoid this kind of error
5. qwen vl is broken by vllm-project/vllm#27764.
We'll remove model files in the future

Known issue:
1. ray doesn't work 
2. the accuracy of qwen3-next is not correct
3. qwen3-vl is broken
4. prefix cache+ ascend scheduler + deepseek v2 lite is broken.

Co-authored-by: MengqingCao <[email protected]>
Co-authored-by: hfadzxy <[email protected]>
Co-authored-by: leo-pony <[email protected]>
Co-authored-by: 22dimensions <[email protected]>
Co-authored-by: shen-shanshan <[email protected]>


- vLLM version: v0.11.2

---------

Signed-off-by: wangxiyuan <[email protected]>
Signed-off-by: MengqingCao <[email protected]>
Signed-off-by: hfadzxy <[email protected]>
Signed-off-by: leo-pony <[email protected]>
Co-authored-by: MengqingCao <[email protected]>
Co-authored-by: hfadzxy <[email protected]>
Co-authored-by: leo-pony <[email protected]>
charlotte12l pushed a commit to charlotte12l/vllm that referenced this pull request Dec 5, 2025
Meihan-chen pushed a commit to Meihan-chen/vllm-ascend that referenced this pull request Dec 5, 2025
Bump vLLM version to v0.11.2

What's broken and changed by vLLM:
1. structured_output is broken by
vllm-project/vllm#26866
2. get_mrope_input_positions is broken by
vllm-project/vllm#28399
3. graph mode is broken by
vllm-project/vllm#25110 we'll upgrade torch to
2.8 to fix the problem later
4. embedding is broken by
vllm-project/vllm#27583
5. `get_attn_backend_cls` and attention backend is broken are broken by
vllm-project/vllm#28534
6. spec decode is broken by
vllm-project/vllm#28771
7. sp feature is broken by
vllm-project/vllm#27126
8. mtp is broken by vllm-project/vllm#27922
9. lora is broken by vllm-project/vllm#21068
10. execute_model is broken by
vllm-project/vllm#26866
11. `VLLM_DISABLE_SHARED_EXPERTS_STREAM` env is broken by
vllm-project/vllm#28159
12. kv cahe is broken by vllm-project/vllm#27753
13. dp is broken by vllm-project/vllm#25110

 
What's broken and changed by ourself:
1. qwen vl is broken by vllm-project/vllm#28455
We'll remove model files in the future to avoid this kind of error
2. Engine core is broken by
vllm-project/vllm#23691 We'll remove the patch
file in the future.
3. Ascend scheduler is broken by
vllm-project/vllm#28733 We'll remove ascend
scheudler later.
4. qwen3-next is broken by
vllm-project/vllm#28083 We'll remove model files
in the future to avoid this kind of error
5. qwen vl is broken by vllm-project/vllm#27764.
We'll remove model files in the future

Known issue:
1. ray doesn't work 
2. the accuracy of qwen3-next is not correct
3. qwen3-vl is broken
4. prefix cache+ ascend scheduler + deepseek v2 lite is broken.

Co-authored-by: MengqingCao <[email protected]>
Co-authored-by: hfadzxy <[email protected]>
Co-authored-by: leo-pony <[email protected]>
Co-authored-by: 22dimensions <[email protected]>
Co-authored-by: shen-shanshan <[email protected]>


- vLLM version: v0.11.2

---------

Signed-off-by: wangxiyuan <[email protected]>
Signed-off-by: MengqingCao <[email protected]>
Signed-off-by: hfadzxy <[email protected]>
Signed-off-by: leo-pony <[email protected]>
Co-authored-by: MengqingCao <[email protected]>
Co-authored-by: hfadzxy <[email protected]>
Co-authored-by: leo-pony <[email protected]>
Clorist33 pushed a commit to Clorist33/vllm-ascend that referenced this pull request Dec 9, 2025
Bump vLLM version to v0.11.2

What's broken and changed by vLLM:
1. structured_output is broken by
vllm-project/vllm#26866
2. get_mrope_input_positions is broken by
vllm-project/vllm#28399
3. graph mode is broken by
vllm-project/vllm#25110 we'll upgrade torch to
2.8 to fix the problem later
4. embedding is broken by
vllm-project/vllm#27583
5. `get_attn_backend_cls` and attention backend is broken are broken by
vllm-project/vllm#28534
6. spec decode is broken by
vllm-project/vllm#28771
7. sp feature is broken by
vllm-project/vllm#27126
8. mtp is broken by vllm-project/vllm#27922
9. lora is broken by vllm-project/vllm#21068
10. execute_model is broken by
vllm-project/vllm#26866
11. `VLLM_DISABLE_SHARED_EXPERTS_STREAM` env is broken by
vllm-project/vllm#28159
12. kv cahe is broken by vllm-project/vllm#27753
13. dp is broken by vllm-project/vllm#25110

What's broken and changed by ourself:
1. qwen vl is broken by vllm-project/vllm#28455
We'll remove model files in the future to avoid this kind of error
2. Engine core is broken by
vllm-project/vllm#23691 We'll remove the patch
file in the future.
3. Ascend scheduler is broken by
vllm-project/vllm#28733 We'll remove ascend
scheudler later.
4. qwen3-next is broken by
vllm-project/vllm#28083 We'll remove model files
in the future to avoid this kind of error
5. qwen vl is broken by vllm-project/vllm#27764.
We'll remove model files in the future

Known issue:
1. ray doesn't work
2. the accuracy of qwen3-next is not correct
3. qwen3-vl is broken
4. prefix cache+ ascend scheduler + deepseek v2 lite is broken.

Co-authored-by: MengqingCao <[email protected]>
Co-authored-by: hfadzxy <[email protected]>
Co-authored-by: leo-pony <[email protected]>
Co-authored-by: 22dimensions <[email protected]>
Co-authored-by: shen-shanshan <[email protected]>

- vLLM version: v0.11.2

---------

Signed-off-by: wangxiyuan <[email protected]>
Signed-off-by: MengqingCao <[email protected]>
Signed-off-by: hfadzxy <[email protected]>
Signed-off-by: leo-pony <[email protected]>
Co-authored-by: MengqingCao <[email protected]>
Co-authored-by: hfadzxy <[email protected]>
Co-authored-by: leo-pony <[email protected]>
Signed-off-by: tanqingshan (A) <[email protected]>
Clorist33 pushed a commit to Clorist33/vllm-ascend that referenced this pull request Dec 10, 2025
Bump vLLM version to v0.11.2

What's broken and changed by vLLM:
1. structured_output is broken by
vllm-project/vllm#26866
2. get_mrope_input_positions is broken by
vllm-project/vllm#28399
3. graph mode is broken by
vllm-project/vllm#25110 we'll upgrade torch to
2.8 to fix the problem later
4. embedding is broken by
vllm-project/vllm#27583
5. `get_attn_backend_cls` and attention backend is broken are broken by
vllm-project/vllm#28534
6. spec decode is broken by
vllm-project/vllm#28771
7. sp feature is broken by
vllm-project/vllm#27126
8. mtp is broken by vllm-project/vllm#27922
9. lora is broken by vllm-project/vllm#21068
10. execute_model is broken by
vllm-project/vllm#26866
11. `VLLM_DISABLE_SHARED_EXPERTS_STREAM` env is broken by
vllm-project/vllm#28159
12. kv cahe is broken by vllm-project/vllm#27753
13. dp is broken by vllm-project/vllm#25110

 
What's broken and changed by ourself:
1. qwen vl is broken by vllm-project/vllm#28455
We'll remove model files in the future to avoid this kind of error
2. Engine core is broken by
vllm-project/vllm#23691 We'll remove the patch
file in the future.
3. Ascend scheduler is broken by
vllm-project/vllm#28733 We'll remove ascend
scheudler later.
4. qwen3-next is broken by
vllm-project/vllm#28083 We'll remove model files
in the future to avoid this kind of error
5. qwen vl is broken by vllm-project/vllm#27764.
We'll remove model files in the future

Known issue:
1. ray doesn't work 
2. the accuracy of qwen3-next is not correct
3. qwen3-vl is broken
4. prefix cache+ ascend scheduler + deepseek v2 lite is broken.

Co-authored-by: MengqingCao <[email protected]>
Co-authored-by: hfadzxy <[email protected]>
Co-authored-by: leo-pony <[email protected]>
Co-authored-by: 22dimensions <[email protected]>
Co-authored-by: shen-shanshan <[email protected]>


- vLLM version: v0.11.2

---------

Signed-off-by: wangxiyuan <[email protected]>
Signed-off-by: MengqingCao <[email protected]>
Signed-off-by: hfadzxy <[email protected]>
Signed-off-by: leo-pony <[email protected]>
Co-authored-by: MengqingCao <[email protected]>
Co-authored-by: hfadzxy <[email protected]>
Co-authored-by: leo-pony <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

multi-modality Related to multi-modality (#4194) qwen Related to Qwen models ready ONLY add when PR is ready to merge/full CI is needed v1

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

3 participants