Conversation
Signed-off-by: wangli <wangli858794774@gmail.com>
Code Review
This pull request addresses a potential RuntimeError in _merge_multimodal_embeddings by ensuring dtype consistency between the input embeddings and the multimodal embeddings. The change correctly casts the flattened multimodal embeddings to the dtype of the input embeddings before performing the in-place assignment. This is a robust fix that prevents crashes due to dtype mismatches. The implementation is correct and I approve of this change.
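The fix described above can be sketched as follows. This is a simplified, hypothetical illustration of the dtype cast, not vLLM's actual `_merge_multimodal_embeddings` signature; the names `inputs_embeds`, `is_multimodal`, and `multimodal_embeddings` mirror the discussion but the function body is an assumption for clarity.

```python
import torch

def merge_multimodal_embeddings_sketch(
    inputs_embeds: torch.Tensor,          # (num_tokens, hidden), e.g. bfloat16
    is_multimodal: torch.Tensor,          # (num_tokens,) bool mask
    multimodal_embeddings: torch.Tensor,  # multimodal features, possibly float32
) -> torch.Tensor:
    # Flatten the multimodal embeddings to (num_mm_tokens, hidden).
    flattened = multimodal_embeddings.reshape(-1, inputs_embeds.shape[-1])
    # Without the cast, an in-place index assignment with mismatched dtypes
    # raises "Index put requires the source and destination dtypes match".
    inputs_embeds[is_multimodal] = flattened.to(dtype=inputs_embeds.dtype)
    return inputs_embeds
```

With the cast in place, merging float32 multimodal features into a bfloat16 embedding buffer succeeds and preserves the buffer's dtype.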
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
Will there be any accuracy problems without this PR? So is this a Bugfix or a Misc change? BTW, there is a typo in the PR title:
@ApsarasX For some models, e.g.:
also cc @booker123456 @gcanlin @shen-shanshan |
Are there any recorded issues describing this error without this PR? I'm not sure about the background of this PR.
Tested locally with the script https://github.com/vllm-project/vllm/blob/main/examples/offline_inference/vision_language.py
The root cause is in this
LGTM. Keeping the data types compatible is reasonable.

@@ -37,8 +37,9 @@ def _merge_multimodal_embeddings(
    This updates ``inputs_embeds`` in place.
any plan to remove this patch?
I think that we should request for torch_npu/CANN team to support torch.Tensor.masked_scatter_ then we can remove this patch.
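For context, the following is a minimal CPU example of the `torch.Tensor.masked_scatter_` operator that the patch works around: it copies elements from a source tensor, in order, into the positions where the mask is `True`. The tensor shapes here are chosen purely for illustration.

```python
import torch

dest = torch.zeros(4, 3)
# Rows 0 and 2 are "multimodal" positions; expand the row mask to all columns.
mask = torch.tensor([True, False, True, False]).unsqueeze(-1).expand(4, 3)
src = torch.arange(6, dtype=torch.float32).reshape(2, 3)

# Fills the masked positions of dest with src's elements in flattened order.
dest.masked_scatter_(mask, src)
```

After the call, row 0 holds the first three source elements and row 2 the next three, while the unmasked rows stay zero.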
After communicating offline with the author of this patch, I learned that it was added for performance reasons. The original masked_scatter operator has no functional issues. Therefore, we may need to push for the addition of a new Ascend branch upstream.
After testing on NPU, it really doesn't have functional issues. @booker123456 is there any performance test for this patch change?
I suggest we consider directly removing this patch to reduce the maintenance cost. It doesn't seem to bring much performance gain. @booker123456 WDYT?
I think this patch is still necessary until torch_npu's masked_scatter_ performance catches up with index_put.
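To make the trade-off being discussed concrete, here is a small sketch of the two equivalent formulations: boolean-mask index assignment (the index_put path the patch uses) versus `masked_scatter_` (the original op). The performance difference is claimed for torch_npu only; on CPU both produce identical results, which is the only thing this sketch demonstrates.

```python
import torch

a = torch.zeros(4, 3)
b = torch.zeros(4, 3)
mask = torch.tensor([True, False, True, True])
src = torch.ones(3, 3)

# index_put path: advanced indexing with a boolean row mask.
a[mask] = src
# masked_scatter_ path: element-wise mask expanded to the full shape.
b.masked_scatter_(mask.unsqueeze(-1).expand_as(b), src)

assert torch.equal(a, b)  # functionally interchangeable
```

Which one is faster is backend-specific, which is why the patch keeps the index_put form for NPU.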
|
So can we merge it? @wangxiyuan |
…to FIA_rebase

* 'main' of https://github.com/vllm-project/vllm-ascend:
  [feature] mooncake support pcp/dcp in common conditions (vllm-project#5224)
  [Bugfix] Fix mm_merge (vllm-project#5249)
  [Main2Main] Upgrade vllm commit to 1230 (vllm-project#5495)
  [Feature] Refactor PCP & DCP related code (vllm-project#5214)
  [main][test] Refactor the mtp and eagle test case (vllm-project#5326)
  [smoke][bugfix] moe_init_routing_v2 active_expert_range use int type (vllm-project#5521)
  [2/N] Upgrade nightly doc (vllm-project#5534)
  [Doc] Add new contributors. (vllm-project#5537)
  [3/N][Nightly] Move ops tests to nightly (vllm-project#5538)
### What this PR does / why we need it?
We should transfer the mm_embed to the dtype of input_embed before performing the in-place assignment.
- vLLM version: release/v0.13.0
- vLLM main: vllm-project/vllm@ad32e3e

Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>
### What this PR does / why we need it?
We should cast the mm_embed to the dtype of input_embed before performing the in-place assignment.

### Does this PR introduce any user-facing change?

### How was this patch tested?