[Bugfix] Fix in_profile_run in mtp_proposer dummy_run#5165
wangxiyuan merged 3 commits into vllm-project:main
Conversation
Signed-off-by: Zetong Li <slippersss@126.com>
Code Review
This pull request aims to fix a bug related to in_profile_run in mtp_proposer. The changes correctly add an is_profile parameter to dummy_run and pass it down. However, there are two critical issues. First, in mtp_proposer.py, the parameter passed to set_ascend_forward_context is named is_profile_run instead of the correct in_profile_run, which will cause the flag to be ignored. Second, the call to self.drafter.dummy_run in model_runner_v1.py now includes the is_profile argument, but other Proposer implementations (EagleProposer, NgramProposer, SuffixDecodingProposer) and the base Proposer interface have not been updated to accept this argument, which will lead to a TypeError at runtime.
     batch_descriptor=batch_descriptor,
-    is_mtp_model=True):
+    is_mtp_model=True,
+    is_profile_run=is_profile):
The parameter name is_profile_run is incorrect; set_ascend_forward_context expects in_profile_run. Because of the typo, the value of is_profile never reaches the in_profile_run parameter, which falls back to its default value (False). This makes the intended fix ineffective.
Suggested change:
-    is_profile_run=is_profile):
+    in_profile_run=is_profile):
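The failure mode described above can be reproduced in isolation. The sketch below uses a hypothetical stand-in for set_ascend_forward_context (the real signature may differ): when a function accepts **kwargs, a misspelled keyword is silently absorbed instead of raising an error, so the intended flag keeps its default value.

```python
# Hypothetical stand-in for set_ascend_forward_context: the real function
# takes in_profile_run plus other keyword arguments.
def set_forward_context(in_profile_run=False, **kwargs):
    # **kwargs silently swallows any misspelled keyword.
    return in_profile_run

# Misspelled keyword is absorbed by **kwargs; the flag stays False.
assert set_forward_context(is_profile_run=True) is False

# Correctly spelled keyword actually sets the flag.
assert set_forward_context(in_profile_run=True) is True
```

This is why the typo produces no error at runtime: the flag is simply ignored, which is harder to notice than a TypeError.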
     dummy_compute_logits=dummy_drafter_compute_logits,
-    in_graph_capturing=not force_attention)
+    in_graph_capturing=not force_attention,
+    is_profile=is_profile)
This change introduces the is_profile keyword argument to the dummy_run call. However, the Proposer interface and its other implementations (EagleProposer, NgramProposer, SuffixDecodingProposer) have not been updated to accept this argument. This will cause a TypeError at runtime if a proposer other than MtpProposer is used. To fix this, you should update the base Proposer interface in vllm_ascend/spec_decode/interface.py and all its subclasses to include is_profile=False in their dummy_run method signatures.
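The fix suggested in the comment above can be sketched as follows. The class and method names mirror the review comment, but the signatures are illustrative, not the actual vllm-ascend code: the point is that the base interface and every subclass accept is_profile=False uniformly, so the new keyword never raises a TypeError.

```python
# Illustrative sketch: add `is_profile=False` to the base Proposer
# interface and all subclasses (signatures are hypothetical).
class Proposer:
    def dummy_run(self, num_tokens, is_profile=False):
        raise NotImplementedError

class NgramProposer(Proposer):
    def dummy_run(self, num_tokens, is_profile=False):
        # Ngram proposal runs no model forward; the flag is simply accepted.
        return None

class MtpProposer(Proposer):
    def dummy_run(self, num_tokens, is_profile=False):
        # Forward the flag under the name the forward context expects.
        return {"num_tokens": num_tokens, "in_profile_run": is_profile}

# Every proposer now accepts the keyword without a TypeError.
for proposer in (NgramProposer(), MtpProposer()):
    proposer.dummy_run(8, is_profile=True)
```

Keeping the default at False means call sites that never profile need no changes.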
Signed-off-by: Zetong Li <slippersss@126.com>
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to the Contributing and Testing guides.
Merge 'main' of https://github.com/vllm-project/vllm-ascend into eplb_refactor (52 commits), including:
- [Doc] Add the user_guide doc file regarding fine-grained TP. (vllm-project#5084)
- [pref] qwen3_next add triton ops: fused_sigmoid_gating_delta_rule_update (vllm-project#4818)
- [Feature] Add token mask for DispatchGmmCombineDecode operator (vllm-project#5171)
- [CI] Improve CI (vllm-project#5078)
- [Refactor] remove some metadata variables in attention_v1. (vllm-project#5160)
- Add Qwen3-VL-235B-A22B-Instruct tutorials (vllm-project#5167)
- [Doc] Add a perf tune section (vllm-project#5127)
- [Image] Refactor image build (vllm-project#5175)
- [refactor] refactor weight trans nz and transpose (vllm-project#4878)
- [BugFix] Fix precision issue for LoRA feature (vllm-project#4141)
- [Doc] Deepseekv3.1/R1 doc enhancement (vllm-project#4827)
- support basic long_seq feature st (vllm-project#5140)
- [Bugfix] install trition for test_custom_op (vllm-project#5112)
- [2/N][Pangu][MoE] Remove Pangu Related Code (vllm-project#5130)
- [bugfix] Use FUSED_MC2 MoE comm path for the op `dispatch_ffn_combine` (vllm-project#5156)
- [BugFix] Fix top_p,top_k issue with EAGLE and add top_p,top_k in EAGLE e2e (vllm-project#5131)
- [Doc][P/D] Fix MooncakeConnector's name (vllm-project#5172)
- [Bugfix] Fix in_profile_run in mtp_proposer dummy_run (vllm-project#5165)
- [Doc] Refact benchmark doc (vllm-project#5173)
- [Nightly] Avoid max_model_len being smaller than the decoder prompt to prevent single-node-accuray-tests from failing (vllm-project#5174)
- ...

Signed-off-by: 白永斌 <baiyongbin3@h-partners.com>
### What this PR does / why we need it?
This PR fixes a failure of `enable_force_load_balance` caused by the missing `in_profile_run` flag in `dummy_run` of mtp_proposer.

### Does this PR introduce _any_ user-facing change?
N/A

### How was this patch tested?
By CI.

- vLLM version: v0.12.0
- vLLM main: vllm-project/vllm@ad32e3e

Signed-off-by: Zetong Li <slippersss@126.com>
Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>
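The plumbing the PR restores can be sketched end to end. All names below are simplified stand-ins for the real vllm-ascend code (the actual set_ascend_forward_context and dummy_run take many more arguments): the profiling dummy_run passes is_profile down, and the proposer forwards it to the forward-context manager under its expected name, in_profile_run.

```python
from contextlib import contextmanager

# Simplified stand-in for set_ascend_forward_context: records whether the
# forward pass is part of a profiling run.
@contextmanager
def set_ascend_forward_context(in_profile_run=False, **kwargs):
    yield {"in_profile_run": in_profile_run}

# Simplified stand-in for the drafter's dummy_run: forwards the flag
# under the name the context manager expects.
def drafter_dummy_run(num_tokens, is_profile=False):
    with set_ascend_forward_context(is_mtp_model=True,
                                    in_profile_run=is_profile) as ctx:
        return ctx["in_profile_run"]

# The profiling flag now reaches the forward context.
assert drafter_dummy_run(8, is_profile=True) is True
assert drafter_dummy_run(8) is False
```

Before the fix, the flag was never forwarded, so force load balancing saw in_profile_run=False even during profiling.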