[Main2Main] Upgrade vllm commit to 0120 #6040
Meihan-chen wants to merge 13 commits into vllm-project:main from
Conversation
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
Code Review
This pull request updates the vLLM commit to 0.13.0, necessitating changes to import paths and refactoring of parallel configuration logic. The core changes involve adapting the codebase to the new vLLM library structure, particularly for attention metadata builders and sampling metadata. A significant refactoring in patch_multiproc_executor.py centralizes parallel size calculations and worker initialization, introducing version-aware compatibility. While these changes are essential for the upgrade, some aspects related to type safety, assertion preservation, and conditional logic for version compatibility require attention to ensure robustness and correctness.
```python
# isort: off
if vllm_version_is('0.13.0'):
    from vllm.v1.attention.backends.utils import AttentionCGSupport
    from vllm.v1.attention.backends.mla.common import MLACommonMetadataBuilder  # type: ignore
```
```python
from vllm.v1.core.sched.output import SchedulerOutput
if vllm_version_is('0.13.0'):
    from vllm.v1.attention.backends.utils import AttentionCGSupport
    from vllm.v1.attention.backends.mla.common import MLACommonMetadataBuilder  # type: ignore
```
Similar to mla_v1.py, using # type: ignore for MLACommonMetadataBuilder and AttentionCGSupport might be masking actual type issues or subtle API changes in vLLM 0.13.0. It is recommended to address the underlying type problems directly for better code maintainability and to avoid potential runtime bugs.
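One way to avoid the blanket `# type: ignore` is to feature-detect the import rather than annotate around it. The sketch below is a generic helper (names and the idea of trying multiple module paths are assumptions, not code from this PR); it returns the symbol from the first layout that actually imports, and fails loudly if none do:

```python
import importlib

def import_from_first(module_paths, attr):
    """Return `attr` from the first module in `module_paths` that imports.

    Trying each candidate layout in order avoids a blanket
    "# type: ignore" and raises a clear error if no layout
    provides the symbol, instead of deferring the failure.
    """
    last_err = None
    for path in module_paths:
        try:
            module = importlib.import_module(path)
            return getattr(module, attr)
        except (ImportError, AttributeError) as err:
            last_err = err
    raise ImportError(
        f"{attr!r} not found in any of {module_paths}") from last_err
```

A caller could then resolve `MLACommonMetadataBuilder` by listing the pre- and post-refactor module paths in preference order, keeping one definition site instead of scattered version checks.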
```python
self.futures_queue = deque[tuple[FutureWrapper, Callable]]()
self._post_init_executor()
```
The initialization of `self.futures_queue` and the call to `self._post_init_executor()` have been moved to this position. It is critical to ensure that `_post_init_executor()` runs at the correct stage of the executor's initialization, especially if it has side effects or dependencies that must be satisfied before `success = True` is set.
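The ordering concern can be illustrated with a minimal sketch (class and method names here are hypothetical stand-ins, not the actual executor code): keep `success = True` as the last statement of `__init__` so that a failure inside the post-init hook still tears down the partially constructed object.

```python
from collections import deque

class ExecutorInitSketch:
    """Sketch of the init-ordering pattern under discussion."""

    def __init__(self):
        success = False
        try:
            self.futures_queue = deque()  # state the hook may rely on
            self._post_init_executor()    # side effects happen here
            success = True                # only after every init step
        finally:
            if not success:
                self.shutdown()           # clean up the partial object

    def _post_init_executor(self):
        # Stand-in for version-specific worker wiring.
        self.ready = True

    def shutdown(self):
        self.ready = False
```

If `_post_init_executor()` raises, `success` is never set and cleanup runs; if it is accidentally moved after `success = True`, a failing hook would leave a half-initialized executor marked as healthy.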
```python
if not vllm_version_is('0.13.0'):
    process_kwargs["is_driver_worker"] = is_driver_worker
```
The conditional logic if not vllm_version_is('0.13.0'): implies that is_driver_worker is needed for vLLM versions other than 0.13.0. If is_driver_worker is a new parameter introduced in a vLLM version after 0.13.0, this condition might be incorrect. Please clarify the exact vLLM version(s) for which is_driver_worker is required/not required to ensure proper compatibility.
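One way to sidestep the exact-version question is to gate the kwarg on the callee's signature rather than on a version string. The helper below is an illustrative sketch (the function and parameter names are assumptions; `worker_init` stands in for the target constructor), so versions after 0.13.0 that also accept the parameter keep working without another version branch:

```python
import inspect

def build_worker_kwargs(worker_init, is_driver_worker):
    """Pass `is_driver_worker` only if the callee actually accepts it.

    Inspecting the signature ties the conditional to the feature
    itself instead of to one pinned version string.
    """
    kwargs = {}
    params = inspect.signature(worker_init).parameters
    if "is_driver_worker" in params:
        kwargs["is_driver_worker"] = is_driver_worker
    return kwargs
```

Signature inspection has a small one-time cost, but for process startup code that runs once per worker it keeps the compatibility logic self-describing.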
This pull request has conflicts, please resolve those before we can evaluate the pull request.
Signed-off-by: wjunLu <wjunlu217@gmail.com>
Signed-off-by: Meihan-chen <jcccx.cmh@gmail.com>
What this PR does / why we need it?
Modify import paths due to the refactors:
- [Model Runner V2] Refactor Sampler vllm#32245
- [4/N][Attention] Move MLA common to model_executor vllm#32060
Test result: https://github.com/vllm-project/vllm-ascend/actions/runs/21034239336/job/60490156965?pr=5913
Fix:
- `WorkerProc.__init__() missing 1 required positional argument: 'is_driver_worker'` due to [TPU][Core] Enable Pipeline Parallelism on TPU backend vllm#28506: https://github.com/vllm-project/vllm-ascend/actions/runs/21156263050/job/60841668755?5569
- `skip_compiled` param in `set_forward_context` due to [Core] Whisper support `torch.compile` vllm#30385
- `tests/ut/spec_decode/test_eagle_proposer.py` due to the "feat: spec decode with draft models" vllm#24322 change: `self.max_num_tokens = vllm_config.scheduler_config.max_num_batched_tokens + max_batch_size`

Does this PR introduce any user-facing change?
How was this patch tested?