upgrade vLLM to main by wangxiyuan · Pull Request #4608 · vllm-project/vllm-ascend

wangxiyuan · 2025-12-01T10:58:50Z

1. Fix vllm#28542 (Update rope_scaling to rope_parameters in preparation for Transformers v5). The model structure modifications involved are:
   - Qwen2.5-VL (some patches still remain)
   - Qwen2-VL
   - Qwen2
   - DeepSeek series
   - Qwen-moe series
2. Fix vllm#29121 (Revert "[Redo] #26368 (#28771)"): the output token type changed from a NumPy array (np) to list[list[int]].
3. Fix vllm#29262 ([Core] Deprecate xformers): the xformers backend for multimodal has been deprecated.
4. Fix vllm#29342 ([Attention] Remove imports from vllm/attention/__init__.py).
5. Fix vllm#28579 ([Core] Refactor padding logic and pad for CUDA graphs before attention metadata building).
6. Fix vllm#28718 ([Feature] Prefill Context Parallel (PCP) basic support).
7. Fix vllm#28665 ([Config] Clean up SchedulerConfig initialization).
8. Fix vllm#26847 ([Frontend][torch.compile] CompilationConfig Overhaul (#20283): Set up -O infrastructure): vLLM introduced the optimization-level setting, some default configs have changed, and the --enforce-eager parameter has been deprecated.
9. Fix https://github.com/vllm-project/vllm/pull/29223: it returns a tuple for the sampler.
10. Fix vllm#29471 (Remove upstream fa checks): we'll remove the related patch to avoid this kind of error.
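
As a rough illustration of the rope_scaling to rope_parameters rename in item 1, a compatibility lookup might look like the sketch below. The helper name `get_rope_parameters`, the `SimpleNamespace` stand-ins for a model config, and the example values are all assumptions made for illustration; this is not the code changed in this PR.

```python
from types import SimpleNamespace
from typing import Any, Optional


def get_rope_parameters(hf_config: Any) -> Optional[dict]:
    """Return RoPE settings from the new attribute, else the legacy one.

    Hypothetical helper for illustration; not part of vLLM or vllm-ascend.
    """
    # Newer configs are assumed to expose `rope_parameters`.
    params = getattr(hf_config, "rope_parameters", None)
    if params is None:
        # Older configs are assumed to expose `rope_scaling` instead.
        params = getattr(hf_config, "rope_scaling", None)
    # Copy so callers cannot mutate the config's own dict by accident.
    return dict(params) if params is not None else None


# Stand-in config objects with illustrative values only.
old_style = SimpleNamespace(rope_scaling={"rope_type": "yarn", "factor": 4.0})
new_style = SimpleNamespace(rope_parameters={"rope_type": "yarn", "factor": 4.0})
no_rope = SimpleNamespace()
```

Both stand-in configs yield the same dictionary here, and a config with neither attribute yields None.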

Co-authored-by: hfadzxy <starmoon_zhang@163.com>
Co-authored-by: wangli <wangli858794774@gmail.com>

vLLM version: v0.11.2
vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.2
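
For the output token type change noted in the description (np to list[list[int]]), downstream code that has to accept either shape could normalize it as in the sketch below. The function name `to_token_lists` is hypothetical, and the sketch assumes a 2-D array-like that offers a `tolist()` method, as NumPy arrays do; it is not the code from this PR.

```python
from typing import Any


def to_token_lists(sampled: Any) -> list[list[int]]:
    """Normalize sampled token ids to list[list[int]].

    Hypothetical helper: accepts either a 2-D array-like exposing
    `tolist()` (NumPy arrays do) or an iterable of per-request sequences.
    """
    if hasattr(sampled, "tolist"):
        # Array path: `tolist()` converts rows to plain Python lists.
        return sampled.tolist()
    # List path: copy each per-request sequence into a fresh list.
    return [list(row) for row in sampled]


# Per-request outputs may have different lengths in the list form.
ragged = to_token_lists([(11, 12), (13,), ()])
```

The list form can hold a different number of tokens per request, which a rectangular array cannot express without padding.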

github-actions · 2025-12-01T10:59:01Z

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

- A PR should do only one thing; smaller PRs enable faster reviews.
- Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by other future PRs.
- Write the commit message by filling in the PR description to help reviewers and future developers understand.

If CI fails, you can run linting and testing checks locally according to Contributing and Testing.

gemini-code-assist

Code Review

This pull request upgrades vLLM to main, which involves a lot of refactoring to align with upstream changes. Most changes are related to module path updates, API signature changes (e.g., rope_parameters), and data type changes (from numpy arrays to Python lists). I've found a critical issue in vllm_ascend/spec_decode/eagle_proposer.py where a hardcoded index is used instead of iterating through the batch, which will lead to incorrect behavior in speculative decoding. Please address this issue.
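
The class of bug described in the review (a hardcoded index where the whole batch should be iterated) can be sketched generically as follows. The function names and the placeholder value are hypothetical and do not reproduce the actual code in vllm_ascend/spec_decode/eagle_proposer.py.

```python
def last_tokens_hardcoded(sampled_token_ids: list[list[int]]) -> list[int]:
    # Buggy pattern: only request 0 is inspected, so every other
    # request in the batch is silently ignored.
    return [sampled_token_ids[0][-1]]


def last_tokens_per_request(
    sampled_token_ids: list[list[int]], placeholder: int = -1
) -> list[int]:
    # Corrected pattern: visit every request in the batch and fall back
    # to a placeholder when a request produced no tokens this step.
    return [tokens[-1] if tokens else placeholder for tokens in sampled_token_ids]


batch = [[1, 2], [3], []]
print(last_tokens_hardcoded(batch))    # one entry, regardless of batch size
print(last_tokens_per_request(batch))  # one entry per request
```

With batch size 1 the two functions agree, which is why this kind of issue can pass single-request tests and only show up with larger batches.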

github-actions · 2025-12-01T23:39:35Z

This pull request has conflicts, please resolve those before we can evaluate the pull request.

MengqingCao · 2025-12-02T02:12:01Z

-        self.mock_vllm_config.scheduler_config = SchedulerConfig(
-            max_num_seqs=8, chunked_prefill_enabled=True)
+        mock_scheduler_config = MagicMock(spec=SchedulerConfig)
+        mock_scheduler_config.max_num_seqs = 8  # 设置为整数，不是 MagicMock


plz use english comment
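
A self-contained sketch of the mocking pattern in the diff above, with the comment in English. `SchedulerConfigStub` is a hypothetical stand-in for the real SchedulerConfig so the example runs without vLLM installed; only `unittest.mock.MagicMock` and its `spec` argument are real APIs here.

```python
from unittest.mock import MagicMock


class SchedulerConfigStub:
    """Hypothetical stand-in for SchedulerConfig, for illustration only."""

    max_num_seqs: int = 0


# `spec` restricts attribute reads to names that exist on the class.
mock_scheduler_config = MagicMock(spec=SchedulerConfigStub)
mock_scheduler_config.max_num_seqs = 8  # set to an integer, not a MagicMock

# Arithmetic and comparisons now behave like they would on a real config.
assert mock_scheduler_config.max_num_seqs + 1 == 9

try:
    mock_scheduler_config.not_a_real_field
    unknown_attribute_rejected = False
except AttributeError:
    # Reading an attribute the spec class lacks raises AttributeError.
    unknown_attribute_rejected = True
```

Setting the attribute explicitly matters because an unset attribute on a MagicMock is itself a MagicMock, which can break code that compares or sizes buffers with the value.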

MengqingCao

Plz fix the above comment and LGTM if CI passes

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>

Signed-off-by: wangli <wangli858794774@gmail.com>

Signed-off-by: hfadzxy <starmoon_zhang@163.com>

Signed-off-by: Che Ruan <cr623@ic.ac.uk>

Signed-off-by: tanqingshan (A) <50050625@china.huawei.com>

github-actions bot added module:ops module:core labels Dec 1, 2025

gemini-code-assist bot reviewed Dec 1, 2025

Comment thread vllm_ascend/spec_decode/eagle_proposer.py Outdated

wangxiyuan force-pushed the 4527 branch from 3248d20 to 6e22b68 Compare December 1, 2025 11:15

github-actions bot added the documentation Improvements or additions to documentation label Dec 1, 2025

wangxiyuan mentioned this pull request Dec 1, 2025

[Main] Upgrade vllm commit to 2025_12_01 #4527

Closed

github-actions bot added the module:tests label Dec 1, 2025

wangxiyuan added the ready and ready-for-test labels Dec 1, 2025

github-actions bot added the merge-conflicts label Dec 1, 2025

wangxiyuan force-pushed the 4527 branch from 844de00 to 2621805 Compare December 2, 2025 00:48

github-actions bot removed the merge-conflicts label Dec 2, 2025

wangxiyuan added the vllm-break label Dec 2, 2025

MengqingCao reviewed Dec 2, 2025

MengqingCao approved these changes Dec 2, 2025

wangxiyuan force-pushed the 4527 branch from 953f9f1 to 9202a0f Compare December 2, 2025 05:18

wangxiyuan and others added 12 commits December 2, 2025 17:03

upgrade vLLM to main

008ea07

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>

fix logger import error

9bb441e

Signed-off-by: wangli <wangli858794774@gmail.com>

fix aclgraph error

8aadb23

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>

fix ut

87c35d3

Signed-off-by: wangli <wangli858794774@gmail.com>

mock torch.device

a1a49bc

Signed-off-by: wangli <wangli858794774@gmail.com>

fix torchair ut

9f163b8

Signed-off-by: wangli <wangli858794774@gmail.com>

fix eagle ut

34a812c

Signed-off-by: wangli <wangli858794774@gmail.com>

fix kv_connector ut

2eab306

Signed-off-by: wangli <wangli858794774@gmail.com>

fix mla_v1 acl_graph scheduler ut test

dc612d8

Signed-off-by: hfadzxy <starmoon_zhang@163.com>

fix mla ut

7418f20

Signed-off-by: wangli <wangli858794774@gmail.com>

fix mla

a61bf08

Signed-off-by: wangli <wangli858794774@gmail.com>

fix lint

a5dc782

Signed-off-by: wangli <wangli858794774@gmail.com>

wangxiyuan and others added 13 commits December 2, 2025 17:03

fix cp config

b36c553

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>

fix vl patch

fc21515

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>

fix qwen3-vl get_repo patch

af399e0

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>

fix mtp aclgraph error

6083d34

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>

fix qwen3-vl

0f71d74

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>

fix sfa ut

e9f636f

Signed-off-by: wangli <wangli858794774@gmail.com>

fix sfa ut

f284089

Signed-off-by: wangli <wangli858794774@gmail.com>

fix

29331d2

Signed-off-by: wangli <wangli858794774@gmail.com>

fix mla ut

b099498

Signed-off-by: wangli <wangli858794774@gmail.com>

fix mla

4a792aa

Signed-off-by: wangli <wangli858794774@gmail.com>

fix ut

3bf5a11

Signed-off-by: wangli <wangli858794774@gmail.com>

rm redundant lines

fd860ff

Signed-off-by: wangli <wangli858794774@gmail.com>

fix mtp error

307af29

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>

wangxiyuan force-pushed the 4527 branch from 0469d8f to 307af29 Compare December 2, 2025 09:04

fix torchair mtp

f9893d6

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>

wangxiyuan merged commit 7f2673e into vllm-project:main Dec 2, 2025
24 checks passed

wangxiyuan deleted the 4527 branch December 4, 2025 07:04

Potabk mentioned this pull request Dec 4, 2025

[Usage]: Elegant main2main scrolling upgrade #4709

Open

wangxiyuan merged 26 commits into vllm-project:main from wangxiyuan:4527

4 participants