
[Main2Main] Upgrade vLLM to 0303 #6944

Merged
wangxiyuan merged 11 commits into vllm-project:main from MrZ20:main_0302 on Mar 6, 2026

Conversation

@MrZ20
Contributor

@MrZ20 MrZ20 commented Mar 3, 2026

What this PR does / why we need it?

break:

Important change:

matcher_utils directly accesses torch.ops._C.* during the import phase. In the Ascend environment, some unregistered ops trigger AttributeError, causing e2e initialization failure.
https://github.com/vllm-project/vllm-ascend/actions/runs/22607260487/job/65502047131#step:10:2323
https://github.com/vllm-project/vllm/blob/main/vllm/compilation/passes/fusion/matcher_utils.py#L29

This PR adds temporary compatibility placeholders (rms_norm, fused_add_rms_norm, rotate_embedding, static/dynamic fp8 quant, silu_and_mul) to vllm_ascend/patch/platform/patch_fusion_matcher_compat_ops.py to ensure no crashes during the import phase. Upstream repairs will be considered later.
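The placeholder idea described above can be sketched roughly as follows. This is a minimal illustration, not the contents of `patch_fusion_matcher_compat_ops.py`: the helper names `make_placeholder` and `install_placeholders` are hypothetical, and a `SimpleNamespace` stands in for `torch.ops._C` so the sketch runs without PyTorch. It assumes the real op namespace raises `AttributeError` for unregistered ops (as the PR description reports) and accepts `setattr`.

```python
from types import SimpleNamespace

# Op names taken from the PR description; the real patch covers more ops.
PLACEHOLDER_OPS = ("rms_norm", "fused_add_rms_norm", "silu_and_mul")


def make_placeholder(op_name):
    """Return a stub that fails loudly if it is ever actually called."""
    def _placeholder(*args, **kwargs):
        raise NotImplementedError(
            f"{op_name} is an import-time placeholder, not a real kernel"
        )
    _placeholder.__name__ = op_name
    return _placeholder


def install_placeholders(namespace, op_names=PLACEHOLDER_OPS):
    """Add a stub for every op the namespace does not already provide.

    Returns the list of names that were stubbed.
    """
    installed = []
    for name in op_names:
        # hasattr() is False when attribute lookup raises AttributeError.
        if not hasattr(namespace, name):
            setattr(namespace, name, make_placeholder(name))
            installed.append(name)
    return installed


# Hypothetical stand-in for torch.ops._C where only rms_norm is registered.
fake_ops = SimpleNamespace(rms_norm=lambda x: x)
stubbed = install_placeholders(fake_ops)
```

Ops that are already registered are left untouched, and the stubs raise on call so that a placeholder reaching real execution is noticed rather than silently returning wrong results.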

Does this PR introduce any user-facing change?

How was this patch tested?

@gemini-code-assist
Contributor

Note

Gemini is unable to generate a summary for this pull request due to the file types involved not being currently supported.

@github-actions
Contributor

github-actions bot commented Mar 3, 2026

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write the commit message by filling in the PR description, to help reviewers and future developers understand the change.

If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.

@Potabk added the ready and ready-for-test labels Mar 3, 2026
@github-actions
Contributor

github-actions bot commented Mar 3, 2026

This pull request has conflicts; please resolve them before we can evaluate it.

@MrZ20 MrZ20 marked this pull request as ready for review March 3, 2026 12:51
@gcanlin gcanlin requested a review from LCAIZJ as a code owner March 3, 2026 15:20
@MrZ20 MrZ20 force-pushed the main_0302 branch 7 times, most recently from b919e3c to b097c45 Compare March 5, 2026 04:36
Meihan-chen and others added 2 commits March 5, 2026 17:55
Root causes:
- CudagraphDispatcher.dispatch() disable_full replaced with valid_modes/invalid_modes (PR #34102)
- compile_or_warm_up_model() now returns float compilation_time (PR #35503)
- MMEncoderAttention forward methods added sequence_lengths param (PR #35564)
- Removed auto-forcing of +rms_norm for sequence parallelism (PR #35410)

Upstream commit range: 15d76f74e2fdb12a95ea00f0ca283acf6219a2b7..6290470843c131681e3e1318ae71070a34f33225

Co-Authored-By: Claude Code <noreply@anthropic.com>
Signed-off-by: MrZ20 <2609716663@qq.com>
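One of the root causes listed above is a forward method gaining a `sequence_lengths` parameter. A generic way to stay compatible with both the old and new signature is to inspect the callable before passing the new argument. The sketch below is only an illustration under that assumption: `accepts_param`, `call_forward`, `old_forward`, and `new_forward` are hypothetical names, not vLLM or vllm-ascend APIs, and this is not necessarily how the PR handles the change.

```python
import inspect


def accepts_param(func, name):
    """True if func declares `name` or takes arbitrary keyword arguments."""
    params = inspect.signature(func).parameters
    if name in params:
        return True
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


def call_forward(forward, query, sequence_lengths=None):
    """Pass sequence_lengths only to callables that can receive it."""
    if accepts_param(forward, "sequence_lengths"):
        return forward(query, sequence_lengths=sequence_lengths)
    return forward(query)


# Hypothetical old-style and new-style forward functions.
def old_forward(query):
    return ("old", query)


def new_forward(query, sequence_lengths=None):
    return ("new", query, sequence_lengths)


old_result = call_forward(old_forward, 1, [2])
new_result = call_forward(new_forward, 1, [2])
```

The alternative, which is simpler when only one upstream version has to be supported, is to add the new parameter to the override directly and drop the inspection.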
Root causes:
- CudagraphDispatcher.dispatch() API changed: disable_full -> valid_modes/invalid_modes (#34102)
- compile_or_warm_up_model() must return float compilation_time (#35503)
- MMEncoderAttention.forward_oot() gained new sequence_lengths param (#34580)
- +rms_norm no longer auto-forced for SP, breaks Ascend without CUDA _C ops (#35410)

Upstream commit range: 15d76f74e2fdb12a95ea00f0ca283acf6219a2b7..6290470843c131681e3e1318ae71070a34f33225

Co-Authored-By: Claude Code <noreply@anthropic.com>
Signed-off-by: MrZ20 <2609716663@qq.com>
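The `compile_or_warm_up_model()` item above amounts to a changed return contract: the method must now return a float `compilation_time`. A minimal sketch of that contract is shown here. `SketchWorker` and `_warm_up` are hypothetical, and what exactly the real worker measures (compilation only, or the whole warm-up) is an assumption of this example.

```python
import time


class SketchWorker:
    """Hypothetical worker; only the return contract is the point here."""

    def _warm_up(self):
        # Stand-in for real compilation or graph-capture work.
        sum(range(1000))

    def compile_or_warm_up_model(self) -> float:
        # Time the warm-up and report the elapsed seconds as a float.
        start = time.perf_counter()
        self._warm_up()
        compilation_time = time.perf_counter() - start
        return compilation_time


elapsed = SketchWorker().compile_or_warm_up_model()
```

An override that still returns `None` would break any caller that does arithmetic on or logs the returned value, which is why the signature change shows up as a root cause.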
MrZ20 and others added 9 commits March 5, 2026 17:55
Signed-off-by: MrZ20 <2609716663@qq.com>
Signed-off-by: MrZ20 <2609716663@qq.com>
Signed-off-by: MrZ20 <2609716663@qq.com>
Signed-off-by: gcanlin <canlinguosdu@gmail.com>
Signed-off-by: gcanlin <canlinguosdu@gmail.com>
Signed-off-by: MrZ20 <2609716663@qq.com>
Signed-off-by: MrZ20 <2609716663@qq.com>
Signed-off-by: MrZ20 <2609716663@qq.com>
Signed-off-by: MrZ20 <2609716663@qq.com>
@wangxiyuan wangxiyuan merged commit bd571cf into vllm-project:main Mar 6, 2026
54 of 55 checks passed
@MrZ20 MrZ20 deleted the main_0302 branch March 6, 2026 01:22
LCAIZJ pushed a commit to LCAIZJ/vllm-ascend that referenced this pull request Mar 7, 2026
### What this PR does / why we need it?
Breaking upstream changes:
- vllm-project/vllm#34102: in `CudagraphDispatcher.dispatch()`, the `disable_full` parameter is replaced by the `valid_modes`/`invalid_modes` API.
- vllm-project/vllm#35503: `compile_or_warm_up_model()` must now return a float `compilation_time`.
- vllm-project/vllm#35564: the `MMEncoderAttention` forward methods gain a new `sequence_lengths` parameter.
- vllm-project/vllm#33807: a check is now performed (`if runner_backend != "auto"`).
- vllm-project/vllm#34861: `BaseDeviceCommunicator` now accesses PyTorch's internal `pg_map` to check process group state.
- vllm-project/vllm#35274
**Important change:**
- vllm-project/vllm#28672

`matcher_utils` directly accesses `torch.ops._C.*` during the import
phase. In the Ascend environment, some unregistered ops trigger
`AttributeError`, causing e2e initialization failure.

https://github.com/vllm-project/vllm-ascend/actions/runs/22607260487/job/65502047131#step:10:2323

https://github.com/vllm-project/vllm/blob/main/vllm/compilation/passes/fusion/matcher_utils.py#L29

This PR adds temporary compatibility placeholders (rms_norm,
fused_add_rms_norm, rotate_embedding, static/dynamic fp8 quant,
silu_and_mul) to
`vllm_ascend/patch/platform/patch_fusion_matcher_compat_ops.py` to
ensure no crashes during the import phase. Upstream repairs will be
considered later.

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.16.0
- vLLM main:
vllm-project/vllm@15d76f7

---------

Signed-off-by: MrZ20 <2609716663@qq.com>
Signed-off-by: gcanlin <canlinguosdu@gmail.com>
Co-authored-by: Meihan-chen <jcccx.cmh@gmail.com>
Co-authored-by: Claude Code <noreply@anthropic.com>
Co-authored-by: gcanlin <canlinguosdu@gmail.com>

Labels

ci/build, ready, ready-for-test
