[Main] Upgrade PTA to 2.9.0 #6112

Merged
wangxiyuan merged 2 commits into vllm-project:main from wjunLu:main-pta
Jan 22, 2026

Conversation

wjunLu (Collaborator) commented Jan 22, 2026

What this PR does / why we need it?

Upgrade PTA to 2.9.0

Does this PR introduce any user-facing change?

How was this patch tested?

github-actions bot added the documentation (Improvements or additions to documentation) label Jan 22, 2026
github-actions bot commented:

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write a commit message that fulfills the PR description, to help reviewers and future developers understand the change.

If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.

wjunLu added the ready (read for review) and ready-for-test (start test by label for PR) labels Jan 22, 2026
gemini-code-assist bot left a comment

Code Review

This pull request upgrades the PyTorch Ascend (PTA) dependency to version 2.9.0. The changes consistently update the version number across various configuration files and documentation. My review identified a couple of outdated documentation links that were updated in this PR but still point to old versions. Correcting these will improve the user experience. Additionally, I noticed another outdated link in docs/source/installation.md that was not part of this PR's changes; you may want to address that as well for consistency.

Comment thread README.md
- Python >= 3.10, < 3.12
- CANN == 8.3.rc2 (Ascend HDK version refers to [here](https://www.hiascend.com/document/detail/zh/canncommercial/83RC2/releasenote/releasenote_0000.html))
- PyTorch == 2.8.0, torch-npu == 2.8.0
- CANN == 8.5.0 (Ascend HDK version refers to [here](https://www.hiascend.com/document/detail/zh/canncommercial/83RC2/releasenote/releasenote_0000.html))

high

The link to the Ascend HDK version release notes is outdated. It still points to the documentation for CANN 8.3.RC2, but the required version has been updated to 8.5.0. This could cause confusion for users trying to set up the correct environment. Please update the link to point to the release notes for CANN 8.5.0.

Suggested change
- CANN == 8.5.0 (Ascend HDK version refers to [here](https://www.hiascend.com/document/detail/zh/canncommercial/83RC2/releasenote/releasenote_0000.html))
- CANN == 8.5.0 (Ascend HDK version refers to [here](https://www.hiascend.com/document/detail/zh/canncommercial/850/releasenote/releasenote_0000.html))

Comment thread README.zh.md
- Python >= 3.10, < 3.12
- CANN == 8.3.rc2 (Ascend HDK 版本参考[这里](https://www.hiascend.com/document/detail/zh/canncommercial/83RC2/releasenote/releasenote_0000.html))
- PyTorch == 2.8.0, torch-npu == 2.8.0
- CANN == 8.5.0 (Ascend HDK 版本参考[这里](https://www.hiascend.com/document/detail/zh/canncommercial/83RC2/releasenote/releasenote_0000.html))

high

The link to the Ascend HDK version release notes is outdated. It still points to the documentation for CANN 8.3.RC2, but the required version has been updated to 8.5.0. This could cause confusion for users. Please update the link to point to the release notes for CANN 8.5.0.

Suggested change
- CANN == 8.5.0 (Ascend HDK 版本参考[这里](https://www.hiascend.com/document/detail/zh/canncommercial/83RC2/releasenote/releasenote_0000.html))
- CANN == 8.5.0 (Ascend HDK 版本参考[这里](https://www.hiascend.com/document/detail/zh/canncommercial/850/releasenote/releasenote_0000.html))
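As a quick sanity check for pinned versions like these, one could imagine a small helper that extracts `name == version` pins from README-style lines and compares them against the installed distributions. This is a hypothetical sketch, not part of this PR: the helper names, the regex, and the sample lines are illustrative, and README names (e.g. "PyTorch", "CANN") do not necessarily match pip distribution names.

```python
# Hypothetical sketch: extract "name == version" pins from README-style
# requirement lines and report mismatches against installed packages.
# Names that are not pip distributions (e.g. "CANN") simply report as
# "not installed".
import re
from importlib import metadata

PIN_RE = re.compile(r"([A-Za-z][\w.-]*)\s*==\s*([\w.]+)")

def parse_pins(lines):
    """Return {name: version} for every 'name == version' pin found."""
    pins = {}
    for line in lines:
        for m in PIN_RE.finditer(line):
            pins[m.group(1)] = m.group(2)
    return pins

def check_installed(pins):
    """Return human-readable mismatch messages; empty list means all pins hold."""
    problems = []
    for name, wanted in pins.items():
        try:
            have = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed (want {wanted})")
            continue
        if have != wanted:
            problems.append(f"{name}: installed {have}, want {wanted}")
    return problems

readme_lines = [
    "- CANN == 8.5.0 (Ascend HDK version refers to the release notes)",
    "- PyTorch == 2.9.0, torch-npu == 2.9.0",
]
pins = parse_pins(readme_lines)
print(pins)
print(check_installed(pins))
```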

wangxiyuan added and then removed the ready-for-test (start test by label for PR) label Jan 22, 2026

wjunLu commented Jan 22, 2026

  • FAILED tests/e2e/singlecard/test_aclgraph_accuracy.py::test_npugraph_ex_res_consistency[cur_case1] - AssertionError: Test0:
E           AssertionError: Test0:
E           output_origin:	'\n\nSelect an assignment template'
E           outputs_gen:	"\n\nYour answer seems reasonable. Find out if you're right!\n\nSign up to access problem solutions.\n\nThat seems reasonable. Find out"

tests/e2e/model_utils.py:51: AssertionError

@wangxiyuan

  • test_npugraph_ex_res_consistency

I think we can update this test together


wjunLu commented Jan 22, 2026

  • test_npugraph_ex_res_consistency

OK, I'll update this

Signed-off-by: wjunLu <wjunlu217@gmail.com>
Signed-off-by: wjunLu <wjunlu217@gmail.com>

wjunLu commented Jan 22, 2026

  • test_npugraph_ex_res_consistency

I think we can update this test together

I have to skip this case first since the outputs are not stable; see the two runs below.

2nd running result is:

============================== slowest durations ===============================
113.63s call     tests/e2e/singlecard/test_aclgraph_accuracy.py::test_piecewise_res_consistency[cur_case1]
97.59s call     tests/e2e/singlecard/test_aclgraph_accuracy.py::test_piecewise_res_consistency[cur_case0]
78.75s call     tests/e2e/singlecard/test_aclgraph_accuracy.py::test_npugraph_ex_res_consistency[cur_case1]
70.35s call     tests/e2e/singlecard/test_aclgraph_accuracy.py::test_npugraph_ex_res_consistency[cur_case0]
64.76s call     tests/e2e/singlecard/test_aclgraph_accuracy.py::test_full_decode_only_res_consistency[cur_case1]
61.77s call     tests/e2e/singlecard/test_aclgraph_accuracy.py::test_full_decode_only_res_consistency[cur_case0]

(12 durations < 0.005s hidden.  Use -vv to show these durations.)
=========================== short test summary info ============================
FAILED tests/e2e/singlecard/test_aclgraph_accuracy.py::test_npugraph_ex_res_consistency[cur_case1] - AssertionError: Test1:
output_origin:	"\n\nYour answer seems reasonable. Find out if you're right!\n\nSign up to access problem solutions.\n\nThat seems reasonable. Find out"
outputs_gen:	"\n\nI'm not sure how to approach this problem. I'm not sure if I should use the law of total probability or if I should use"
============= 1 failed, 5 passed, 2 warnings in 487.13s (0:08:07) ==============

1st running result is:

============================== slowest durations ===============================
120.80s call     tests/e2e/singlecard/test_aclgraph_accuracy.py::test_piecewise_res_consistency[cur_case0]
105.53s call     tests/e2e/singlecard/test_aclgraph_accuracy.py::test_piecewise_res_consistency[cur_case1]
86.36s call     tests/e2e/singlecard/test_aclgraph_accuracy.py::test_npugraph_ex_res_consistency[cur_case1]
70.51s call     tests/e2e/singlecard/test_aclgraph_accuracy.py::test_npugraph_ex_res_consistency[cur_case0]
69.33s call     tests/e2e/singlecard/test_aclgraph_accuracy.py::test_full_decode_only_res_consistency[cur_case1]
67.09s call     tests/e2e/singlecard/test_aclgraph_accuracy.py::test_full_decode_only_res_consistency[cur_case0]

(12 durations < 0.005s hidden.  Use -vv to show these durations.)
=========================== short test summary info ============================
FAILED tests/e2e/singlecard/test_aclgraph_accuracy.py::test_npugraph_ex_res_consistency[cur_case1] - AssertionError: Test0:
output_origin:	'\n\nSelect an assignment template'
outputs_gen:	"\n\nYour answer seems reasonable. Find out if you're right!\n\nSign up to access problem solutions.\n\nThat seems reasonable. Find out"
============= 1 failed, 5 passed, 2 warnings in 519.88s (0:08:39) ==============

But I will find out whether all 3 prompts should keep the same golden result.
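Until the golden outputs are settled, skipping the unstable case might look something like this. This is a sketch only: the marker reason is illustrative, and the stub body stands in for the real assertion (in tests/e2e/model_utils.py) that compares the two outputs.

```python
# Hypothetical sketch of skipping the unstable case; the stub body stands
# in for the real exact-match comparison that was failing intermittently.
import pytest

@pytest.mark.skip(reason="outputs are not stable across runs; "
                         "re-enable once golden results are settled")
def test_npugraph_ex_res_consistency():
    output_origin = "reference-run output"   # placeholder strings
    outputs_gen = "graph-mode output"
    assert outputs_gen == output_origin
```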


wjunLu commented Jan 22, 2026

  • [local error] tests/e2e/singlecard/spec_decode/test_v1_spec_decode.py
======================================================== warnings summary ========================================================
<frozen importlib._bootstrap>:241
  <frozen importlib._bootstrap>:241: DeprecationWarning: builtin type SwigPyPacked has no __module__ attribute

<frozen importlib._bootstrap>:241
  <frozen importlib._bootstrap>:241: DeprecationWarning: builtin type SwigPyObject has no __module__ attribute

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
==================================================== short test summary info =====================================================
FAILED tests/e2e/singlecard/spec_decode/test_v1_spec_decode.py::test_llama_qwen_eagle_acceptance[True-False-None-3-eagle] - assert False
Adding requests: 100%|██████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 38.08it/s]
Processed prompts:   0%|                                | 0/4 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, output: 0.00 toks/s](Worker pid=18518) INFO 01-22 09:20:59 [acl_graph.py:188] Replaying aclgraph
Processed prompts: 100%|██████████████████████| 4/4 [00:04<00:00,  1.25s/it, est. speed input: 33.32 toks/s, output: 62.23 toks/s]
(Worker pid=18518) INFO 01-22 09:21:04 [multiproc_executor.py:707] Parent process exited, terminating worker
acceptance_per_pos: [0.7205882352941176, 0.375, 0.17647058823529413]
golden: [0.74, 0.44, 0.29]
FAILED
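The failure above is a per-position acceptance check: the measured acceptance rates drift below the golden values. A tolerance-based comparison of the two lists could be sketched as follows; the numbers are copied from the log, while the 0.05 absolute tolerance is an assumption for illustration, not the test's actual threshold.

```python
# Compare measured per-position acceptance rates against golden values
# with an absolute tolerance. Numbers are taken from the failing log;
# the 0.05 tolerance is an illustrative assumption.
acceptance_per_pos = [0.7205882352941176, 0.375, 0.17647058823529413]
golden = [0.74, 0.44, 0.29]

def within_tolerance(measured, expected, atol=0.05):
    """True if every position is within atol of its golden value."""
    return all(abs(m - e) <= atol for m, e in zip(measured, expected))

print(within_tolerance(acceptance_per_pos, golden))
# False: positions 1 and 2 miss by ~0.07 and ~0.11 respectively
```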

@wangxiyuan wangxiyuan merged commit a7d781f into vllm-project:main Jan 22, 2026
20 checks passed
845473182 pushed a commit to 845473182/vllm-ascend that referenced this pull request Jan 22, 2026
…to qwen3next_rebase

* 'main' of https://github.com/vllm-project/vllm-ascend: (51 commits)
  [Bugfix] Remove `use_aclgraph` in mtp_proposer and use `use_cuda_graph` (vllm-project#6032)
  [BugFix] fix 3vl dense model load quant weight (vllm-project#6100)
  [CP&SP] Integrate FIA operator in mla_cp._forward_decode (vllm-project#5641)
  [CI][Doc] Upgrade wheel building's CANN to 8.5.0 and update the Docs (vllm-project#6145)
  [CI]Install clang in dokerfile for triton ascend (vllm-project#4409)
  [Main] Upgrade PTA to 2.9.0 (vllm-project#6112)
  [Graph][Fusion] Add QKVNormRope and QKVNormRopeWithBias (vllm-project#5721)
  [P/D][PCP]bugfix pcp force free twice caused logger error (vllm-project#6124)
  [BugFix]converting pa get_workspace back to capturing (vllm-project#5833)
  [CI] optimize lint term (vllm-project#5986)
  [Bugfix] Fix Triton operator usage for multimodal models based on `the mrope_interleaved` parameter (vllm-project#6042)
  [bugfix][npugraph_ex]fix the model output type issue caused by manually modify FX graph (vllm-project#6015)
  [BugFix] Support setting tp=1 for the Eagle draft model to take effect (vllm-project#6097)
  [Misc] Bump mooncake version to v0.3.8.post1 (vllm-project#6110)
  [Feature]Enable DispatchGmmCombineDecode when eagle is moe with w8a8 or not moe [RFC: issue 5476] (vllm-project#5758)
  [bugfix] adapt_remote_request_id (vllm-project#6051)
  [Feature] Add support of new W4A4_LAOS_DYNAMIC quantization method (vllm-project#5143)
  [Feature] Support DSA-CP for Hybrid scenario (vllm-project#5702)
  [CI] Upgrade CANN to 8.5.0 (vllm-project#6070)
  Default enable MLAPO (vllm-project#5952)
  ...
starmountain1997 pushed a commit to starmountain1997/vllm-ascend that referenced this pull request Jan 31, 2026
### What this PR does / why we need it?
Upgrade PTA to 2.9.0

- vLLM version: v0.13.0
- vLLM main:
vllm-project/vllm@d682094

---------

Signed-off-by: wjunLu <wjunlu217@gmail.com>
starmountain1997 pushed a commit to starmountain1997/vllm-ascend that referenced this pull request Jan 31, 2026
ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Feb 28, 2026
maoxx241 pushed a commit to maoxx241/vllm-ascend that referenced this pull request Mar 2, 2026
ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Mar 4, 2026
LCAIZJ pushed a commit to LCAIZJ/vllm-ascend that referenced this pull request Mar 7, 2026