[Main] Upgrade PTA to 2.9.0 #6112

Merged
wangxiyuan merged 2 commits into vllm-project:main from wjunLu:main-pta
Jan 22, 2026

Conversation

wjunLu (Collaborator) commented Jan 22, 2026

What this PR does / why we need it?

Upgrade PTA to 2.9.0

Does this PR introduce any user-facing change?

How was this patch tested?

github-actions bot added the documentation (Improvements or additions to documentation) label Jan 22, 2026
github-actions bot commented:

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write a commit message that fulfills the PR description, to help reviewers and future developers understand the change.

If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.

wjunLu added the ready (read for review) and ready-for-test (start test by label for PR) labels Jan 22, 2026
gemini-code-assist bot left a comment

Code Review

This pull request upgrades the PyTorch Ascend (PTA) dependency to version 2.9.0. The changes consistently update the version number across various configuration files and documentation. My review identified a couple of outdated documentation links that were updated in this PR but still point to old versions. Correcting these will improve the user experience. Additionally, I noticed another outdated link in docs/source/installation.md that was not part of this PR's changes; you may want to address that as well for consistency.

Comment thread README.md
- Python >= 3.10, < 3.12
- CANN == 8.3.rc2 (Ascend HDK version refers to [here](https://www.hiascend.com/document/detail/zh/canncommercial/83RC2/releasenote/releasenote_0000.html))
- PyTorch == 2.8.0, torch-npu == 2.8.0
- CANN == 8.5.0 (Ascend HDK version refers to [here](https://www.hiascend.com/document/detail/zh/canncommercial/83RC2/releasenote/releasenote_0000.html))

high

The link to the Ascend HDK version release notes is outdated. It still points to the documentation for CANN 8.3.RC2, but the required version has been updated to 8.5.0. This could cause confusion for users trying to set up the correct environment. Please update the link to point to the release notes for CANN 8.5.0.

Suggested change
- CANN == 8.5.0 (Ascend HDK version refers to [here](https://www.hiascend.com/document/detail/zh/canncommercial/83RC2/releasenote/releasenote_0000.html))
- CANN == 8.5.0 (Ascend HDK version refers to [here](https://www.hiascend.com/document/detail/zh/canncommercial/850/releasenote/releasenote_0000.html))

Comment thread README.zh.md
- Python >= 3.10, < 3.12
- CANN == 8.3.rc2 (Ascend HDK 版本参考[这里](https://www.hiascend.com/document/detail/zh/canncommercial/83RC2/releasenote/releasenote_0000.html))
- PyTorch == 2.8.0, torch-npu == 2.8.0
- CANN == 8.5.0 (Ascend HDK 版本参考[这里](https://www.hiascend.com/document/detail/zh/canncommercial/83RC2/releasenote/releasenote_0000.html))

high

The link to the Ascend HDK version release notes is outdated. It still points to the documentation for CANN 8.3.RC2, but the required version has been updated to 8.5.0. This could cause confusion for users. Please update the link to point to the release notes for CANN 8.5.0.

Suggested change
- CANN == 8.5.0 (Ascend HDK 版本参考[这里](https://www.hiascend.com/document/detail/zh/canncommercial/83RC2/releasenote/releasenote_0000.html))
- CANN == 8.5.0 (Ascend HDK 版本参考[这里](https://www.hiascend.com/document/detail/zh/canncommercial/850/releasenote/releasenote_0000.html))
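As a quick sanity check for pinned versions like these, one could imagine a small helper that extracts `name == version` pins from README-style lines and compares them against the installed distributions. This is a hypothetical sketch, not part of this PR: the helper names, the regex, and the sample lines are illustrative, and README names (e.g. "PyTorch", "CANN") do not necessarily match pip distribution names.

```python
# Hypothetical sketch: extract "name == version" pins from README-style
# requirement lines and report mismatches against installed packages.
# Names that are not pip distributions (e.g. "CANN") simply report as
# "not installed".
import re
from importlib import metadata

PIN_RE = re.compile(r"([A-Za-z][\w.-]*)\s*==\s*([\w.]+)")

def parse_pins(lines):
    """Return {name: version} for every 'name == version' pin found."""
    pins = {}
    for line in lines:
        for m in PIN_RE.finditer(line):
            pins[m.group(1)] = m.group(2)
    return pins

def check_installed(pins):
    """Return human-readable mismatch messages; empty list means all pins hold."""
    problems = []
    for name, wanted in pins.items():
        try:
            have = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed (want {wanted})")
            continue
        if have != wanted:
            problems.append(f"{name}: installed {have}, want {wanted}")
    return problems

readme_lines = [
    "- CANN == 8.5.0 (Ascend HDK version refers to the release notes)",
    "- PyTorch == 2.9.0, torch-npu == 2.9.0",
]
pins = parse_pins(readme_lines)
print(pins)
print(check_installed(pins))
```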

wangxiyuan added and then removed the ready-for-test (start test by label for PR) label Jan 22, 2026

wjunLu commented Jan 22, 2026

  • FAILED tests/e2e/singlecard/test_aclgraph_accuracy.py::test_npugraph_ex_res_consistency[cur_case1] - AssertionError: Test0:
E           AssertionError: Test0:
E           output_origin:	'\n\nSelect an assignment template'
E           outputs_gen:	"\n\nYour answer seems reasonable. Find out if you're right!\n\nSign up to access problem solutions.\n\nThat seems reasonable. Find out"

tests/e2e/model_utils.py:51: AssertionError

@wangxiyuan

  • test_npugraph_ex_res_consistency

I think we can update this test together


wjunLu commented Jan 22, 2026

  • test_npugraph_ex_res_consistency

OK, I'll update this

Signed-off-by: wjunLu <wjunlu217@gmail.com>
Signed-off-by: wjunLu <wjunlu217@gmail.com>

wjunLu commented Jan 22, 2026

  • test_npugraph_ex_res_consistency

I think we can update this test together

I have to skip this case first since the outputs are not stable; see the two runs below.

2nd running result is:

============================== slowest durations ===============================
113.63s call     tests/e2e/singlecard/test_aclgraph_accuracy.py::test_piecewise_res_consistency[cur_case1]
97.59s call     tests/e2e/singlecard/test_aclgraph_accuracy.py::test_piecewise_res_consistency[cur_case0]
78.75s call     tests/e2e/singlecard/test_aclgraph_accuracy.py::test_npugraph_ex_res_consistency[cur_case1]
70.35s call     tests/e2e/singlecard/test_aclgraph_accuracy.py::test_npugraph_ex_res_consistency[cur_case0]
64.76s call     tests/e2e/singlecard/test_aclgraph_accuracy.py::test_full_decode_only_res_consistency[cur_case1]
61.77s call     tests/e2e/singlecard/test_aclgraph_accuracy.py::test_full_decode_only_res_consistency[cur_case0]

(12 durations < 0.005s hidden.  Use -vv to show these durations.)
=========================== short test summary info ============================
FAILED tests/e2e/singlecard/test_aclgraph_accuracy.py::test_npugraph_ex_res_consistency[cur_case1] - AssertionError: Test1:
output_origin:	"\n\nYour answer seems reasonable. Find out if you're right!\n\nSign up to access problem solutions.\n\nThat seems reasonable. Find out"
outputs_gen:	"\n\nI'm not sure how to approach this problem. I'm not sure if I should use the law of total probability or if I should use"
============= 1 failed, 5 passed, 2 warnings in 487.13s (0:08:07) ==============

1st running result is:

============================== slowest durations ===============================
120.80s call     tests/e2e/singlecard/test_aclgraph_accuracy.py::test_piecewise_res_consistency[cur_case0]
105.53s call     tests/e2e/singlecard/test_aclgraph_accuracy.py::test_piecewise_res_consistency[cur_case1]
86.36s call     tests/e2e/singlecard/test_aclgraph_accuracy.py::test_npugraph_ex_res_consistency[cur_case1]
70.51s call     tests/e2e/singlecard/test_aclgraph_accuracy.py::test_npugraph_ex_res_consistency[cur_case0]
69.33s call     tests/e2e/singlecard/test_aclgraph_accuracy.py::test_full_decode_only_res_consistency[cur_case1]
67.09s call     tests/e2e/singlecard/test_aclgraph_accuracy.py::test_full_decode_only_res_consistency[cur_case0]

(12 durations < 0.005s hidden.  Use -vv to show these durations.)
=========================== short test summary info ============================
FAILED tests/e2e/singlecard/test_aclgraph_accuracy.py::test_npugraph_ex_res_consistency[cur_case1] - AssertionError: Test0:
output_origin:	'\n\nSelect an assignment template'
outputs_gen:	"\n\nYour answer seems reasonable. Find out if you're right!\n\nSign up to access problem solutions.\n\nThat seems reasonable. Find out"
============= 1 failed, 5 passed, 2 warnings in 519.88s (0:08:39) ==============

But I will find out whether all 3 prompts should keep the same golden result.
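Until the golden outputs are settled, skipping the unstable case might look something like this. This is a sketch only: the marker reason is illustrative, and the stub body stands in for the real assertion (in tests/e2e/model_utils.py) that compares the two outputs.

```python
# Hypothetical sketch of skipping the unstable case; the stub body stands
# in for the real exact-match comparison that was failing intermittently.
import pytest

@pytest.mark.skip(reason="outputs are not stable across runs; "
                         "re-enable once golden results are settled")
def test_npugraph_ex_res_consistency():
    output_origin = "reference-run output"   # placeholder strings
    outputs_gen = "graph-mode output"
    assert outputs_gen == output_origin
```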


wjunLu commented Jan 22, 2026

  • [local error] tests/e2e/singlecard/spec_decode/test_v1_spec_decode.py
======================================================== warnings summary ========================================================
<frozen importlib._bootstrap>:241
  <frozen importlib._bootstrap>:241: DeprecationWarning: builtin type SwigPyPacked has no __module__ attribute

<frozen importlib._bootstrap>:241
  <frozen importlib._bootstrap>:241: DeprecationWarning: builtin type SwigPyObject has no __module__ attribute

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
==================================================== short test summary info =====================================================
FAILED tests/e2e/singlecard/spec_decode/test_v1_spec_decode.py::test_llama_qwen_eagle_acceptance[True-False-None-3-eagle] - assert False
Adding requests: 100%|██████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 38.08it/s]
Processed prompts:   0%|                                | 0/4 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, output: 0.00 toks/s](Worker pid=18518) INFO 01-22 09:20:59 [acl_graph.py:188] Replaying aclgraph
Processed prompts: 100%|██████████████████████| 4/4 [00:04<00:00,  1.25s/it, est. speed input: 33.32 toks/s, output: 62.23 toks/s]
(Worker pid=18518) INFO 01-22 09:21:04 [multiproc_executor.py:707] Parent process exited, terminating worker
acceptance_per_pos: [0.7205882352941176, 0.375, 0.17647058823529413]
golden: [0.74, 0.44, 0.29]
FAILED
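The failure above is a per-position acceptance check: the measured acceptance rates drift below the golden values. A tolerance-based comparison of the two lists could be sketched as follows; the numbers are copied from the log, while the 0.05 absolute tolerance is an assumption for illustration, not the test's actual threshold.

```python
# Compare measured per-position acceptance rates against golden values
# with an absolute tolerance. Numbers are taken from the failing log;
# the 0.05 tolerance is an illustrative assumption.
acceptance_per_pos = [0.7205882352941176, 0.375, 0.17647058823529413]
golden = [0.74, 0.44, 0.29]

def within_tolerance(measured, expected, atol=0.05):
    """True if every position is within atol of its golden value."""
    return all(abs(m - e) <= atol for m, e in zip(measured, expected))

print(within_tolerance(acceptance_per_pos, golden))
# False: positions 1 and 2 miss by ~0.07 and ~0.11 respectively
```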

@wangxiyuan wangxiyuan merged commit a7d781f into vllm-project:main Jan 22, 2026
20 checks passed
845473182 pushed a commit to 845473182/vllm-ascend that referenced this pull request Jan 22, 2026
…to qwen3next_rebase

* 'main' of https://github.com/vllm-project/vllm-ascend: (51 commits)
  [Bugfix] Remove `use_aclgraph` in mtp_proposer and use `use_cuda_graph` (vllm-project#6032)
  [BugFix] fix 3vl dense model load quant weight (vllm-project#6100)
  [CP&SP] Integrate FIA operator in mla_cp._forward_decode (vllm-project#5641)
  [CI][Doc] Upgrade wheel building's CANN to 8.5.0 and update the Docs (vllm-project#6145)
  [CI]Install clang in dokerfile for triton ascend (vllm-project#4409)
  [Main] Upgrade PTA to 2.9.0 (vllm-project#6112)
  [Graph][Fusion] Add QKVNormRope and QKVNormRopeWithBias (vllm-project#5721)
  [P/D][PCP]bugfix pcp force free twice caused logger error (vllm-project#6124)
  [BugFix]converting pa get_workspace back to capturing (vllm-project#5833)
  [CI] optimize lint term (vllm-project#5986)
  [Bugfix] Fix Triton operator usage for multimodal models based on `the mrope_interleaved` parameter (vllm-project#6042)
  [bugfix][npugraph_ex]fix the model output type issue caused by manually modify FX graph (vllm-project#6015)
  [BugFix] Support setting tp=1 for the Eagle draft model to take effect (vllm-project#6097)
  [Misc] Bump mooncake version to v0.3.8.post1 (vllm-project#6110)
  [Feature]Enable DispatchGmmCombineDecode when eagle is moe with w8a8 or not moe [RFC: issue 5476] (vllm-project#5758)
  [bugfix] adapt_remote_request_id (vllm-project#6051)
  [Feature] Add support of new W4A4_LAOS_DYNAMIC quantization method (vllm-project#5143)
  [Feature] Support DSA-CP for Hybrid scenario (vllm-project#5702)
  [CI] Upgrade CANN to 8.5.0 (vllm-project#6070)
  Default enable MLAPO (vllm-project#5952)
  ...
starmountain1997 pushed a commit to starmountain1997/vllm-ascend that referenced this pull request Jan 31, 2026
### What this PR does / why we need it?
Upgrade PTA to 2.9.0

- vLLM version: v0.13.0
- vLLM main:
vllm-project/vllm@d682094

---------

Signed-off-by: wjunLu <wjunlu217@gmail.com>
starmountain1997 pushed a commit to starmountain1997/vllm-ascend that referenced this pull request Jan 31, 2026
ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Feb 28, 2026
maoxx241 pushed a commit to maoxx241/vllm-ascend that referenced this pull request Mar 2, 2026
ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Mar 4, 2026
LCAIZJ pushed a commit to LCAIZJ/vllm-ascend that referenced this pull request Mar 7, 2026