
[Main2Main] Upgrade vllm commit to 0109 #5752

Merged
wangxiyuan merged 1 commit into vllm-project:main from zhangxinyuehfad:main0109
Jan 13, 2026

Conversation

@zhangxinyuehfad
Collaborator

@zhangxinyuehfad zhangxinyuehfad commented Jan 9, 2026

What this PR does / why we need it?

Upgrade vllm commit to 0109 (bde38c11df0ea066a740efe9b77fff5418be45df)

  1. remove init_cached_hf_modules due to [Chore] Try remove init_cached_hf_modules vllm#31786
  2. fix spec_decode e2e test broken by [Perf] Async Scheduling + Speculative Decoding + Structured Outputs vllm#29821
  3. fix vllm.v1.attention.backends.utils due to [Chore] Migrate V0 attention utils vllm#31891
  4. fix self.seq_lens - query_lens so both tensors are on the same device, due to [Attention][1/n] Remove usage of deprecated seq_lens_cpu and num_computed_tokens_cpu CommonAttentionMetadata properties vllm#31773
  5. skip model_runner_v2 e2e test due to '_OpNamespace' '_C' object has no attribute 'get_cuda_view_from_cpu_tensor'
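The device fix in item 4 can be sketched as below. This is a minimal illustration only, assuming `seq_lens` lives on the accelerator while `query_lens` was built on the CPU after the deprecated `*_cpu` metadata properties were removed; the function and variable names are hypothetical, not the actual vllm-ascend code:

```python
import torch

def compute_context_lens(seq_lens: torch.Tensor,
                         query_lens: torch.Tensor) -> torch.Tensor:
    # Subtracting tensors that live on different devices raises a
    # RuntimeError in PyTorch, so move query_lens to seq_lens' device
    # before doing the elementwise subtraction.
    return seq_lens - query_lens.to(seq_lens.device)
```

On a host without an NPU both tensors resolve to CPU and `.to()` is a no-op, so the same code path works in unit tests.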

Does this PR introduce any user-facing change?

How was this patch tested?

@vllm-ascend-ci vllm-ascend-ci added the ready (read for review) and ready-for-test (start test by label for PR) labels Jan 9, 2026
@github-actions github-actions bot added the documentation (Improvements or additions to documentation), ci/build, module:tests and module:ops labels Jan 9, 2026
@github-actions
Contributor

github-actions bot commented Jan 9, 2026

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write the commit message by filling in the PR description to help reviewers and future developers understand it.

If CI fails, you can run linting and testing checks locally according to Contributing and Testing.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request updates the vllm dependency and introduces compatibility code. The changes look reasonable, but there is a critical issue with how the vllm version is being checked. The new code uses vllm_version_is, which performs an exact version match. This is very brittle and will likely break with older or newer patch/minor versions of vllm. I've left comments with suggestions to use version range comparisons for more robust and future-proof code.
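The range comparison the review suggests can be sketched with a small, dependency-free helper. This is a hedged illustration, not the project's actual API: `vllm_version_at_least` and `_parse` are hypothetical names, and the parser deliberately ignores pre-release tags for brevity:

```python
def _parse(version: str) -> tuple:
    # "0.13.0" -> (0, 13, 0); illustrative only, ignores pre-release
    # suffixes such as "rc1" that a real parser would handle.
    return tuple(int(part) for part in version.split("."))

def vllm_version_at_least(installed: str, minimum: str) -> bool:
    # Range check: any installed version >= minimum takes the new code
    # path, unlike an exact-match helper that breaks on the next patch
    # release. Tuple comparison also avoids the string-compare pitfall
    # where "0.13.10" < "0.13.9".
    return _parse(installed) >= _parse(minimum)
```

A production version of this check would typically use `packaging.version.Version`, which implements full PEP 440 ordering.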

@zhangxinyuehfad zhangxinyuehfad force-pushed the main0109 branch 2 times, most recently from 1d83826 to 07020bf Compare January 9, 2026 06:04
@github-actions
Contributor

github-actions bot commented Jan 9, 2026

This pull request has conflicts, please resolve those before we can evaluate the pull request.

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
@wangxiyuan wangxiyuan merged commit f7b9046 into vllm-project:main Jan 13, 2026
21 of 23 checks passed
845473182 pushed a commit to 845473182/vllm-ascend that referenced this pull request Jan 14, 2026
…to eplb_refactor

* 'main' of https://github.com/vllm-project/vllm-ascend:
  [CI] Fix lint CI (vllm-project#5880)
  [Feature] implement eagle spec decoding for model runner v2 (vllm-project#5840)
  [Quantization] Support compressed tensors moe w8a8 int8 dynamic weight (vllm-project#5718)
  [EPLB][Bugfix] Get expert map from layers (vllm-project#5817)
  [Bugfix] Fixed an accuracy problem of sp with eagle3 (vllm-project#5816)
  [P/D] bugfix for p node force free requset (vllm-project#5431)
  [Lint]Style: Convert `example` to `ruff format` (vllm-project#5863)
  [Main2Main] Upgrade vllm commit to 0109 (vllm-project#5752)
  [Bugfix][P/D] fix layerwise connector for decoder tp size > num kv heads (vllm-project#5846)
  [Test][e2e][LoRA] Add more e2e tests to cover scenarios of LoRA (vllm-project#4075)
  [CustomOp][Perf] Merge Q/K split to simplify AscendApplyRotaryEmb for better performance (vllm-project#5799)
  [Lint]Style: Convert `root`, `benchmarks`, `tools` and `docs` to `ruff format` (vllm-project#5843)
  enable ep32 for dispatch_ffn_combine (vllm-project#5787)
aipaes pushed a commit to aipaes/vllm-ascend that referenced this pull request Jan 15, 2026
### What this PR does / why we need it?
Upgrade vllm commit to 0109 (bde38c11df0ea066a740efe9b77fff5418be45df)

1. remove `init_cached_hf_modules` due to vllm-project/vllm#31786
2. fix spec_decode e2e test broken by vllm-project/vllm#29821
3. fix `vllm.v1.attention.backends.utils` due to vllm-project/vllm#31891
4. fix `self.seq_lens - query_lens` so both tensors are on the same device, due to vllm-project/vllm#31773
5. skip model_runner_v2 e2e test due to `'_OpNamespace' '_C' object has no attribute 'get_cuda_view_from_cpu_tensor'`

- vLLM version: v0.13.0
- vLLM main: vllm-project/vllm@2f4e654

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
starmountain1997 pushed a commit to starmountain1997/vllm-ascend that referenced this pull request Jan 31, 2026
ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Feb 28, 2026
maoxx241 pushed a commit to maoxx241/vllm-ascend that referenced this pull request Mar 2, 2026
ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Mar 4, 2026
LCAIZJ pushed a commit to LCAIZJ/vllm-ascend that referenced this pull request Mar 7, 2026
@zhangxinyuehfad zhangxinyuehfad deleted the main0109 branch March 19, 2026 02:09
winson-00178005 added a commit to winson-00178005/vllm-ascend that referenced this pull request Mar 26, 2026
- Remove is_skipped flag from tests/e2e/singlecard/model_runner_v2/test_basic.py
- Test was originally skipped due to get_cuda_view_from_cpu_tensor error (vllm-project#5752)
- Recent model_runner_v2 improvements may have resolved the issue:
  - vllm-project#7110: Added aclgraph support
  - vllm-project#7496: Optimized post_update performance
  - vllm-project#7221: Optimized _topk_log_softmax_kernel performance
- CI will verify if the test now passes successfully

Signed-off-by: hejianping <hejianping7@huawei.com>

Labels

ci/build, documentation, module:ops, module:tests, ready, ready-for-test


3 participants