[Main2Main] Upgrade vllm commit to 0108 #5727

Closed
zhangxinyuehfad wants to merge 2 commits into vllm-project:main from zhangxinyuehfad:main0108

Collaborator

@zhangxinyuehfad zhangxinyuehfad commented Jan 8, 2026

What this PR does / why we need it?

Upgrade vllm commit to 0108 (eac3b96)

  1. Remove init_cached_hf_modules, following vllm#31786 ([Chore] Try remove init_cached_hf_modules).
  2. Skip the spec_decode e2e test, which is broken by vllm#29821 ([Perf] Async Scheduling + Speculative Decoding + Structured Outputs).
  3. Fix usage of vllm.v1.attention.backends.utils, due to vllm#31891 ([Chore] Migrate V0 attention utils).
  4. Skip test_qwen3_next_distributed_mp_full_decode_only_tp4, due to vllm#31773 ([Attention][1/n] Remove usage of deprecated seq_lens_cpu and num_computed_tokens_cpu CommonAttentionMetadata properties); vllm#31958 ([Bugfix] Keep all tensors to be on the same device) will fix it.

Does this PR introduce any user-facing change?

How was this patch tested?

@vllm-ascend-ci vllm-ascend-ci added the ready (read for review) and ready-for-test (start test by label for PR) labels Jan 8, 2026
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request upgrades the vLLM commit hash and introduces compatibility shims for changes in the new version. The changes correctly identify areas that need adaptation, such as the import path for PAD_SLOT_ID and the usage of init_cached_hf_modules.

However, there are some critical and high-severity issues. The compatibility logic relies on an exact version match (vllm_version_is), which is brittle and will likely fail for versions other than the one specified. This needs to be replaced with more robust version range checks. Additionally, the conditional import logic is duplicated across multiple files, which impacts maintainability. I've left specific comments with suggestions to address these points by refactoring the version checking utility and centralizing compatibility imports.
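
As a minimal sketch of what these two suggestions could look like (not the project's actual utilities): a range check in place of an exact version match, and one shared helper for conditional imports. The names `parse_version`, `version_in_range`, and `import_first_available` are hypothetical and do not come from vllm-ascend or vLLM; the sketch also ignores pre-release and local version suffixes, which a real implementation would probably need to handle.

```python
import importlib
import re


def parse_version(version: str) -> tuple:
    """Return the leading numeric release as a tuple, e.g. '0.13.0rc1' -> (0, 13, 0).

    Assumption: pre-release and local suffixes are ignored in this sketch.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unparseable version: {version!r}")
    parts = tuple(int(p) for p in match.group(0).split("."))
    # Pad to three components so that '0.13' compares equal to '0.13.0'.
    return parts + (0,) * (3 - len(parts))


def version_in_range(current: str, minimum=None, maximum=None) -> bool:
    """True if minimum <= current < maximum; either bound may be omitted."""
    cur = parse_version(current)
    if minimum is not None and cur < parse_version(minimum):
        return False
    if maximum is not None and cur >= parse_version(maximum):
        return False
    return True


def import_first_available(attr: str, module_paths):
    """Return `attr` from the first module in `module_paths` that provides it.

    Keeping this in one compatibility module avoids repeating the same
    conditional import in every file that needs the symbol.
    """
    for path in module_paths:
        try:
            module = importlib.import_module(path)
        except ImportError:
            continue  # module does not exist in this version; try the next one
        if hasattr(module, attr):
            return getattr(module, attr)
    raise ImportError(f"{attr!r} not found in any of {list(module_paths)}")
```

A caller would pass the installed version string and the candidate module paths for the symbol it needs (for example the old and new locations of a constant such as PAD_SLOT_ID); the concrete paths would have to be taken from the relevant vLLM commits and are deliberately not guessed here.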

Comment thread tests/ut/worker/test_worker_v1.py
Comment thread vllm_ascend/attention/mla_v1.py Outdated
Comment thread vllm_ascend/worker/worker.py
Comment thread vllm_ascend/attention/mla_v1.py Outdated
@github-actions
Contributor

github-actions bot commented Jan 8, 2026

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by other future PRs.
  • Write the commit message by filling in the PR description, to help reviewers and future developers understand the change.

If CI fails, you can run linting and testing checks locally according to the Contributing and Testing guides.

@zhangxinyuehfad zhangxinyuehfad force-pushed the main0108 branch 2 times, most recently from 9ee9933 to 7fbcbb0 January 9, 2026 03:43
@github-actions
Contributor

github-actions bot commented Jan 9, 2026

This pull request has conflicts, please resolve those before we can evaluate the pull request.

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
Signed-off-by: hfadzxy <starmoon_zhang@163.com>
@wjunLu
Collaborator

wjunLu commented Jan 12, 2026

We don't need this PR anymore; please just keep 0112.

@zhangxinyuehfad zhangxinyuehfad deleted the main0108 branch March 19, 2026 02:08
Labels

ci/build, documentation (Improvements or additions to documentation), module:ops, module:tests, ready (read for review), ready-for-test (start test by label for PR)

3 participants