
[Main2Main] Upgrade vllm commit to 0120 #6040

Closed
Meihan-chen wants to merge 13 commits into vllm-project:main from Meihan-chen:main0120

Conversation


@Meihan-chen Meihan-chen commented Jan 20, 2026

What this PR does / why we need it?

  1. ✅ Upgrade vllm commit to: 0115 (8471b27df97c3eb79f891802fc0e858f8f7ac6a0)
    Modify import paths due to the refactors:
    [Model Runner V2] Refactor Sampler vllm#32245
    [4/N][Attention] Move MLA common to model_executor vllm#32060
    Test result: https://github.com/vllm-project/vllm-ascend/actions/runs/21034239336/job/60490156965?pr=5913
  2. ✅ Upgrade vllm commit to: 0119 (9a1f16da1e423ede2c2f52a9850cbfbb39cefe96)
    Fix `WorkerProc.__init__() missing 1 required positional argument: 'is_driver_worker'` caused by [TPU][Core] Enable Pipeline Parallelism on TPU backend vllm#28506
    https://github.com/vllm-project/vllm-ascend/actions/runs/21156263050/job/60841668755?5569
  3. ✅ Upgrade vllm commit to: 0120 (148117ea2e689cd43df4be6892671a17cdae5833)
    1. Add a skip_compiled param in set_forward_context due to [Core] Whisper support torch.compile vllm#30385
    2. Modify tests/ut/spec_decode/test_eagle_proposer.py due to feat: spec decode with draft models vllm#24322
      change to self.max_num_tokens = vllm_config.scheduler_config.max_num_batched_tokens + max_batch_size
    3. Modify UT import paths due to the refactors: [4/N][Attention] Move MLA common to model_executor vllm#32060

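The adaptations in item 3 can be sketched as version-gated helpers. This is a minimal, hypothetical illustration: `vllm_version_is` is reimplemented here as a plain string compare so the snippet is self-contained, and the function names are illustrative, not the actual vllm-ascend API.

```python
# Hypothetical sketch of the version-gated adaptations described above.
# Assumption: vLLM 0.13.0 is the last release *without* the new behavior,
# matching the `if not vllm_version_is('0.13.0')` checks used in this PR.

def vllm_version_is(target: str, installed: str) -> bool:
    """Stand-in for vllm-ascend's helper: True when versions match exactly."""
    return installed == target

def forward_context_kwargs(installed: str) -> dict:
    # vllm#30385 adds a `skip_compiled` parameter to set_forward_context;
    # only pass it on versions newer than 0.13.0 that accept it.
    if not vllm_version_is("0.13.0", installed):
        return {"skip_compiled": False}
    return {}

def max_num_tokens(max_num_batched_tokens: int, max_batch_size: int,
                   installed: str) -> int:
    # vllm#24322 changes the eagle proposer's buffer sizing to
    # max_num_batched_tokens + max_batch_size on newer versions.
    if vllm_version_is("0.13.0", installed):
        return max_num_batched_tokens
    return max_num_batched_tokens + max_batch_size

print(forward_context_kwargs("0.13.0"))    # {}
print(max_num_tokens(8192, 256, "0.14.0")) # 8448
```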
Does this PR introduce any user-facing change?

How was this patch tested?

@github-actions github-actions bot added the documentation (Improvements or additions to documentation) and ci/build labels on Jan 20, 2026
@github-actions

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write the commit message by filling in the PR description to help reviewers and future developers understand.

If CI fails, you can run linting and testing checks locally according to Contributing and Testing.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request updates the vLLM commit to 0.13.0, necessitating changes to import paths and refactoring of parallel configuration logic. The core changes involve adapting the codebase to the new vLLM library structure, particularly for attention metadata builders and sampling metadata. A significant refactoring in patch_multiproc_executor.py centralizes parallel size calculations and worker initialization, introducing version-aware compatibility. While these changes are essential for the upgrade, some aspects related to type safety, assertion preservation, and conditional logic for version compatibility require attention to ensure robustness and correctness.

# isort: off
if vllm_version_is('0.13.0'):
    from vllm.v1.attention.backends.utils import AttentionCGSupport
    from vllm.v1.attention.backends.mla.common import MLACommonMetadataBuilder  # type: ignore

high

The use of # type: ignore can mask actual type issues that might arise from API changes in vLLM 0.13.0. It is best practice to resolve type incompatibilities explicitly or provide a more specific reason for ignoring if it's a known, harmless difference, to prevent potential runtime errors.

from vllm.v1.core.sched.output import SchedulerOutput
if vllm_version_is('0.13.0'):
    from vllm.v1.attention.backends.utils import AttentionCGSupport
    from vllm.v1.attention.backends.mla.common import MLACommonMetadataBuilder  # type: ignore

high

Similar to mla_v1.py, using # type: ignore for MLACommonMetadataBuilder and AttentionCGSupport might be masking actual type issues or subtle API changes in vLLM 0.13.0. It is recommended to address the underlying type problems directly for better code maintainability and to avoid potential runtime bugs.
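One way to act on this suggestion is to make the fallback order explicit rather than silencing the checker. The sketch below is a generic pattern, demonstrated with stdlib names because vLLM is not assumed to be installed; with vLLM present, the candidates would be the post- and pre-refactor dotted paths of `MLACommonMetadataBuilder`.

```python
import importlib

def import_first(*candidates: str):
    """Return the first importable attribute among dotted candidate paths.

    Makes the old/new-location fallback explicit instead of hiding a
    moved import behind a blanket `# type: ignore`: if every candidate
    fails, a genuine API break surfaces as an ImportError.
    """
    last_err = None
    for dotted in candidates:
        module_path, _, attr = dotted.rpartition(".")
        try:
            module = importlib.import_module(module_path)
            return getattr(module, attr)
        except (ImportError, AttributeError) as err:
            last_err = err
    raise ImportError(f"none of {candidates} could be imported") from last_err

# Demonstrated with stdlib names; real usage would list the new and old
# vLLM locations of MLACommonMetadataBuilder in preference order.
print(import_first("no.such.module.Thing", "collections.OrderedDict"))
```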

Comment on lines +117 to +118
self.futures_queue = deque[tuple[FutureWrapper, Callable]]()
self._post_init_executor()

high

The initialization of self.futures_queue and the call to self._post_init_executor() are moved to this position. It is critical to ensure that _post_init_executor() is called at the correct stage of the executor's initialization, especially if it has side effects or dependencies that must be met before success = True is set.
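The pattern under discussion can be sketched as follows. This is a self-contained illustration, not the executor's actual code: `FutureWrapper` here is a bare stand-in for vLLM's class of the same name, and `submit`/`drain` are hypothetical names for the enqueue and completion sides.

```python
# Sketch of the futures-queue pattern: pending RPCs are stored as
# (future, completion-callback) pairs and completed in FIFO order.
from collections import deque
from concurrent.futures import Future
from typing import Callable

class FutureWrapper(Future):
    """Stand-in for vLLM's FutureWrapper."""

futures_queue: deque[tuple[FutureWrapper, Callable]] = deque()

def submit() -> FutureWrapper:
    # Enqueue a pending request; the callback completes its future.
    fut = FutureWrapper()
    futures_queue.append((fut, fut.set_result))
    return fut

def drain(results) -> None:
    # Complete pending futures in submission order.
    for value in results:
        _, callback = futures_queue.popleft()
        callback(value)

f1, f2 = submit(), submit()
drain(["a", "b"])
print(f1.result(), f2.result())  # a b
```

The reviewer's point maps onto this sketch: anything like `_post_init_executor()` must run only after the queue (and any other state it reads) is fully initialized.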

Comment on lines +182 to +183
if not vllm_version_is('0.13.0'):
    process_kwargs["is_driver_worker"] = is_driver_worker

high

The conditional logic if not vllm_version_is('0.13.0'): implies that is_driver_worker is needed for vLLM versions other than 0.13.0. If is_driver_worker is a new parameter introduced in a vLLM version after 0.13.0, this condition might be incorrect. Please clarify the exact vLLM version(s) for which is_driver_worker is required/not required to ensure proper compatibility.
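The intended direction of the gate can be shown in isolation. This hedged sketch assumes, per the PR description, that `is_driver_worker` was added to `WorkerProc.__init__` on vLLM main (vllm#28506) after the 0.13.0 release, so the kwarg is passed on every version except 0.13.0; the helper and base kwargs here are illustrative.

```python
def vllm_version_is(target: str, installed: str) -> bool:
    """Stand-in for vllm-ascend's helper: True when versions match exactly."""
    return installed == target

def worker_process_kwargs(is_driver_worker: bool, installed: str) -> dict:
    process_kwargs: dict = {"rank": 0}  # illustrative base kwargs
    # 0.13.0 predates the new required argument (vllm#28506);
    # every other supported version expects it.
    if not vllm_version_is("0.13.0", installed):
        process_kwargs["is_driver_worker"] = is_driver_worker
    return process_kwargs

print(worker_process_kwargs(True, "0.13.0"))
print(worker_process_kwargs(True, "0.13.1.dev"))
```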

@wxsIcey wxsIcey added the ready (read for review) and ready-for-test (start test by label for PR) labels on Jan 20, 2026
@Meihan-chen Meihan-chen force-pushed the main0120 branch 5 times, most recently from 186bc57 to de7731b Compare January 21, 2026 09:49
@github-actions

This pull request has conflicts, please resolve those before we can evaluate the pull request.


wjunLu and others added 8 commits January 26, 2026 17:05
Signed-off-by: wjunLu <wjunlu217@gmail.com>
Signed-off-by: Meihan-chen <jcccx.cmh@gmail.com>
Signed-off-by: Meihan-chen <jcccx.cmh@gmail.com>
Signed-off-by: Meihan-chen <jcccx.cmh@gmail.com>
Signed-off-by: Meihan-chen <jcccx.cmh@gmail.com>
Signed-off-by: Meihan-chen <jcccx.cmh@gmail.com>
Signed-off-by: Meihan-chen <jcccx.cmh@gmail.com>
Signed-off-by: Meihan-chen <jcccx.cmh@gmail.com>
Signed-off-by: Meihan-chen <jcccx.cmh@gmail.com>
Signed-off-by: Meihan-chen <jcccx.cmh@gmail.com>
Signed-off-by: Meihan-chen <jcccx.cmh@gmail.com>
Signed-off-by: Meihan-chen <jcccx.cmh@gmail.com>
Signed-off-by: Meihan-chen <jcccx.cmh@gmail.com>

Labels

ci/build · documentation (Improvements or additions to documentation) · ready (read for review) · ready-for-test (start test by label for PR)

Projects

None yet


4 participants