[Main2Main] Upgrade vllm commit to 0120 #6040
Meihan-chen wants to merge 13 commits into vllm-project:main from
Conversation
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
Code Review
This pull request updates the vLLM commit to 0.13.0, necessitating changes to import paths and refactoring of parallel configuration logic. The core changes involve adapting the codebase to the new vLLM library structure, particularly for attention metadata builders and sampling metadata. A significant refactoring in patch_multiproc_executor.py centralizes parallel size calculations and worker initialization, introducing version-aware compatibility. While these changes are essential for the upgrade, some aspects related to type safety, assertion preservation, and conditional logic for version compatibility require attention to ensure robustness and correctness.
```python
# isort: off
if vllm_version_is('0.13.0'):
    from vllm.v1.attention.backends.utils import AttentionCGSupport
    from vllm.v1.attention.backends.mla.common import MLACommonMetadataBuilder  # type: ignore
```
```python
from vllm.v1.core.sched.output import SchedulerOutput
if vllm_version_is('0.13.0'):
    from vllm.v1.attention.backends.utils import AttentionCGSupport
    from vllm.v1.attention.backends.mla.common import MLACommonMetadataBuilder  # type: ignore
```
Similar to mla_v1.py, using # type: ignore for MLACommonMetadataBuilder and AttentionCGSupport might be masking actual type issues or subtle API changes in vLLM 0.13.0. It is recommended to address the underlying type problems directly for better code maintainability and to avoid potential runtime bugs.
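One way to avoid the blanket `# type: ignore` is to feature-detect the import rather than annotate around it. The sketch below is a generic helper (names and the idea of trying multiple module paths are assumptions, not code from this PR); it returns the symbol from the first layout that actually imports, and fails loudly if none do:

```python
import importlib

def import_from_first(module_paths, attr):
    """Return `attr` from the first module in `module_paths` that imports.

    Trying each candidate layout in order avoids a blanket
    "# type: ignore" and raises a clear error if no layout
    provides the symbol, instead of deferring the failure.
    """
    last_err = None
    for path in module_paths:
        try:
            module = importlib.import_module(path)
            return getattr(module, attr)
        except (ImportError, AttributeError) as err:
            last_err = err
    raise ImportError(
        f"{attr!r} not found in any of {module_paths}") from last_err
```

A caller could then resolve `MLACommonMetadataBuilder` by listing the pre- and post-refactor module paths in preference order, keeping one definition site instead of scattered version checks.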
```python
self.futures_queue = deque[tuple[FutureWrapper, Callable]]()
self._post_init_executor()
```
The initialization of `self.futures_queue` and the call to `self._post_init_executor()` have been moved to this position. It is critical to ensure that `_post_init_executor()` runs at the correct stage of the executor's initialization, especially if it has side effects or dependencies that must be satisfied before `success = True` is set.
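The ordering concern can be illustrated with a minimal sketch (class and method names here are hypothetical stand-ins, not the actual executor code): keep `success = True` as the last statement of `__init__` so that a failure inside the post-init hook still tears down the partially constructed object.

```python
from collections import deque

class ExecutorInitSketch:
    """Sketch of the init-ordering pattern under discussion."""

    def __init__(self):
        success = False
        try:
            self.futures_queue = deque()  # state the hook may rely on
            self._post_init_executor()    # side effects happen here
            success = True                # only after every init step
        finally:
            if not success:
                self.shutdown()           # clean up the partial object

    def _post_init_executor(self):
        # Stand-in for version-specific worker wiring.
        self.ready = True

    def shutdown(self):
        self.ready = False
```

If `_post_init_executor()` raises, `success` is never set and cleanup runs; if it is accidentally moved after `success = True`, a failing hook would leave a half-initialized executor marked as healthy.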
```python
if not vllm_version_is('0.13.0'):
    process_kwargs["is_driver_worker"] = is_driver_worker
```
The conditional logic if not vllm_version_is('0.13.0'): implies that is_driver_worker is needed for vLLM versions other than 0.13.0. If is_driver_worker is a new parameter introduced in a vLLM version after 0.13.0, this condition might be incorrect. Please clarify the exact vLLM version(s) for which is_driver_worker is required/not required to ensure proper compatibility.
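One way to sidestep the exact-version question is to gate the kwarg on the callee's signature rather than on a version string. The helper below is an illustrative sketch (the function and parameter names are assumptions; `worker_init` stands in for the target constructor), so versions after 0.13.0 that also accept the parameter keep working without another version branch:

```python
import inspect

def build_worker_kwargs(worker_init, is_driver_worker):
    """Pass `is_driver_worker` only if the callee actually accepts it.

    Inspecting the signature ties the conditional to the feature
    itself instead of to one pinned version string.
    """
    kwargs = {}
    params = inspect.signature(worker_init).parameters
    if "is_driver_worker" in params:
        kwargs["is_driver_worker"] = is_driver_worker
    return kwargs
```

Signature inspection has a small one-time cost, but for process startup code that runs once per worker it keeps the compatibility logic self-describing.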
This pull request has conflicts, please resolve those before we can evaluate the pull request.
Signed-off-by: wjunLu <wjunlu217@gmail.com>
Signed-off-by: Meihan-chen <jcccx.cmh@gmail.com>
What this PR does / why we need it?
Modify import paths due to the refactors:
- [Model Runner V2] Refactor Sampler vllm#32245
- [4/N][Attention] Move MLA common to model_executor vllm#32060
Test result: https://github.com/vllm-project/vllm-ascend/actions/runs/21034239336/job/60490156965?pr=5913
Fix:
- `WorkerProc.__init__() missing 1 required positional argument: 'is_driver_worker'` due to [TPU][Core] Enable Pipeline Parallelism on TPU backend vllm#28506: https://github.com/vllm-project/vllm-ascend/actions/runs/21156263050/job/60841668755?5569
- `skip_compiled` param in `set_forward_context` due to [Core] Whisper support `torch.compile` vllm#30385
- `tests/ut/spec_decode/test_eagle_proposer.py` due to the "feat: spec decode with draft models" vllm#24322 change: `self.max_num_tokens = vllm_config.scheduler_config.max_num_batched_tokens + max_batch_size`

Does this PR introduce any user-facing change?
How was this patch tested?