[Main2Main] Upgrade vllm commit to 0123 (#6169)
Conversation
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.
Code Review
This pull request upgrades the vLLM dependency to a newer commit and adapts the codebase to the corresponding upstream changes. The modifications span documentation, tests, and core logic, primarily addressing API changes in vLLM. Key adaptations include conditional imports for moved classes such as MLACommonMetadataBuilder and AttentionMetadataBuilder, updated function signatures for context management and worker-process creation, and support for new features such as M-RoPE. The changes appear correctly implemented for version compatibility and maintain functionality. The adaptations are thorough and well executed, with no high- or critical-severity issues found.
This pull request has conflicts; please resolve them before we can evaluate the pull request.
Signed-off-by: wjunLu <wjunlu217@gmail.com>
Signed-off-by: Meihan-chen <jcccx.cmh@gmail.com>
      total_steps = max_tokens + 1  # this includes the 1 and 2 above
    - expected_exec_model = (total_steps + 1) * dp_size
    + # vllm enables the async scheduler by default, which takes 1 more step
    + expected_exec_model = (total_steps + 1 + 1) * dp_size
0.14.0 already enables the async scheduler. Why was it fine before?
To adapt to previous upgrades, it was set to false in https://github.com/vllm-project/vllm-ascend/pull/6169/changes/BASE..181a3a373cc9e8f37908c3d14ae83d60dd8b4f6e#diff-0e3dd1c713072facdea0241f1b908c2121957baf6f0400fd71f9b8224f2c98d1L112; this config should be removed once the vLLM version is >= 0.14.0.
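The step accounting in the test above can be sketched as follows; `expected_exec_model_calls` is a hypothetical helper, not vllm-ascend code, and `max_tokens`/`dp_size` are stand-ins for the test's values:

```python
def expected_exec_model_calls(max_tokens: int, dp_size: int,
                              async_scheduler: bool) -> int:
    """Sketch of the test expectation: one step per generated token plus one
    prefill step, one extra finishing step, and (with vLLM's async scheduler,
    the default since 0.14.0) one additional step, scaled by the DP size."""
    total_steps = max_tokens + 1          # prefill + decode steps
    extra = 1 + (1 if async_scheduler else 0)
    return (total_steps + extra) * dp_size

# With the async scheduler on, each DP rank runs exactly one extra model step.
sync_calls = expected_exec_model_calls(16, dp_size=2, async_scheduler=False)
async_calls = expected_exec_model_calls(16, dp_size=2, async_scheduler=True)
```

This makes explicit why the expectation changed from `(total_steps + 1) * dp_size` to `(total_steps + 1 + 1) * dp_size` once the async scheduler is no longer disabled.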
    "random_input_len": 128,
    "max_concurrency": 40,
    "random_output_len": 100,
    "temperature": 0.0,
I think we should add it to the benchmark module? @wxsIcey
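Since vllm-project/vllm#32723 removed the default temperature from `vllm bench serve`, benchmark cases now have to set it explicitly. A minimal sketch of how such a case might be validated; `benchmark_case` and `validate_case` are hypothetical names, not part of any benchmark module:

```python
# Hypothetical benchmark-case fragment: after vllm-project/vllm#32723,
# greedy decoding must be requested explicitly for reproducible runs.
benchmark_case = {
    "random_input_len": 128,
    "max_concurrency": 40,
    "random_output_len": 100,
    "temperature": 0.0,   # explicit: greedy sampling
}

def validate_case(case: dict) -> None:
    # Fail fast if a case forgets the now-mandatory temperature key.
    if "temperature" not in case:
        raise ValueError("temperature must be set explicitly (vllm#32723)")

validate_case(benchmark_case)
```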
    global_start_rank = self.local_world_size * self.parallel_config.node_rank_within_dp
    for local_rank in range(self.local_world_size):
        global_rank = global_start_rank + local_rank
        is_driver_worker = self._is_driver_worker(global_rank)
Does this change work with 0.14.0?
`is_driver_worker` is a parameter of vllm-ascend's own `make_worker_process` method, which always accepts it, while `AscendWorkerProc` conditionally passes it on to vLLM's native `WorkerProc.__init__`, which only accepts it in newer versions; so only https://github.com/vllm-project/vllm-ascend/pull/6169/changes/BASE..181a3a373cc9e8f37908c3d14ae83d60dd8b4f6e#diff-b25d0746a873edc4bc07a475dd926377fdda5e51a437fc329123eea9202cfdaaR181 needs version checking.
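The version-gating pattern described here can be sketched as below; `build_worker_kwargs` is a hypothetical helper, not the actual vllm-ascend implementation, and it only forwards `is_driver_worker` when the installed vLLM is new enough to accept it (vllm-project/vllm#28506):

```python
def build_worker_kwargs(vllm_version: str, is_driver_worker: bool) -> dict:
    """Build the extra kwargs to forward to WorkerProc.__init__, passing
    is_driver_worker only for vLLM versions that define the parameter."""
    kwargs = {}
    # Crude (major, minor) comparison; real code would use packaging.version.
    if tuple(int(p) for p in vllm_version.split(".")[:2]) >= (0, 14):
        kwargs["is_driver_worker"] = is_driver_worker
    return kwargs

old = build_worker_kwargs("0.13.2", True)   # parameter not yet supported
new = build_worker_kwargs("0.14.0", True)   # parameter accepted
```

The point of the pattern is that the call site stays a single `WorkerProc.__init__(..., **kwargs)` regardless of version.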
      import torch
      from vllm.triton_utils import tl, triton
    - from vllm.v1.worker.gpu.sample.metadata import SamplingMetadata
    + from vllm.v1.sample.metadata import SamplingMetadata
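Since `SamplingMetadata` moved between module paths across recent vLLM commits (vllm-project/vllm#32245), a conditional import can keep code working on both layouts. A hedged sketch; the `None` fallback is only there so the snippet stays importable when vLLM is absent:

```python
# Try the current path first, then the pre-refactor path; which path is
# "newer" depends on the pinned vLLM commit, so treat this as a sketch.
try:
    from vllm.v1.sample.metadata import SamplingMetadata
except ImportError:
    try:
        from vllm.v1.worker.gpu.sample.metadata import SamplingMetadata
    except ImportError:
        SamplingMetadata = None  # vLLM not installed at all
```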
…to qwen3next_rebase * 'main' of https://github.com/vllm-project/vllm-ascend: (86 commits)
- [refactor] refactor excute_model and _dymmy_run method (vllm-project#6043)
- [Refactor] profiler config optimze (vllm-project#6141)
- [Graph][Fusion] Add MatmulAllReduceAddRMSNorm graph fusion for npugraph_ex. (vllm-project#6006)
- [UT]: refactoring 310p ops ut (vllm-project#6296)
- [Refact.]: refactoring 310p-kv cache allocator, align with main branch (vllm-project#6270)
- [Misc] Removes unnecessary graph size re-initialization (vllm-project#6280)
- [Main2Main] Upgrade vllm commit to 0123 (vllm-project#6169)
- [BugFix] Fix wheel package build workflow (vllm-project#6276)
- [CI][BugFix] Qwen3-Next nightly test fix. (vllm-project#6247)
- [Doc] quick fix for vllm-ascend version (vllm-project#6278)
- [Community] Nominate whx-sjtu as maintainer (vllm-project#6268)
- [Lint] Fix mypy issue to make CI happy (vllm-project#6272)
- BugFix: Fix moe_load accumulation error in ACL graph mode (vllm-project#6182)
- [Patch] Remove the patch of ECExampleConnector (vllm-project#5976)
- [Bugfix] Fix PP+PCP and PP+flashcomm1 bugs (vllm-project#5416)
- [Feat] proxy delay to remove instances (vllm-project#5934)
- [CI] Add workfolw_dispatch for nightly image build (vllm-project#6269)
- [bugfix][npugraph_ex]fix static kernel uninstall issue (vllm-project#6128)
- [Doc] 310P Documents update (vllm-project#6246)
- [Feature] Mooncake connector get remote ptp size (vllm-project#5822)
- ...
### What this PR does / why we need it?

1. ✅ Upgrade vllm commit to: 0115 (8471b27)
   Modify import paths due to the refactors: vllm-project/vllm#32245, vllm-project/vllm#32060
   Test result: https://github.com/vllm-project/vllm-ascend/actions/runs/21034239336/job/60490156965?pr=5913
2. ✅ Upgrade vllm commit to: 0119 (9a1f16d)
   Fix `WorkerProc.__init__() missing 1 required positional argument: 'is_driver_worker'` due to vllm-project/vllm#28506
   Test result: https://github.com/vllm-project/vllm-ascend/actions/runs/21156263050/job/60841668755?5569
3. ✅ Upgrade vllm commit to: 0120 (148117e)
   1. Add the `skip_compiled` param in `set_forward_context` due to vllm-project/vllm#30385
   2. Modify `tests/ut/spec_decode/test_eagle_proposer.py` due to vllm-project/vllm#24322, which changed `self.max_num_tokens = vllm_config.scheduler_config.max_num_batched_tokens + max_batch_size`
   3. Modify UT import paths due to the refactor vllm-project/vllm#32060
   Test result: https://github.com/vllm-project/vllm-ascend/actions/runs/21204851770/job/60999046946
4. ✅ Upgrade vllm commit to: 0121 (f23fb5a)
   1. vLLM switched `uses_mrope` from the target to the draft model config, making `positions`/`mrope_positions` mutually exclusive, which broke vllm-ascend's direct `self.positions` access and tests missing `draft_model_config.uses_mrope`. vllm-project/vllm#32048
   2. Moved `bs_to_padded_graph_size` from `CompilationConfig` to `CudagraphDispatcher` due to the refactor vllm-project/vllm#30143
   3. Remove the unused `maybe_setup_kv_connector` due to vllm-project/vllm#32077
   Test result: https://github.com/vllm-project/vllm-ascend/actions/runs/21217728738/job/61043738834
5. ✅ Upgrade vllm commit to: 0122 (8ebf271)
   Update `FusedMoEParallelConfig` (added `enable_eplb`) and `FusedMoEConfig` due to vllm-project/vllm#32414
   Test result: https://github.com/vllm-project/vllm-ascend/actions/runs/21249922546/job/61148613054
6. ✅ Upgrade vllm commit to: 0123 (dc917cc)
   Set `temperature=0.0` due to the removal of the default temperature value in vllm-project/vllm#32723
   Test result: https://github.com/vllm-project/vllm-ascend/actions/runs/21280796875

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.14.0
- vLLM main: vllm-project/vllm@d682094

---------

Signed-off-by: wjunLu <wjunlu217@gmail.com>
Signed-off-by: Meihan-chen <jcccx.cmh@gmail.com>
Co-authored-by: wjunLu <wjunlu217@gmail.com>
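The `uses_mrope` change in item 4 (vllm-project/vllm#32048) makes `positions` and `mrope_positions` mutually exclusive on the runner, so direct `self.positions` access has to be guarded. A hedged sketch; `select_positions` and the stand-in runner objects are hypothetical, not vllm-ascend code:

```python
from types import SimpleNamespace

def select_positions(runner, uses_mrope: bool):
    """Pick the position buffer the runner actually holds: `positions`
    (flat) for ordinary models, `mrope_positions` (three rows for M-RoPE)
    when the draft model config reports uses_mrope."""
    return runner.mrope_positions if uses_mrope else runner.positions

# Stand-in runner objects, for illustration only.
plain = SimpleNamespace(positions=[0, 1, 2, 3], mrope_positions=None)
mrope = SimpleNamespace(positions=None,
                        mrope_positions=[[0, 1], [0, 0], [0, 0]])
```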