Conversation
|
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according Contributing and Testing. |
There was a problem hiding this comment.
Code Review
This pull request removes the ascend_scheduler feature. The changes are consistent across documentation, examples, tests, and source code, reflecting the removal of this functionality. However, I've identified a critical issue in one of the test files where the modification of server arguments has resulted in a malformed command, which will likely cause the test to fail. I've provided a suggestion to fix it.
d896281 to
153f779
Compare
|
This pull request has conflicts, please resolve those before we can evaluate the pull request. |
153f779 to
dc5637d
Compare
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request removes the custom Ascend scheduler and related configurations, falling back to the default vLLM scheduler. This is a significant cleanup, touching documentation, tests, examples, and source code. The changes are mostly deletions and appear correct and consistent with the PR's goal. However, I found an issue in the documentation for TorchAirGraph. The examples are now incorrect because they enable TorchAirGraph without disabling chunked prefill, which is not a supported combination. This will lead to runtime errors for users following the guide. I've provided suggestions to fix the examples.
There was a problem hiding this comment.
Code Review
This pull request removes the ascend_scheduler and related configurations, aiming to use the default vLLM scheduler. The changes are extensive, touching documentation, examples, tests, and core logic to remove all traces of the ascend_scheduler. The refactoring is clean and aligns with the stated objective.
My main concerns are a potential change in default behavior for chunked prefill, and the removal of tests for SchedulerDynamicBatch which appears to still be in use. I've added critical and high severity comments regarding these issues.
|
CI passed here: https://github.com/vllm-project/vllm-ascend/actions/runs/19735428628/job/56546065316?pr=4498 qwen3-next test failed due to chunked prefill is enabled by default, now qwen3-next with mtp doesn't work. Let's disable it first. |
dc5637d to
9b3d1b5
Compare
|
This pull request has conflicts, please resolve those before we can evaluate the pull request. |
9b3d1b5 to
38eb2e9
Compare
whx-sjtu
left a comment
There was a problem hiding this comment.
I have confirmed that no multi-modal models depend on AscendScheduler anymore. LGTM.
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
38eb2e9 to
8db5756
Compare
This reverts commit f10acdd.
Reverts #4498 - vLLM version: v0.11.2 - vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.2
Ascend scheduler was added for non chunk prefill case before, since that the npu ops didn't work well with chunked prefill. Now the ops with chunked prefill work better, it's time to remove the ascend scheduler to use vLLM default scheduler. - vLLM version: v0.11.2 --------- Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Reverts vllm-project#4498 - vLLM version: v0.11.2 - vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.2
Ascend scheduler was added for non chunk prefill case before, since that the npu ops didn't work well with chunked prefill. Now the ops with chunked prefill work better, it's time to remove the ascend scheduler to use vLLM default scheduler. - vLLM version: v0.11.2 --------- Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com> Signed-off-by: Che Ruan <cr623@ic.ac.uk>
Reverts vllm-project#4498 - vLLM version: v0.11.2 - vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.2 Signed-off-by: Che Ruan <cr623@ic.ac.uk>
Ascend scheduler was added for non chunk prefill case before, since that the npu ops didn't work well with chunked prefill. Now the ops with chunked prefill work better, it's time to remove the ascend scheduler to use vLLM default scheduler. - vLLM version: v0.11.2 --------- Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com> Signed-off-by: Che Ruan <cr623@ic.ac.uk>
Reverts vllm-project#4498 - vLLM version: v0.11.2 - vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.2 Signed-off-by: Che Ruan <cr623@ic.ac.uk>
Ascend scheduler was added for non chunk prefill case before, since that the npu ops didn't work well with chunked prefill. Now the ops with chunked prefill work better, it's time to remove the ascend scheduler to use vLLM default scheduler. - vLLM version: v0.11.2 --------- Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Reverts vllm-project#4498 - vLLM version: v0.11.2 - vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.2
Ascend scheduler was added for non chunk prefill case before, since that the npu ops didn't work well with chunked prefill. Now the ops with chunked prefill work better, it's time to remove the ascend scheduler to use vLLM default scheduler. - vLLM version: v0.11.2 --------- Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com> Signed-off-by: tanqingshan (A) <50050625@china.huawei.com>
Reverts vllm-project#4498 - vLLM version: v0.11.2 - vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.2 Signed-off-by: tanqingshan (A) <50050625@china.huawei.com>
Ascend scheduler was added for non chunk prefill case before, since that the npu ops didn't work well with chunked prefill. Now the ops with chunked prefill work better, it's time to remove the ascend scheduler to use vLLM default scheduler. - vLLM version: v0.11.2 --------- Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Reverts vllm-project#4498 - vLLM version: v0.11.2 - vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.2
Ascend scheduler was added for non chunk prefill case before, since that the npu ops didn't work well with chunked prefill. Now the ops with chunked prefill work better, it's time to remove the ascend scheduler to use vLLM default scheduler. - vLLM version: v0.11.2 --------- Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Reverts vllm-project#4498 - vLLM version: v0.11.2 - vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.2
Ascend scheduler was added for non chunk prefill case before, since that the npu ops didn't work well with chunked prefill.
Now the ops with chunked prefill work better, it's time to remove the ascend scheduler to use vLLM default scheduler.