[Model][1/N] Delete deepseek v2/v3 modeling codes. #3189
wangxiyuan merged 6 commits into vllm-project:main
Conversation
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to the Contributing and Testing guides.
Code Review
This pull request deletes the file vllm_ascend/models/deepseek_v2.py, which contains the non-torchair implementations for DeepSeek V2 and V3 models. While the goal seems to be cleanup, this change is incomplete and introduces breaking changes. Specifically, vllm_ascend/models/__init__.py still attempts to register models from the deleted file, which will cause an ImportError. Additionally, vllm_ascend/models/deepseek_mtp.py relies on a monkey patch from the deleted file, and its behavior will change, potentially leading to correctness or performance issues. The test file tests/ut/models/test_deepseek_v2.py also has a direct dependency on the deleted file and will fail. This pull request should be updated to resolve these dangling dependencies.
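The failure mode the review describes can be illustrated with a minimal sketch (the registry helper and names below are hypothetical stand-ins, not the actual vllm_ascend code): a lazy model registry maps an architecture name to a dotted "module:class" path, so a stale entry only blows up when the deleted module is first imported.

```python
import importlib

# Hypothetical lazy registry: architecture name -> "module:class" path,
# mirroring the registration style used in vllm_ascend/models/__init__.py.
_REGISTRY: dict[str, str] = {
    "DeepseekV2ForCausalLM":
        "vllm_ascend.models.deepseek_v2:CustomDeepseekV2ForCausalLM",
}

def resolve(arch: str):
    """Import the module backing a registered architecture on first use."""
    module_path, _, class_name = _REGISTRY[arch].partition(":")
    # Raises ModuleNotFoundError if the backing file has been deleted.
    module = importlib.import_module(module_path)
    return getattr(module, class_name)

try:
    resolve("DeepseekV2ForCausalLM")
except ModuleNotFoundError as exc:
    # With deepseek_v2.py deleted, the dangling registration surfaces here.
    print(f"dangling registration, missing module: {exc.name}")
```

This is why the review asks for the registration in `__init__.py` to be removed in the same PR as the file deletion: the registry itself gives no error until resolution time.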
This pull request has conflicts, please resolve those before we can evaluate the pull request.
Why is it possible to delete these two model files now?
What makes you think it cannot be deleted? If you are referring to the CUDA hard-coding issue in DeepSeek 3.2, I will work around it with a patch later. @LCAIZJ
Signed-off-by: whx-sjtu <2952154980@qq.com>
Nice work
This PR deletes the model code for deepseek_v2 and deepseek_v3 so that the model files from vLLM can be reused. vLLM Ascend now uses a custom-ops registration approach instead of hard-coded model files. - vLLM version: v0.11.0rc3 - vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0 --------- Signed-off-by: whx-sjtu <2952154980@qq.com>
This is the follow-up to PR #3189; it continues refactoring sfa into mla and finally removes deepseek_v3_2.py. This is the last PR of the deepseek modeling refactoring: after it, all deepseek-related model code is removed from vllm_ascend. Furthermore, after this PR DeepSeek V3.2 can run chunked prefill with correct accuracy. - vLLM version: v0.11.0rc3 - vLLM main: vllm-project/vllm@83f478b --------- Signed-off-by: whx-sjtu <2952154980@qq.com>
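The custom-ops registration approach can be sketched generically (the registry, decorator, and op names below are illustrative assumptions, not the actual vLLM API): instead of forking an entire model file to swap one layer, the platform plugin registers a replacement implementation under the op's name, and the unmodified upstream model resolves the op by name at construction time.

```python
from typing import Callable, Dict

# Hypothetical op registry: op name -> implementation.
_CUSTOM_OPS: Dict[str, Callable] = {}

def register_op(name: str):
    """Decorator a platform plugin uses to override a named op."""
    def wrap(impl: Callable) -> Callable:
        _CUSTOM_OPS[name] = impl
        return impl
    return wrap

def get_op(name: str, default: Callable) -> Callable:
    """Model code looks ops up by name, falling back to the stock version."""
    return _CUSTOM_OPS.get(name, default)

# Stock implementation shipped with the (hypothetical) upstream model file.
def stock_rms_norm(x: str) -> str:
    return f"stock({x})"

# A platform plugin (e.g. an Ascend backend) swaps in its own kernel
# without duplicating the model definition.
@register_op("rms_norm")
def ascend_rms_norm(x: str) -> str:
    return f"ascend({x})"

# The unmodified model picks up the override transparently.
rms_norm = get_op("rms_norm", stock_rms_norm)
print(rms_norm("hidden"))
```

With this pattern, deleting the copied model files is safe because the Ascend-specific behavior lives entirely in registered ops, not in a parallel model definition.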