[Bugfix] Correct method call for _set_cos_sin_cache by jianzs · Pull Request #774 · vllm-project/vllm-ascend

jianzs · 2025-05-07T08:04:34Z

This change ensures proper functionality for longer sequences by correctly invoking the _set_cos_sin_cache method with self as the first argument.

For example, with DeepSeek R1, the program crashes without this change when the input sequence exceeds 4096 tokens.

jianzs · 2025-05-07T08:26:22Z

@wangxiyuan @Yikun The vLLM main branch and v0.8.5.post1 have different ShardedStateLoader paths. Is there a standard procedure for handling this?

jianzs · 2025-05-08T07:45:37Z

@ganyi1996ppo @Yikun Ready to merge.

ganyi1996ppo · 2025-05-08T08:50:44Z

vllm_ascend/ops/rotary_embedding.py

                                 offsets: Optional[torch.Tensor] = None,
                                 max_seq_len: Optional[int] = None):
    if max_seq_len is not None and max_seq_len > self.max_seq_len:
-        self._set_cos_sin_cache(max_seq_len, query.device, query.dtype)


Just curious, why might this invocation cause a crash?

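DeepseekScalingRotaryEmbedding doesn't have a _set_cos_sin_cache method, so self._set_cos_sin_cache(...) fails at the attribute lookup. _set_cos_sin_cache is a standalone function that takes self as its first parameter, so it has to be called directly with self passed explicitly.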

Oh, right, we did not patch this function inside the code. Thanks for the report and the fix.
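
The failure mode discussed in this thread can be sketched in plain Python. This is a toy reproduction, not the vllm-ascend code: `ToyRotaryEmbedding`, `forward_buggy`, `forward_fixed`, and the single-argument `_set_cos_sin_cache` signature are hypothetical stand-ins, and the initial cache length of 4096 is assumed only to mirror the DeepSeek R1 example. It shows why the bug stays hidden until the requested length exceeds the cached length.

```python
from typing import Optional


class ToyRotaryEmbedding:
    """Hypothetical stand-in for a rotary embedding class (not the real one)."""

    def __init__(self, max_seq_len: int) -> None:
        # Length the cos/sin cache currently covers (assumed for illustration).
        self.max_seq_len = max_seq_len


def _set_cos_sin_cache(self: ToyRotaryEmbedding, seq_len: int) -> None:
    """Module-level helper: takes the instance explicitly and is never
    attached to the class, so it is not reachable as self._set_cos_sin_cache."""
    self.max_seq_len = seq_len


def forward_buggy(emb: ToyRotaryEmbedding, max_seq_len: Optional[int]) -> None:
    if max_seq_len is not None and max_seq_len > emb.max_seq_len:
        # Attribute lookup on the instance fails: the class has no such method.
        emb._set_cos_sin_cache(max_seq_len)


def forward_fixed(emb: ToyRotaryEmbedding, max_seq_len: Optional[int]) -> None:
    if max_seq_len is not None and max_seq_len > emb.max_seq_len:
        # Call the module-level function and pass the instance as `self`.
        _set_cos_sin_cache(emb, max_seq_len)


if __name__ == "__main__":
    emb = ToyRotaryEmbedding(max_seq_len=4096)
    forward_buggy(emb, 4096)      # guard is False, so the bad call never runs
    try:
        forward_buggy(emb, 4097)  # guard is True, so the lookup fails
    except AttributeError as exc:
        print(f"buggy path raised: {exc!r}")
    forward_fixed(emb, 8192)      # cache is extended without error
    print(emb.max_seq_len)
```

Because the cache-extension branch only runs for sequences longer than the current cache, short inputs never reach the bad call, which would explain why the crash only appears on long inputs.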

…llm-project#774) Merge branch wengang/dev-v0.8.5.508-cherry-pick of git@code.alipay.com:Theta/vllm-ascend.git into dev-v0.8.5.508 https://code.alipay.com/Theta/vllm-ascend/pull_requests/23 Signed-off-by: 沧濯 <zhengshoujian.zsj@antgroup.com> * [Bugfix] Correct method call for _set_cos_sin_cache (vllm-project#774)

@jianzs

### What this PR does / why we need it?

Add @jianzs as vLLM Ascend maintainer

@jianzs

----

I would like to nominate Shoujian Zheng (@jianzs <https://github.com/jianzs>) as a maintainer, starting with my +1.

- He focuses on code quality and good design, with solid reviews in the P/D disaggregation and DeepSeek improvement areas: 30+ high-quality reviews, such as #issuecomment-2811764833, #discussion_r2069927605 and #pullrequestreview-2820996674. This is the most important reason why I nominated him, because helping community developers complete PRs with high quality and continuously ensuring the quality of the codebase is one of the important responsibilities of a maintainer. We believe he is a great addition.
- Shoujian's main expertise is distributed inference. He has a lot of production experience with AI infrastructure. He has very good habits, explains all changes in great detail (#issue-3023082580) and shares results openly (#issuecomment-2853140443). High-quality PRs: #706, #774, #852.
- Community involvement: he is actively involved in community discussion, is collaborative, helps users solve problems, and has been involved in 30+ PRs and issues, such as #issuecomment-2911934292 and #issuecomment-2833523571.

Reference:
[1] https://vllm-ascend.readthedocs.io/en/latest/community/contributors.html
[2] https://vllm-ascend.readthedocs.io/en/latest/community/governance.html

Signed-off-by: Yikun Jiang <yikunkero@gmail.com>

github-actions bot added the module:ops label May 7, 2025

jianzs force-pushed the fix/set_cos_sin branch 3 times, most recently from c6db90e to f4cc50f Compare May 7, 2025 08:45

jianzs changed the title ~~[Bugfix] Correct method call for setting cos sin cache~~ [Bugfix] Correct method call for _set_cos_sin_cache May 7, 2025

fix: correct method call for setting cos sin cache

779d6b0

This change ensures proper functionality for longer sequences by correctly invoking the _set_cos_sin_cache method with self as the first argument. Signed-off-by: Jade Zheng <zheng.shoujian@outlook.com>

jianzs force-pushed the fix/set_cos_sin branch from f4cc50f to 779d6b0 Compare May 8, 2025 06:20

ApsarasX approved these changes May 8, 2025

ganyi1996ppo reviewed May 8, 2025

ganyi1996ppo approved these changes May 9, 2025

ganyi1996ppo merged commit 2c685e3 into vllm-project:main May 9, 2025
14 checks passed

Yikun mentioned this pull request Jun 13, 2025

Add ShouJian Zheng (@jianzs) as vLLM Ascend maintainer #1203

Merged
