[Bugfix] Correct method call for _set_cos_sin_cache #774
Merged
ganyi1996ppo merged 1 commit into vllm-project:main on May 9, 2025
Collaborator
Author
@wangxiyuan @Yikun The vLLM main branch and v0.8.5.post1 have different
c6db90e to f4cc50f
This change ensures proper functionality for longer sequences by correctly invoking the _set_cos_sin_cache method with self as the first argument. Signed-off-by: Jade Zheng <zheng.shoujian@outlook.com>
Collaborator
Author
@ganyi1996ppo @Yikun Ready to merge.
ApsarasX
approved these changes
May 8, 2025
ganyi1996ppo
reviewed
May 8, 2025
        offsets: Optional[torch.Tensor] = None,
        max_seq_len: Optional[int] = None):
    if max_seq_len is not None and max_seq_len > self.max_seq_len:
        self._set_cos_sin_cache(max_seq_len, query.device, query.dtype)
Collaborator
Just curious: why would this call cause a crash?
Collaborator
Author
> Just curious: why would this call cause a crash?
DeepseekScalingRotaryEmbedding doesn't have a _set_cos_sin_cache method, so looking it up on self fails. Instead, _set_cos_sin_cache is a plain function that takes self as its first parameter, so it has to be called directly with the instance passed in explicitly.
Collaborator
Oh, right, we did not patch this function in the code. Thanks for the report and the fix.
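The exchange above can be reproduced outside vLLM with a small sketch. Everything here is a hypothetical stand-in rather than the actual vllm-ascend code: `FakeRotaryEmbedding` plays the role of a class that defines no `_set_cos_sin_cache` method, the module-level `_set_cos_sin_cache` has an illustrative body, and `forward_buggy` / `forward_fixed` are made-up names for the two call styles. The 4096 starting length is taken from the PR description and is only an assumed initial cache size.

```python
class FakeRotaryEmbedding:
    """Hypothetical stand-in: the class defines no _set_cos_sin_cache method."""

    def __init__(self, max_seq_len: int) -> None:
        self.max_seq_len = max_seq_len


def _set_cos_sin_cache(self, seq_len, device, dtype):
    """Module-level helper that receives the instance explicitly.

    The body is illustrative only; it records what a real cache rebuild
    would be asked to produce.
    """
    self.max_seq_len = seq_len
    self.cache_info = (seq_len, device, dtype)


def forward_buggy(self, max_seq_len=None):
    if max_seq_len is not None and max_seq_len > self.max_seq_len:
        # Attribute lookup on the instance fails because the helper was
        # never attached to the class.
        self._set_cos_sin_cache(max_seq_len, "cpu", "float32")


def forward_fixed(self, max_seq_len=None):
    if max_seq_len is not None and max_seq_len > self.max_seq_len:
        # Call the free function and pass the instance as the first argument.
        _set_cos_sin_cache(self, max_seq_len, "cpu", "float32")
```

In this sketch the buggy path only fails once the requested length exceeds the cached one, because the branch is skipped otherwise. That matches the report that short inputs work and longer ones crash.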
ganyi1996ppo
approved these changes
May 9, 2025
venus-taibai
pushed a commit
to venus-taibai/vllm-ascend
that referenced
this pull request
May 15, 2025
…llm-project#774) Merge branch wengang/dev-v0.8.5.508-cherry-pick of git@code.alipay.com:Theta/vllm-ascend.git into dev-v0.8.5.508 https://code.alipay.com/Theta/vllm-ascend/pull_requests/23 Signed-off-by: 沧濯 <zhengshoujian.zsj@antgroup.com> * [Bugfix] Correct method call for _set_cos_sin_cache (vllm-project#774)
jianzs
pushed a commit
that referenced
this pull request
Jun 13, 2025
### What this PR does / why we need it?

Add @jianzs as vLLM Ascend maintainer.

@jianzs

----

I would like to nominate Shoujian Zheng (@jianzs <https://github.com/jianzs>) as a maintainer, starting with my +1.

- He focuses on code quality and good design, with solid reviews in the P/D disaggregation and DeepSeek improvement areas: 30+ high-quality reviews, such as #issuecomment-2811764833, #discussion_r2069927605 and #pullrequestreview-2820996674. This is the most important reason why I nominated him, because helping community developers complete PRs with high quality and continuously ensuring the quality of the codebase is one of the important responsibilities of a maintainer. We believe he is a great addition.
- Shoujian's main expertise is distributed inference. He has a lot of production experience in AI infrastructure. He has very good habits: he explains all changes in great detail (#issue-3023082580) and shares results openly (#issuecomment-2853140443). High-quality PRs: #706, #774, #852.
- Community involvement: he is actively involved in community discussion, is collaborative, and helps users solve problems. He has been involved in 30+ PRs and issues, such as #issuecomment-2911934292 and #issuecomment-2833523571.

Reference:
[1] https://vllm-ascend.readthedocs.io/en/latest/community/contributors.html
[2] https://vllm-ascend.readthedocs.io/en/latest/community/governance.html

Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
chopper0126
pushed a commit
to chopper0126/vllm-ascend
that referenced
this pull request
Oct 16, 2025
This change ensures proper functionality for longer sequences by correctly invoking the _set_cos_sin_cache method with self as the first argument. For example, with DeepSeek R1, if this change isn't made, the program will crash when the input sequence exceeds 4096. Signed-off-by: Jade Zheng <zheng.shoujian@outlook.com>
This change ensures proper functionality for longer sequences by correctly invoking the _set_cos_sin_cache method with self as the first argument.
For example, with DeepSeek R1, if this change isn't made, the program will crash when the input sequence exceeds 4096.