
[Hardware][Ascend] Add silu_and_mul/rope; Add mix ops into attention layer #18

Merged
ganyi1996ppo merged 3 commits into vllm-project:develop from whx-sjtu:add_four_mixops on Feb 8, 2025

Conversation

@whx-sjtu (Collaborator) commented on Feb 7, 2025

Add silu_and_mul and rope ops.
Replace the original ops in the attention implementation with three mixed ops (reshape_and_cache, pagedattention, and selfattention) for better performance.
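
For readers unfamiliar with these ops, here is a minimal reference sketch of the silu_and_mul semantics, matching the definition used by vLLM's activation layers. The fused Ascend kernel added in this PR is assumed to implement the same math; the function name below is illustrative, not the kernel entry point:

import torch
import torch.nn.functional as F

def silu_and_mul_reference(x: torch.Tensor) -> torch.Tensor:
    # Split the last dimension in half: SiLU-activate the first half
    # and use it to gate the second half elementwise.
    d = x.shape[-1] // 2
    return F.silu(x[..., :d]) * x[..., d:]

Similarly, reshape_and_cache scatters the freshly computed key/value vectors for each token into the KV cache at the slots given by a slot mapping. A simplified sketch of the semantics, ignoring the backend-specific blocked cache layout that real kernels use:

def reshape_and_cache_reference(
    key: torch.Tensor,           # [num_tokens, num_heads, head_size]
    value: torch.Tensor,         # [num_tokens, num_heads, head_size]
    key_cache: torch.Tensor,     # [num_slots, num_heads, head_size], simplified layout
    value_cache: torch.Tensor,   # same shape as key_cache
    slot_mapping: torch.Tensor,  # [num_tokens], flat cache slot per token
) -> None:
    # Write each token's key/value into its assigned cache slot.
    key_cache[slot_mapping] = key
    value_cache[slot_mapping] = value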

hw_whx added 2 commits on February 7, 2025 at 12:40
Signed-off-by: hw_whx <wanghexiang7@huawei.com>
…ease

Signed-off-by: hw_whx <wanghexiang7@huawei.com>
from vllm.model_executor.layers.rotary_embedding import RotaryEmbedding


def rope_forward_oot(
Contributor commented:

what does oot mean?

whx-sjtu (Collaborator, Author) replied:

Out of tree, which means plugin backend types in vLLM (backends implemented outside the main vllm repository).
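
To make the out-of-tree hook concrete, here is a minimal sketch of how a plugin backend can attach its rope implementation to vLLM's RotaryEmbedding. It assumes vLLM's custom-op dispatch routes to a forward_oot method on plugin platforms; the fused Ascend kernel call (via torch_npu) is stubbed with the stock fallback so the sketch stays runnable without an NPU:

from typing import Optional, Tuple

import torch
from vllm.model_executor.layers.rotary_embedding import RotaryEmbedding

def rope_forward_oot(
    self,
    positions: torch.Tensor,
    query: torch.Tensor,
    key: torch.Tensor,
    offsets: Optional[torch.Tensor] = None,
) -> Tuple[torch.Tensor, torch.Tensor]:
    # On Ascend hardware this is where a fused NPU rope kernel would
    # be invoked (hypothetical here); fall back to the reference
    # implementation so the sketch works on any device.
    return self.forward_native(positions, query, key, offsets)

# An out-of-tree backend attaches its implementation by patching the
# layer class; vLLM then dispatches to forward_oot on that platform.
RotaryEmbedding.forward_oot = rope_forward_oot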

Signed-off-by: hw_whx <wanghexiang7@huawei.com>
@wuhuikx (Contributor) commented on Feb 8, 2025

lgtm

ganyi1996ppo merged commit 49e5baf into vllm-project:develop on Feb 8, 2025
1 check passed
whx-sjtu deleted the add_four_mixops branch on February 11, 2025 at 02:51
hust17yixuan pushed a commit to hust17yixuan/vllm-ascend that referenced this pull request on Jan 31, 2026
Labels: none yet
Projects: none yet
3 participants