[bugfix] fix apply_rotary_emb error on Ascend NPU #38491
Conversation
Force-pushed from 841f29d to d7a6a1d.
cc @SunMarc!
SunMarc left a comment:
Thanks for fixing this! There are still a few places where is_flash_attn_2_available is used instead of the is_flash_attn_available that you introduced. Could you fix this? Other than that, LGTM!
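For context, a minimal sketch of the kind of availability check the review refers to; the exact import path is an assumption, not taken from this PR:

```python
# Sketch only: replace the CUDA-specific check with the backend-agnostic
# helper mentioned in the review (import path assumed).
from transformers.utils import is_flash_attn_available

if is_flash_attn_available():
    # take the flash-attention code path (intended to cover both the CUDA
    # flash-attn 2 package and the Ascend NPU implementation)
    ...
else:
    # fall back to the eager attention implementation
    ...
```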
Force-pushed from 660f1a3 to bb57f04.
@SunMarc Thanks for your suggestion. After rechecking the places where is_flash_attn_2_available was still used, we have updated them to is_flash_attn_available.
Additionally, this PR still requires some further self-tests, so we are changing it to a draft for now. When it is ready, we will invite you to review it. Thanks!
Thanks for the update! Please ping me when this is ready for review!
Force-pushed from bb57f04 to 5b44cfc.
@SunMarc We have finished self-testing the modifications in this PR; it is ready for review and merge :)
SunMarc left a comment:
LGTM!
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
What does this PR do?
When using the Qwen2.5-VL model with Flash Attention 2, we found that the behavior of the torch_npu.npu_rotary_mul API differs slightly from the corresponding apply_rotary_emb API in the flash-attn package. The former only accepts 4-dimensional x and sin/cos inputs whose last dimension is the full attention head dimension, while the latter accepts 2-dimensional sin/cos inputs whose last dimension is half of the attention head dimension.
We also found that apply_rotary_emb is used in Qwen2.5-Omni in the same situation as in Qwen2.5-VL. This PR therefore fixes the problem above and, at the same time, updates the flash attention availability check in Qwen2.5-Omni and the esm model from is_flash_attn_2_available to is_flash_attn_available.
Fixes #38189
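For illustration, here is a minimal sketch of how the shape mismatch described above could be bridged, assuming the non-interleaved (rotate_half) RoPE convention; the helper name and the exact broadcasting are assumptions for illustration, not the code merged in this PR:

```python
import torch


def npu_apply_rotary_emb(x, cos, sin):
    """Hypothetical adapter from flash-attn style cos/sin tables to the
    shapes torch_npu.npu_rotary_mul expects.

    x:   (batch, seq_len, num_heads, head_dim), on an NPU device
    cos: (seq_len, head_dim // 2)   # flash-attn style, half head dim
    sin: (seq_len, head_dim // 2)
    """
    import torch_npu  # only importable in an Ascend NPU environment

    # npu_rotary_mul wants 4-D cos/sin covering the full head dimension,
    # so duplicate the half-dim tables and add broadcast axes:
    # (seq_len, head_dim // 2) -> (1, seq_len, 1, head_dim)
    cos = torch.cat([cos, cos], dim=-1)[None, :, None, :]
    sin = torch.cat([sin, sin], dim=-1)[None, :, None, :]
    return torch_npu.npu_rotary_mul(x, cos, sin)
```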