
[kernels] If flash attention2 is not installed / fails to import (cc on our cluster) default to kernels #40178

Merged
Cyrilvallez merged 14 commits into main from kernels-by-default
Aug 28, 2025

@ArthurZucker commented Aug 14, 2025

Collaborator

Improves handling of FlashAttention2 and adds a community kernel fallback

  • Updated _check_and_adjust_attn_implementation to set the attention implementation to kernels-community/flash-attn when FlashAttention2 is requested but not available, ensuring seamless fallback and proper kernel registration.
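
A minimal sketch of the fallback described above, for readers who want the logic in isolation. This is not the body of `_check_and_adjust_attn_implementation`. The helper names `flash_attn_importable` and `resolve_attn_implementation`, the `FALLBACK_KERNEL` constant, and the `flash_attn_available` override are hypothetical, and kernel registration is left out entirely. It only illustrates "FlashAttention2 requested, but missing or failing to import, so use `kernels-community/flash-attn`".

```python
import importlib

# Repo id taken from the PR description; the constant name is hypothetical.
FALLBACK_KERNEL = "kernels-community/flash-attn"


def flash_attn_importable() -> bool:
    """Return True only if the `flash_attn` package imports cleanly.

    Covers both cases from the PR title: the package is not installed,
    or it is installed but fails at import time.
    """
    try:
        importlib.import_module("flash_attn")
    except Exception:
        return False
    return True


def resolve_attn_implementation(requested, flash_attn_available=None):
    """Hypothetical stand-in for the adjustment step.

    `flash_attn_available` can be passed explicitly (handy in tests);
    when left as None, availability is probed by importing the package.
    """
    if flash_attn_available is None:
        flash_attn_available = flash_attn_importable()
    if requested == "flash_attention_2" and not flash_attn_available:
        # FlashAttention2 was requested but cannot be used: fall back.
        return FALLBACK_KERNEL
    # Any other request, or a working FlashAttention2, passes through unchanged.
    return requested
```

Under these assumptions, `resolve_attn_implementation("flash_attention_2", flash_attn_available=False)` returns the community kernel id, while a request such as `"sdpa"` is returned as-is.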

Testing Improvements:

  • Modified require_flash_attn in testing_utils.py to allow tests to run if either FlashAttention2 or the community kernel is available, broadening test coverage and reliability.
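
The "either backend is enough" condition can be sketched with the standard library's `unittest.skipUnless`. This is an illustration only: the actual `require_flash_attn` in `testing_utils.py` is not shown in this thread, so the decorator name `require_flash_attn_or_kernel` and its two boolean parameters are hypothetical. The real decorator presumably probes availability itself rather than taking flags.

```python
import unittest


def require_flash_attn_or_kernel(flash_attn_available: bool, kernels_available: bool):
    """Hypothetical decorator factory: skip a test unless at least one backend exists.

    Availability is passed in as plain booleans to keep the sketch
    self-contained; this is an assumption, not the library's signature.
    """
    return unittest.skipUnless(
        flash_attn_available or kernels_available,
        "test requires FlashAttention2 or the community flash-attn kernel",
    )


class ExampleTest(unittest.TestCase):
    # FlashAttention2 missing, community kernel present: the test still runs.
    @require_flash_attn_or_kernel(flash_attn_available=False, kernels_available=True)
    def test_runs_with_kernel_only(self):
        self.assertTrue(True)

    # Neither backend present: the test is reported as skipped.
    @require_flash_attn_or_kernel(flash_attn_available=False, kernels_available=False)
    def test_skipped_without_backend(self):
        self.fail("should never run")
```

Using a logical OR over the two checks is what broadens coverage: machines without a working `flash_attn` install no longer skip these tests as long as the kernel fallback can be used.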

@HuggingFaceDocBuilderDev


The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@ArthurZucker

Collaborator Author

run-slow: flash_attention_2

@ArthurZucker

Collaborator Author

run-slow: flash_attention_2

@ArthurZucker

Collaborator Author

run-slow: flash_attention_2

@Cyrilvallez left a comment

Member

I think the following is a bit more appropriate, although this part of the code is starting to get quite convoluted.

Comment thread src/transformers/modeling_utils.py Outdated
Comment thread src/transformers/modeling_utils.py Outdated
@ArthurZucker

Collaborator Author

run-slow: flash_attention_2

@ArthurZucker

Collaborator Author

run-slow: flash_attention_2

@Cyrilvallez

Member

Confirmed by running the tests locally that it works! Merging.

@Cyrilvallez merged commit 851b8f2 into main on Aug 28, 2025
23 of 25 checks passed
@Cyrilvallez deleted the kernels-by-default branch on August 28, 2025 14:20

3 participants