fix(gpt-oss): prefer flex attention over sdpa #5701

Merged
danielhanchen merged 6 commits into main from fix-gpt-oss-flex-attn on May 22, 2026
Conversation

@Datta0
Collaborator

@Datta0 Datta0 commented May 22, 2026

After the recent change in #5346 to prefer SDPA over Flex, gpt-oss now tries SDPA even though it should ideally use Flex attention. This PR adds `_SDPA_EXCLUDED` to address that.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request enables flex_attention for the gpt_oss model by moving it from the excluded to the preferred models list. Simultaneously, it explicitly disables sdpa for gpt_oss across the utility and loader modules, ensuring the model falls back to the eager implementation when flex_attention is unavailable. New unit tests have been added to verify these attention implementation preferences and fallback behaviors. I have no feedback to provide.
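
The selection order described in this review can be illustrated with a minimal sketch. It is not the code from this PR: only the name `_SDPA_EXCLUDED` appears in the description, while `_FLEX_PREFERRED`, `choose_attn_implementation`, and the two availability flags are hypothetical names used for illustration. The returned strings follow the usual `attn_implementation` values (`flex_attention`, `sdpa`, `eager`).

```python
# Illustrative sketch only; not the actual unsloth implementation.
# _SDPA_EXCLUDED is named in the PR description; everything else is hypothetical.
_FLEX_PREFERRED = {"gpt_oss"}  # model types that should try flex attention first
_SDPA_EXCLUDED = {"gpt_oss"}   # model types that must never be routed to SDPA


def choose_attn_implementation(model_type, flex_available, sdpa_available=True):
    """Pick an attention backend name for a given model type."""
    # Flex-preferred models use flex attention whenever it is usable.
    if model_type in _FLEX_PREFERRED and flex_available:
        return "flex_attention"
    # SDPA is the general default, except for explicitly excluded models.
    if sdpa_available and model_type not in _SDPA_EXCLUDED:
        return "sdpa"
    # Nothing else applies, so fall back to the eager implementation.
    return "eager"


print(choose_attn_implementation("gpt_oss", flex_available=True))
print(choose_attn_implementation("gpt_oss", flex_available=False))
print(choose_attn_implementation("llama", flex_available=True))
```

Under these assumptions, gpt_oss resolves to flex attention when it is available and to eager otherwise, while a model outside both sets (here `llama`, chosen only as an example) keeps the SDPA default.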

Datta0 and others added 3 commits May 22, 2026 14:39
@Datta0 Datta0 marked this pull request as ready for review May 22, 2026 15:05
@Datta0
Collaborator Author

Datta0 commented May 22, 2026

8oQK4ErFwum colab tested

@danielhanchen danielhanchen merged commit ed1e392 into main May 22, 2026
38 of 43 checks passed
@danielhanchen danielhanchen deleted the fix-gpt-oss-flex-attn branch May 22, 2026 15:38
rsd-darshan pushed a commit to rsd-darshan/unsloth that referenced this pull request Jun 3, 2026
* fix(gpt-oss): prefer flex attention over sdpa

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix(gpt-oss): use eager config for unsupported backends

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2 participants