Disable SDPA Attention for gpt-bigcode model #78

Merged: libinta merged 2 commits into habana-main from jha/bigcodegpt on Mar 5, 2024

jiminha commented Feb 27, 2024

What does this PR do?

Disable SDPA attention until the gpt-bigcode model supports FusedSDPA attention.

jiminha requested a review from libinta on February 27, 2024 02:55
vivekgoe left a comment

@jiminha what issue is this PR fixing?

jiminha commented Feb 27, 2024

> @jiminha what issue is this PR fixing?

This fixes issue SW-176426 (TypeError: GPTBigCodeSdpaAttention.forward() got an unexpected keyword argument 'token_idx'). The upgrade to Transformers 4.37.2 / PyTorch 2.2 enabled SDPA attention for the gpt-bigcode model. This is a temporary fix so that the model can run with the original attention layer without error until FusedSDPA is enabled for this model.
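
For readers unfamiliar with this failure mode, here is a minimal, self-contained sketch of how such a TypeError can arise. It is not the PR's code: the class and function names (`SdpaAttentionSketch`, `EagerAttentionSketch`, `run_block`, `try_run`) are hypothetical stand-ins, and the sketch assumes only that the calling code always passes `token_idx` while the SDPA attention class's `forward()` does not accept it.

```python
class SdpaAttentionSketch:
    """Hypothetical stand-in for an attention class whose forward() has no token_idx parameter."""

    def forward(self, hidden_states, attention_mask=None):
        return hidden_states


class EagerAttentionSketch:
    """Hypothetical stand-in for the original attention class that accepts token_idx."""

    def forward(self, hidden_states, attention_mask=None, token_idx=None):
        return hidden_states


def run_block(attn, hidden_states, token_idx):
    # Assumption: the surrounding model code always forwards token_idx
    # to whichever attention layer was instantiated.
    return attn.forward(hidden_states, attention_mask=None, token_idx=token_idx)


def try_run(attn):
    """Return "ok" on success, or the TypeError message as a string on failure."""
    try:
        run_block(attn, [0.0, 1.0], token_idx=3)
        return "ok"
    except TypeError as exc:
        return f"TypeError: {exc}"


if __name__ == "__main__":
    print(try_run(EagerAttentionSketch()))
    print(try_run(SdpaAttentionSketch()))
```

In this sketch, selecting the class that accepts `token_idx` avoids the error, which mirrors the idea of falling back to the original attention layer until a compatible fused path exists.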

@vivekgoe

@jiminha It is OK to disable SDPA attention if it is causing problems, but I see a problem with using the "_use_sdpa" flag here:

if self._use_sdpa and head_mask is None and not output_attentions:

This creates confusion because "_use_sdpa" is used in transformers to refer to torch SDPA, which we should differentiate from Habana SDPA. We already fixed one problem related to this for Llama in PR #73.
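
One way to picture the distinction raised above is to keep two separately named flags, so that a check on the torch SDPA path can never be mistaken for the Habana fused path. The following is only an illustrative sketch under that assumption; the names `AttentionFlags`, `use_torch_sdpa`, `use_habana_fused_sdpa`, and `select_attention_path` are hypothetical and do not come from this PR, from PR #73, or from transformers.

```python
from dataclasses import dataclass


@dataclass
class AttentionFlags:
    # Hypothetical flag: the torch SDPA path (what "_use_sdpa" refers to upstream).
    use_torch_sdpa: bool = False
    # Hypothetical flag: a separate switch for a Habana fused SDPA path.
    use_habana_fused_sdpa: bool = False


def select_attention_path(flags, head_mask=None, output_attentions=False):
    """Pick an attention path name; fall back to "eager" when no fast path applies."""
    # Same guard as in the quoted condition: fast paths only when there is
    # no head mask and attention weights are not requested.
    fast_path_ok = head_mask is None and not output_attentions
    if flags.use_habana_fused_sdpa and fast_path_ok:
        return "habana_fused_sdpa"
    if flags.use_torch_sdpa and fast_path_ok:
        return "torch_sdpa"
    return "eager"


if __name__ == "__main__":
    print(select_attention_path(AttentionFlags()))
    print(select_attention_path(AttentionFlags(use_torch_sdpa=True)))
```

With both flags left off, the sketch always selects the eager path, which corresponds to the temporary behavior this PR aims for.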

vivekgoe left a comment

Discussed with @libinta. LGTM.

vivekgoe commented Mar 2, 2024

Looks good to me. @libinta please go ahead and merge.

libinta merged commit 1cd773d into habana-main on Mar 5, 2024
astachowiczhabana pushed a commit that referenced this pull request Apr 5, 2024
* Disable SDPA Attention for gpt-bigcode model

* Update argument to take "attn_implementation"
@astachowiczhabana
huggingface#771
