Disable SDPA Attention for gpt-bigcode model #78
@jiminha It is OK to disable SDPA attention if it is causing problems. However, I see a problem with using the "_use_sdpa" flag here: it creates confusion because "_use_sdpa" is used in transformers to refer to torch SDPA, which we should differentiate from Habana SDPA. We already fixed one problem related to this for Llama in PR #73.
Looks good to me. @libinta please go ahead and merge.
* Disable SDPA Attention for gpt-bigcode model
* Update argument to take "attn_implementation"
What does this PR do?
Disable SDPA attention until the gpt-bigcode model supports FusedSDPA attention.
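
For illustration only, here is a minimal sketch of how a caller might select the attention implementation by model type instead of toggling "_use_sdpa". The helper `attention_kwargs` and the set `EAGER_ONLY_MODEL_TYPES` are hypothetical names, not code from this PR; the only detail taken from the PR is the "attn_implementation" argument, and "eager" is assumed here as the non-SDPA value.

```python
# Hypothetical helper, not the code changed in this PR.
# Model types that should not use torch SDPA for now (assumption for this sketch).
EAGER_ONLY_MODEL_TYPES = {"gpt_bigcode"}


def attention_kwargs(model_type: str) -> dict:
    """Return extra loading kwargs that pick the attention implementation."""
    if model_type in EAGER_ONLY_MODEL_TYPES:
        # "eager" selects the plain (non-SDPA) attention path.
        return {"attn_implementation": "eager"}
    # Other model types keep the library default.
    return {}


print(attention_kwargs("gpt_bigcode"))
```

The resulting dictionary could then be passed along when loading the model, for example `AutoModelForCausalLM.from_pretrained(name, **attention_kwargs(model_type))`, in transformers versions that accept "attn_implementation".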