[SW-185803] Enable FusedSDPA fp8 in Llama FT by pbielak · Pull Request #291 · HabanaAI/optimum-habana-fork

pbielak · 2024-07-10T11:26:03Z

This PR enables Fused Scaled Dot Product Attention (FusedSDPA) in the FP8 version of the Llama model. It was tested on Llama fine-tuning with LoRA. Set the --flash_attention_fp8 flag to use FusedSDPA.

- Update attention module and usages
- Add --flash_attention_fp8 flag
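As a rough illustration of how a boolean flag like this can gate the attention path, here is a minimal, self-contained sketch. Only the flag name --flash_attention_fp8 comes from this PR; the parser, the `attention` dispatcher, the `fused_kernel` argument, and the pure-Python `reference_sdpa` fallback are hypothetical stand-ins and do not reflect the actual optimum-habana code.

```python
import argparse
import math


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser: only the flag name is taken from the PR description.
    parser = argparse.ArgumentParser(description="Illustrative fine-tuning arguments")
    parser.add_argument(
        "--flash_attention_fp8",
        action="store_true",
        help="Request the fused FP8 attention path (illustrative only)",
    )
    return parser


def reference_sdpa(q, k, v):
    """Plain scaled dot-product attention on lists of row vectors."""
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for q_row in q:
        # Scaled similarity between this query and every key.
        scores = [scale * sum(a * b for a, b in zip(q_row, k_row)) for k_row in k]
        # Numerically stable softmax over the scores.
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        # Weighted sum of the value rows.
        out.append(
            [
                sum((w / total) * v_row[j] for w, v_row in zip(weights, v))
                for j in range(len(v[0]))
            ]
        )
    return out


def attention(q, k, v, use_fused_fp8=False, fused_kernel=None):
    # fused_kernel is a placeholder for a hardware-specific fused op
    # (an assumption for this sketch, not the real FusedSDPA API).
    if use_fused_fp8 and fused_kernel is not None:
        return fused_kernel(q, k, v)
    return reference_sdpa(q, k, v)
```

In this sketch the parsed value of `args.flash_attention_fp8` would be passed as `use_fused_fp8`, and the dispatcher falls back to the reference implementation whenever no fused kernel is supplied.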

vivekgoe

LGTM

astachowiczhabana · 2024-07-29T11:46:54Z

Unmatched PR

- Update attention module and usages
- Add --flash_attention_fp8 flag
- Fix failure in distributed text-generation
- Add assert for Gaudi 3
- Add flag to README.md
- Remove unnecessary repeat_kv and reshape
- Rename FusedAttention to FusedAttentionTE

- Update attention module and usages
- Add --flash_attention_fp8 flag
- Fix failure in distributed text-generation
- Add assert for Gaudi 3
- Remove unnecessary repeat_kv and reshape
- Rename FusedAttention to FusedAttentionTE

- Update attention module and usages
- Add --flash_attention_fp8 flag
- Fix failure in distributed text-generation
- Add assert for Gaudi 3
- Remove unnecessary repeat_kv and reshape
- Rename FusedAttention to FusedAttentionTE
- Move flash_attention_fp8 checks
- Fix fused_scaled_dot_product_attention calls

vidyasiv · 2024-08-02T22:31:21Z

@pbielak, please propagate #291 and #310 to optimum-habana for the v1.17 release. The documentation PR already appears to be in OH.

- Update attention module and usages
- Add --flash_attention_fp8 flag
- Fix failure in distributed text-generation
- Add assert for Gaudi 3
- Remove unnecessary repeat_kv and reshape
- Rename FusedAttention to FusedAttentionTE
- Move flash_attention_fp8 checks
- Fix fused_scaled_dot_product_attention calls

Change-Id: Ica468bb23931a78e2a23f6cb9bc60f87dd442007

[SW-185803] Enable FusedSDPA fp8 in Llama FT

c069484

- Update attention module and usages
- Add --flash_attention_fp8 flag

pbielak requested review from bhargaveede, libinta, mandy-li, ssarkar2 and vivekgoe as code owners July 10, 2024 11:26

pbielak requested a review from a user July 10, 2024 11:26

vivekgoe requested a review from scsudhak-intel July 10, 2024 11:47

vivekgoe approved these changes Jul 12, 2024

vivekgoe merged commit 35f6fbe into habana-main Jul 12, 2024

MrGeva pushed a commit that referenced this pull request Jul 14, 2024

Revert "[SW-185803] Enable FusedSDPA fp8 in Llama FT (#291)"

6631d1f

This reverts commit 35f6fbe.

kalyanjk pushed a commit to kalyanjk/optimum-habana-fork that referenced this pull request Jul 15, 2024

[SW-185803] Enable FusedSDPA fp8 in Llama FT (HabanaAI#291)

0d38afe

- Update attention module and usages
- Add --flash_attention_fp8 flag

oabramovich pushed a commit that referenced this pull request Jul 15, 2024

[SW-192667] Revert "[SW-185803] Enable FusedSDPA fp8 in Llama FT (#291)" (#295)

ac42d0a

This reverts commit 35f6fbe.
Co-authored-by: Eran Geva <egeva@habana.ai>

pbielak added a commit that referenced this pull request Jul 17, 2024

[SW-185803] Enable FusedSDPA fp8 in Llama FT (#291)

d23c4d1

- Update attention module and usages
- Add --flash_attention_fp8 flag

pbielak mentioned this pull request Jul 17, 2024

[SW-185803] Enable FusedSDPA fp8 in Llama FT #310

Merged

pbielak pushed a commit that referenced this pull request Oct 1, 2024

[SW-185803] Enable FusedSDPA fp8 in Llama FT (#291) (#310)

1969a89

vivekgoe merged 1 commit into habana-main from dev/pbielak/enable-fusedSDPA-fp8
