[SW-185803] Enable FusedSDPA fp8 in Llama FT by pbielak · Pull Request #310 · HabanaAI/optimum-habana-fork

pbielak · 2024-07-17T14:20:22Z

Same PR as #291, but with a fix for the error in the text-generation/run_lm_eval.py script: one import was moved from top level to function level (see 3af45d2).
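For readers unfamiliar with the pattern, here is a minimal sketch of moving an import from top level to function level. It is not the code from 3af45d2: the function name `run_eval` is hypothetical, and the standard-library `json` module stands in for whichever import was actually moved.

```python
# Illustrative only: `json` stands in for the real dependency, and
# `run_eval` is a hypothetical name, not a function from run_lm_eval.py.


def run_eval(payload):
    """Serialize a payload, importing the dependency only when called."""
    # Function-level import: it is resolved when run_eval() executes,
    # not when the enclosing module is imported. A script that imports
    # this module but never calls run_eval() therefore never triggers
    # the import (or any error it might raise).
    import json

    return json.dumps(payload, sort_keys=True)


if __name__ == "__main__":
    print(run_eval({"b": 1, "a": 2}))
```

The trade-off is that an import problem surfaces at call time rather than at module load, which is usually acceptable when only one code path needs the dependency.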

wszczurekhabana

LGTM!

vivekgoe

@pbielak @wszczurekhabana I added a few comments; please check.

vivekgoe · 2024-07-30T05:11:35Z

@pbielak Thanks for the updates. Please check my latest responses, and please also resolve the conflicts in modeling_llama.py.

- Update attention module and usages
- Add --flash_attention_fp8 flag
- Fix failure in distributed text-generation
- Add assert for Gaudi 3
- Remove unnecessary repeat_kv and reshape
- Rename FusedAttention to FusedAttentionTE
- Move flash_attention_fp8 checks
- Fix fused_scaled_dot_product_attention calls

Change-Id: Ica468bb23931a78e2a23f6cb9bc60f87dd442007
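The flag and assert items in the commit message can be pictured with a rough sketch like the one below. This is not the PR's implementation. It assumes `--flash_attention_fp8` is a boolean switch, that it depends on a separate flash-attention switch (named `--use_flash_attention` here as an assumption), and that the Gaudi 3 assert guards the fp8 path. The helper names and the `device_name` string are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser with the two attention switches (names partly assumed)."""
    parser = argparse.ArgumentParser(description="fp8 flash attention sketch")
    # Assumed companion switch; only --flash_attention_fp8 is named in the PR.
    parser.add_argument("--use_flash_attention", action="store_true")
    parser.add_argument("--flash_attention_fp8", action="store_true")
    return parser


def validate_args(args, device_name):
    """Check flag consistency; `device_name` is a hypothetical plain string."""
    if args.flash_attention_fp8:
        # Assumption: the fp8 path only makes sense with flash attention on.
        assert args.use_flash_attention, (
            "--flash_attention_fp8 requires --use_flash_attention"
        )
        # Assumption: this mirrors the intent of "Add assert for Gaudi 3".
        assert device_name == "gaudi3", (
            "fp8 flash attention is assumed to need Gaudi 3"
        )
    return args


if __name__ == "__main__":
    parsed = build_parser().parse_args(
        ["--use_flash_attention", "--flash_attention_fp8"]
    )
    print(validate_args(parsed, "gaudi3"))
```

Keeping the checks in one function, rather than scattered across call sites, is one way to read the "Move flash_attention_fp8 checks" item, though the PR may have organized them differently.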

pbielak requested review from bhargaveede, libinta, mandy-li, ssarkar2 and vivekgoe as code owners July 17, 2024 14:20

pbielak requested a review from a user July 17, 2024 14:20

wszczurekhabana self-requested a review July 18, 2024 08:24

pbielak requested review from MrGeva, guyeilat, oabramovich and scsudhak-intel and removed request for a user, libinta, mandy-li and ssarkar2 July 18, 2024 12:09

wszczurekhabana approved these changes Jul 18, 2024

vivekgoe suggested changes Jul 29, 2024

vivekgoe reviewed Jul 30, 2024

Comment thread optimum/habana/accelerate/utils/transformer_engine.py Outdated

pbielak force-pushed the dev/pbielak/enable-fusedSDPA-fp8 branch 4 times, most recently from 41f8967 to c77f6fb Compare July 31, 2024 12:42

pbielak force-pushed the dev/pbielak/enable-fusedSDPA-fp8 branch from c77f6fb to 19384dd Compare July 31, 2024 12:43

vivekgoe approved these changes Aug 1, 2024

vivekgoe merged commit 234cc25 into habana-main Aug 1, 2024

pbielak deleted the dev/pbielak/enable-fusedSDPA-fp8 branch August 1, 2024 10:04

vidyasiv mentioned this pull request Aug 2, 2024

[SW-185803] Enable FusedSDPA fp8 in Llama FT #291

Merged

pbielak pushed a commit that referenced this pull request Oct 1, 2024

[SW-185803] Enable FusedSDPA fp8 in Llama FT (#291) (#310)

1969a89

vivekgoe merged 1 commit into habana-main from dev/pbielak/enable-fusedSDPA-fp8