Quantization for FSDPA by dudilester · Pull Request #967 · huggingface/optimum-habana

dudilester · 2024-05-08T13:17:15Z

Added use_flash_attention, flash_attention_causal_mask and flash_attention_recompute to run_lm_eval
Enforce recompute flag on fsdpa quantization
Allow quantization using HQT
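A minimal sketch of how the three flags and the recompute enforcement described above could fit together. The diff itself is not shown in this thread, so everything here is an assumption: the helper names `build_parser` and `resolve_flash_attention_flags` are hypothetical, the flags are assumed to be boolean switches, and the `quantization_enabled` argument stands in for however the script detects that an HQT quantization config is active.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical subset of the run_lm_eval options named in this PR.
    parser = argparse.ArgumentParser(description="Sketch of FSDPA-related options")
    parser.add_argument("--use_flash_attention", action="store_true",
                        help="Use fused SDPA (flash attention) where available.")
    parser.add_argument("--flash_attention_recompute", action="store_true",
                        help="Run flash attention in recompute mode.")
    parser.add_argument("--flash_attention_causal_mask", action="store_true",
                        help="Let flash attention apply the causal mask itself.")
    return parser


def resolve_flash_attention_flags(args: argparse.Namespace,
                                  quantization_enabled: bool) -> dict:
    """Return the effective flags, forcing recompute on when quantizing.

    Assumption: quantizing FSDPA requires recompute mode, so the flag is
    enforced instead of being left to the user.
    """
    use_fa = args.use_flash_attention
    recompute = args.flash_attention_recompute
    if use_fa and quantization_enabled:
        recompute = True  # enforce recompute on FSDPA quantization
    return {
        "use_flash_attention": use_fa,
        # The two sub-options only have an effect when flash attention is on.
        "flash_attention_recompute": use_fa and recompute,
        "flash_attention_causal_mask": use_fa and args.flash_attention_causal_mask,
    }
```

With `--use_flash_attention` alone, the sketch turns recompute on only when `quantization_enabled` is true; without `--use_flash_attention`, all three effective flags stay off.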

dudilester · 2024-05-08T15:11:11Z

These commits are related to version 1.16.0
@libinta please add the synapse 1.16 dependency tag. Thanks.

ssarkar2

@dudilester can we close this PR, since it looks like an older version of #976?

dudilester added 4 commits May 8, 2024 16:10

added text-generation quantization_config example file with a name that matches its scale method (#92)

83b3605

Encapsulate FSDPA in GaudiLlamaAttention (#129)

e231aa5

* Done to allow quantization using HQT
* Added use_flash_attention and flash_attention_recompute to run_lm_eval

enforce recompute flag on fsdpa quantization (#133)

86fa5b6

add flash_attention_causal_mask to run_lm_eval.py (#142)

659b2d1

dudilester requested review from libinta and mandy-li as code owners May 8, 2024 13:17

dudilester requested a review from a user May 8, 2024 13:17

dudilester requested a review from regisss as a code owner May 8, 2024 13:17

dudilester mentioned this pull request May 8, 2024

Encapsulate FSDPA in GaudiLlamaAttention #882

Closed

libinta added the synapse 1.16_dependency label May 8, 2024

libinta reviewed May 9, 2024

Comment thread examples/text-generation/quantization_config/act_maxabs_pow2_weights_pcs_opt_pow2_quant.json

wszczurekhabana mentioned this pull request May 10, 2024

Fast softmax #972

Merged

dudilester mentioned this pull request May 13, 2024

Quantization for FSDPA #976

Merged

ssarkar2 self-requested a review May 16, 2024 21:51

ssarkar2 reviewed May 16, 2024

dudilester closed this May 19, 2024

dudilester wants to merge 4 commits into huggingface:main from HabanaAI:dev/dlester/for_oh_1.16
