
Quantization for FSDPA #967

Closed
dudilester wants to merge 4 commits into huggingface:main from HabanaAI:dev/dlester/for_oh_1.16

Conversation

@dudilester
Contributor

Added use_flash_attention, flash_attention_causal_mask and flash_attention_recompute to run_lm_eval
Enforce recompute flag on fsdpa quantization
Allow quantization using HQT
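
A minimal sketch of how the three flags and the "enforce recompute" rule could be wired into an argument parser. The flag names come from the description above; the `--quantize` switch, the `enforce_recompute` helper, and the exact condition are hypothetical stand-ins for illustration, and the actual changes to run_lm_eval in this PR may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing the flash-attention flags named in the PR description."""
    parser = argparse.ArgumentParser(description="Illustrative evaluation options")
    parser.add_argument("--use_flash_attention", action="store_true",
                        help="Use fused scaled dot-product attention (FSDPA).")
    parser.add_argument("--flash_attention_causal_mask", action="store_true",
                        help="Use the causal-mask variant of FSDPA.")
    parser.add_argument("--flash_attention_recompute", action="store_true",
                        help="Use the recompute mode of FSDPA.")
    # Hypothetical switch: stands in for however quantization is actually enabled.
    parser.add_argument("--quantize", action="store_true",
                        help="(Assumed flag) run with a quantized model.")
    return parser


def enforce_recompute(args: argparse.Namespace) -> argparse.Namespace:
    """Assumed rule: quantized FSDPA always runs with the recompute flag set."""
    if args.use_flash_attention and args.quantize:
        args.flash_attention_recompute = True
    return args


def parse_args(argv=None) -> argparse.Namespace:
    """Parse the command line, then apply the recompute rule."""
    return enforce_recompute(build_parser().parse_args(argv))
```

Calling `parse_args(["--use_flash_attention", "--quantize"])` in this sketch yields a namespace with `flash_attention_recompute` set, even though the flag was not passed explicitly.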

@dudilester requested a review from a user May 8, 2024 13:17
@dudilester requested a review from regisss as a code owner May 8, 2024 13:17
@dudilester
Contributor Author

These commits are related to version 1.16.0
@libinta please add the synapse 1.16 dependency tag. thx.

@libinta added the synapse 1.16_dependency label May 8, 2024
@wszczurekhabana mentioned this pull request May 10, 2024
@ssarkar2 self-requested a review May 16, 2024 21:51
Contributor

@ssarkar2 left a comment


@dudilester can we close this PR, since it looks like an older version of #976?

@dudilester closed this May 19, 2024

Labels

synapse 1.16_dependency


4 participants