add flash_attention_causal_mask to run_lm_eval.py#142

Merged
MrGeva merged 1 commit into habana-main from dev/dlester/flash_attention_causal_mask on Apr 3, 2024

Conversation

@dudilester

add flash_attention_causal_mask to run_lm_eval.py

@astachowiczhabana

huggingface#972

@dudilester
Author

upstream URL
huggingface#976

astachowiczhabana pushed a commit that referenced this pull request Mar 5, 2025
* [SW-199696] Supporting Dynamic Quantization

Change-Id: I28649819baeed37b59c793b9f2939cba7c42fb6e

* Adjusted to latest scales refactoring

* Update fields maxabs_quant_dynamic_quantization.json

---------

Co-authored-by: Danny <dsemiat@habana.ai>
4 participants