FP8 FusedSDPA support for Mistral #195

Merged
libinta merged 2 commits into habana-main from jha/mistralfp8sdpa
May 10, 2024

@jiminha jiminha commented May 8, 2024

What does this PR do?

  • FusedSDPA currently does not support fp8; this PR enables fp8 for FusedSDPA as well.
  • Update update_sincos_cache to handle tokens exceeding the max position embedding size.
    Note: FusedSDPA with fp8 still has an accuracy issue at 32k; 16k is fine.
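
For readers unfamiliar with the sin/cos cache change, here is a minimal sketch of the general idea: regrow the rotary sin/cos tables when the requested length exceeds what is cached. Only the name update_sincos_cache comes from this PR. The RotaryCache class, its arguments, and the regrow-on-demand policy are assumptions for illustration, and the actual implementation in the repository may differ.

```python
import math


class RotaryCache:
    """Hypothetical stand-in for a rotary-embedding module with a sin/cos cache."""

    def __init__(self, dim, max_position_embeddings, base=10000.0):
        # One inverse frequency per pair of channels.
        self.inv_freq = [base ** (-(2 * i) / dim) for i in range(dim // 2)]
        self.max_seq_len_cached = 0
        self.cos_cached = []
        self.sin_cached = []
        self._build_cache(max_position_embeddings)

    def _build_cache(self, seq_len):
        # Recompute the tables for positions 0 .. seq_len - 1.
        self.cos_cached = [[math.cos(p * f) for f in self.inv_freq] for p in range(seq_len)]
        self.sin_cached = [[math.sin(p * f) for f in self.inv_freq] for p in range(seq_len)]
        self.max_seq_len_cached = seq_len

    def update_sincos_cache(self, seq_len):
        # Assumed policy: rebuild only when the request exceeds the cached length.
        if seq_len > self.max_seq_len_cached:
            self._build_cache(seq_len)

    def get(self, seq_len):
        # Make sure the cache covers seq_len, then return the needed rows.
        self.update_sincos_cache(seq_len)
        return self.cos_cached[:seq_len], self.sin_cached[:seq_len]
```

In this sketch, a request longer than the initial max_position_embeddings triggers a one-time rebuild, and shorter requests afterwards reuse the larger cache.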

@jiminha jiminha requested a review from a user May 8, 2024 18:07
@jiminha jiminha changed the base branch from main to habana-main May 9, 2024 20:06
@libinta libinta merged commit 8f996eb into habana-main May 10, 2024
@astachowiczhabana

huggingface#931

astachowiczhabana pushed a commit that referenced this pull request Mar 13, 2025
Change-Id: Ic15cd7d04b93a4674425f4752ef31eedb64afe87
astachowiczhabana pushed a commit that referenced this pull request Mar 31, 2025
Change-Id: Ic15cd7d04b93a4674425f4752ef31eedb64afe87
zhanglirong1999 pushed a commit that referenced this pull request Apr 17, 2025
Change-Id: Ic15cd7d04b93a4674425f4752ef31eedb64afe87