[AMD] Support FP8E5M2 with MFMA FP16 instructions#4259

Merged
antiagainst merged 4 commits into triton-lang:main from binarman:imprecise_acc
Aug 24, 2024
@binarman
Contributor

@binarman binarman commented Jul 4, 2024

Cast dot operands from FP8 types that MFMA does not support to FP16, which it does support, so that the dot can use MFMA instructions instead of falling back to FMA.
This approach is expected to give better performance and be more stable than the FMA implementation.
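
As a rough illustration of why an upcast like this is cheap and exact, here is a minimal Python sketch. It is not the code from this PR (the actual change lives in the compiler's lowering); it assumes the common FP8 E5M2 layout (1 sign bit, 5 exponent bits, 2 mantissa bits), which matches the top byte of an IEEE FP16 value. All function names below are hypothetical and exist only for this example.

```python
import struct


def fp8e5m2_to_fp16_bits(byte: int) -> int:
    """Widen an E5M2 byte to FP16 bits (assumed layout: top byte of FP16)."""
    # E5M2 and FP16 share sign and exponent widths, so the upcast is a
    # left shift that zero-fills the 8 extra mantissa bits. No rounding occurs.
    return (byte & 0xFF) << 8


def fp16_bits_to_float(bits: int) -> float:
    """Reinterpret 16 raw bits as an IEEE half-precision value."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


def fp8e5m2_to_float(byte: int) -> float:
    return fp16_bits_to_float(fp8e5m2_to_fp16_bits(byte))


def dot_via_fp16(a_bytes, b_bytes) -> float:
    """Hypothetical stand-in for a dot whose FP8 operands are upcast first.

    Python floats play the role of a wider accumulator here; this says
    nothing about how the hardware instruction accumulates.
    """
    return sum(
        fp8e5m2_to_float(x) * fp8e5m2_to_float(y)
        for x, y in zip(a_bytes, b_bytes)
    )


# 0x3C -> 1.0, 0x40 -> 2.0, 0x3E -> 1.5, 0xC0 -> -2.0
print(dot_via_fp16([0x3C, 0x40], [0x3E, 0xC0]))  # 1.0*1.5 + 2.0*(-2.0)
```

Under that layout assumption every E5M2 value is exactly representable in FP16, so the cast itself does not change the operand values.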

@alefimov-amd
Contributor

+cc @antiagainst @zhanglx13

binarman added 2 commits July 8, 2024 19:56
Cast dot arguments from unsupported FP8 to supported FP16 in order to use MFMA instructions instead of FMA.
This approach is expected to give better performance and be more stable compared to FMA implementation.
@antiagainst antiagainst marked this pull request as ready for review August 24, 2024 05:21
@antiagainst antiagainst merged commit a78c9c4 into triton-lang:main Aug 24, 2024
bertmaher pushed a commit to bertmaher/triton that referenced this pull request Dec 10, 2024
Cast dot arguments from unsupported FP8 to supported FP16 in order to
use MFMA instructions instead of FMA.
This approach is expected to give better performance and be more stable
compared to FMA implementation.

---------

Co-authored-by: Lei Zhang <antiagainst@gmail.com>
3 participants