
[BACKEND] Add support converting MMAV3 accumulator layout to fp8 dot_operand#3370

Merged
ThomasRaoux merged 3 commits intotriton-lang:mainfrom
ThomasRaoux:attention_fp8_4
Mar 14, 2024

Conversation

@ThomasRaoux
Collaborator

@ThomasRaoux ThomasRaoux commented Mar 14, 2024

Implement the layout conversion invented by Ganesh Bikshandi and Jay Shah, described in the following paper:
https://research.colfax-intl.com/wp-content/uploads/2023/12/colfax-flashattention.pdf

This allows us to generate a fully fp8 attention kernel by keeping the chain of dots in fp8 without going through shared memory.
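The data flow this enables can be sketched numerically: the first dot's accumulator is downcast to fp8 in registers and fed straight into the second dot, instead of being round-tripped through shared memory. Below is a minimal NumPy model of that chain of dots, not Triton code; `downcast_sim` is a hypothetical stand-in for a real fp8 (e4m3) conversion and only rounds the mantissa to illustrate the precision loss.

```python
import numpy as np

def downcast_sim(x, mantissa_bits=3):
    # Crude stand-in for an fp8 e4m3 downcast: keep `mantissa_bits`
    # bits of mantissa. Illustrative only, not a real fp8 codec.
    m, e = np.frexp(x)
    m = np.round(m * (1 << mantissa_bits)) / (1 << mantissa_bits)
    return np.ldexp(m, e)

def chained_dots(q, k, v):
    # First dot: S = Q @ K^T, accumulated in higher precision
    # (modeling the fp32 MMAv3 accumulator).
    s = q @ k.T
    # The layout conversion lets this accumulator be rewritten
    # in-register into the fp8 dot_operand layout; here we only
    # model the numerical downcast.
    p = downcast_sim(s)
    # Second dot consumes the converted operand directly: O = P @ V.
    return p @ v

rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((4, 8)).astype(np.float32) for _ in range(3))
o = chained_dots(q, k, v)
print(o.shape)  # (4, 8)
```

The point of the kernel-side optimization is that the `downcast_sim` step happens without leaving registers, which is what makes a fully fp8 chain of dots practical.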

…operand

This allows us to generate a fully fp8 attention kernel by keeping the
chain of dots in fp8 without going through shared memory.
@ptillet
Collaborator

ptillet commented Mar 14, 2024

Awesome! Maybe we should credit the https://research.colfax-intl.com/wp-content/uploads/2023/12/colfax-flashattention.pdf in the commit description? I see it's already in the comments of the code so no problem there :)

@ThomasRaoux
Collaborator Author

> Awesome! Maybe we should credit the https://research.colfax-intl.com/wp-content/uploads/2023/12/colfax-flashattention.pdf in the commit description? I see it's already in the comments of the code so no problem there :)

Good point, yes it should be added to the commit description as well.

@ThomasRaoux ThomasRaoux merged commit ee7e5bb into triton-lang:main Mar 14, 2024
ThomasRaoux pushed a commit that referenced this pull request Mar 14, 2024
htyu pushed a commit to htyu/triton that referenced this pull request Mar 20, 2024
…operand (triton-lang#3370)

Implement the layout conversion invented by Ganesh Bikshandi and Jay Shah,
described in the following paper:

https://research.colfax-intl.com/wp-content/uploads/2023/12/colfax-flashattention.pdf

This allows us to generate a fully fp8 attention kernel by keeping the
chain of dots in fp8 without going through shared memory.
karupayun pushed a commit to openxla/triton that referenced this pull request Apr 3, 2024
