[Triton-MLIR] Keren/code gen for extract slice and alloc tensor #692
Merged
ptillet merged 8 commits on Sep 23, 2022
Conversation
Contributor
Author
@goostavz @Superjomn Please let me know if you have any comments. I'll be working on insert_slice_async after this.
goostavz reviewed Sep 23, 2022
Collaborator
LGTM, no further comments.
Contributor
Author
Please hold off on merging this until #701 has been merged into master and triton-mlir.
ptillet pushed a commit that referenced this pull request Apr 1, 2024
Co-authored-by: gzhu <goostavz@outlook.com>
brunomazzottiamd pushed a commit to brunomazzottiamd/triton that referenced this pull request Jan 29, 2025
* Move preamble code into tikzplot.tex
* Rename kpack to kWidth and allow kWidth = 32
* [API change] Take user input to set dim names
  API change:
  - For blocked layout, use -tensorShape, which only takes two dims as dim0,dim1
  - For dot layout, use -dotShape, which takes three dims as M,N,K
* Re-structure files
  Separate each layout's code into their own files
* Extend dotLayout plot to support kWidth=32
  - When kWidth is large, use a smaller elemSize horizontally to save space
  - Improve the labels, such as
    - change vec to kWidth for operands
    - change opA/opB to inA/inB and include operand dims
    - remove group dims in the operands so that they don't overlap with operand block dims
  - Better alignment: dot op and mfma zoomed-in pics are bottom aligned
* [API change] Add support for kGroup
  kGroup is defined as total elements per thread / kWidth for one mfma instruction. We need kGroup = 2 only for the newly added mfma_f32_16x16x128_f8f6f4 and mfma_f32_32x32x64_f8f6f4 with f8 input type on MI350.
* [API change] Add support for data types of both operands
  And print mfma instruction name accordingly. For now, mixed precision mfma between 8-bit and 4- or 6-bit is not supported.
* Support mixed mfma with bf8/fp8 and fp6/bf6/f4
* [API change] Add support for scale
* [NFC] Fix format
* [API change] Refactor tensor and LDS layout
  - Support data types
  - Support both 32 and 64 banks
  - Still working on LDS accesses
* [LDS layout] Add support for ds_read access pattern for TN config
  - Fixed the issue with maxPhase computation. Need to submit a PR to fix it in the triton compiler
  - For ds_read_b64 with 64 banks, there are bank conflicts. We need to figure out a different swizzling pattern to avoid bank conflicts.
* [LDS layout] Add support for ds_write access pattern
  Assumed a basic global access pattern
* [LDS layout] Support access pattern for MN-contig without using mfma_transpose_load instructions
  - Elements along the M/N dim are contiguous in both global memory and LDS. Note that this is not the in-thread transpose case.
  - Swizzling is disabled
* [LDS layout] Support access pattern for MN-contig with mfma_trans_load instructions
* Clean up the code
* [lds layout] support padding
* Reduce tex package required
scxiao pushed a commit to scxiao/triton that referenced this pull request Apr 2, 2026
Summary: Updates the implementations to all have a causal variant.
Pull Request resolved: facebookexperimental/triton#692
Reviewed By: htyu
Differential Revision: D87807126
Pulled By: njriasan
fbshipit-source-id: 0c9aa6ea90e992581a7aa009c26f75d4b4797602
No description provided.