[LAYOUTS] Generalise HoistLayoutConversion to work with arbitrary layouts and chains of ops#5673
Conversation
1eed10c to
a77a54a
Compare
lezcano
commented
Jan 27, 2025
Comment on lines
+3278
to
+3421
| #blocked = #ttg.blocked<{sizePerThread = [1, 2], threadsPerWarp = [4, 8], warpsPerCTA = [4, 1], order = [1, 0]}> | ||
| #mma = #ttg.nvidia_mma<{versionMajor = 2, versionMinor = 0, warpsPerCTA = [1, 4]}> | ||
| module attributes {"ttg.num-warps" = 4 : i32, "ttg.target" = "cuda:80"} { | ||
| tt.func @dot_op_hoisted_to_load_with_unsupported_op_and_initializer_above_slice( |
Contributor
Author
There was a problem hiding this comment.
This is all codemovement + adding this test that was proposed but not merged in #5349 (comment)
as we now hoist everything as expected
lezcano
commented
Jan 27, 2025
ThomasRaoux
approved these changes
Jan 28, 2025
Collaborator
ThomasRaoux
left a comment
There was a problem hiding this comment.
LGTM. One comment is I'm not sure if we need the speculatively part but that's kind of a detail
Mogball
reviewed
Jan 28, 2025
| // This could be generalised if necessary | ||
| if (!loadOp) { | ||
| auto op = v.getDefiningOp(); | ||
| if (isa<arith::ConstantOp>(op) || noDataMovement(op)) { |
Collaborator
There was a problem hiding this comment.
Should ConstantOp just be put inside noDataMovement?
Contributor
Author
There was a problem hiding this comment.
Probably yeah, but leaving it as-is for now because it currently works and we are probably going to refactor this pass in the near future so whatever.
We now support all layouts as LL, and reductions support any layout as input. As such, at least in theory, we should be able to propagate layouts freely, even DotOperands, similar to what we do with other layouts. This PR is a bit tentative. Let's see if anything interesting breaks
Contributor
Author
Contributor
Author
|
WTF, the A100 tests failed: but CI marked it as green? |
ThomasRaoux
pushed a commit
that referenced
this pull request
Jan 31, 2025
I had used `.ONESHELL` to allow `cd` to effect the other commands, but it seems this also prevents the error status from propagating from anything but the last command in a rule. e.g. see #5673 (comment)
Contributor
Author
|
Reopened at #5788 |
lezcano
added a commit
that referenced
this pull request
Feb 3, 2025
AlexAUT
pushed a commit
to AlexAUT/triton
that referenced
this pull request
Feb 6, 2025
…outs and chains of ops (triton-lang#5673) We generalise `HoistLayoutConversion` to lift a given `convert_layout dot_operand` above any chain of operations that do not require data movement. We could totally generalise this in the future to lift it over other ops. We do this as a first step to keep the code somewhat similar to the previous one. Regarding the previous limitations of `canHoistDotOpEncV2` I did a bit of archeology: - The "don't hoist past select" was added in this issue triton-lang#2857. I run the repro and with the recent layout fixes, it now passes. - The TruncOps being skipped comes from triton-lang#2181. I think this is related with the hack that was removed in triton-lang#5044, so now it should work - Same same for the `UIToFpOp`, this is now supported after triton-lang#5044 - Mixed dtype hack is not necessary either as now everything works as expected with the `convert_layout` rework. We also add proper support for `isPure` for `elementwise_inline_asm` ops On the location of the code, we just leave it in `RemoveLayoutConversion.cpp` to take advantage of the rather generic implementation of `rewriteSlice`. We could totally move this pass outside of `remove-layout-conversion`, as it's probably enough to run it once. This code will go through further changes in the near future, so we'll assess this then.
AlexAUT
pushed a commit
to AlexAUT/triton
that referenced
this pull request
Feb 6, 2025
triton-lang#5776) This reverts PR triton-lang#5673 This broke the tests on A100, even though CI was green. The CI issue will be resolved by triton-lang#5775
AlexAUT
pushed a commit
to AlexAUT/triton
that referenced
this pull request
Feb 6, 2025
I had used `.ONESHELL` to allow `cd` to effect the other commands, but it seems this also prevents the error status from propagating from anything but the last command in a rule. e.g. see triton-lang#5673 (comment)
AlexAUT
pushed a commit
to AlexAUT/triton
that referenced
this pull request
Feb 6, 2025
makslevental
pushed a commit
to makslevental/triton
that referenced
this pull request
Feb 19, 2025
…outs and chains of ops (triton-lang#5673) We generalise `HoistLayoutConversion` to lift a given `convert_layout dot_operand` above any chain of operations that do not require data movement. We could totally generalise this in the future to lift it over other ops. We do this as a first step to keep the code somewhat similar to the previous one. Regarding the previous limitations of `canHoistDotOpEncV2` I did a bit of archeology: - The "don't hoist past select" was added in this issue triton-lang#2857. I run the repro and with the recent layout fixes, it now passes. - The TruncOps being skipped comes from triton-lang#2181. I think this is related with the hack that was removed in triton-lang#5044, so now it should work - Same same for the `UIToFpOp`, this is now supported after triton-lang#5044 - Mixed dtype hack is not necessary either as now everything works as expected with the `convert_layout` rework. We also add proper support for `isPure` for `elementwise_inline_asm` ops On the location of the code, we just leave it in `RemoveLayoutConversion.cpp` to take advantage of the rather generic implementation of `rewriteSlice`. We could totally move this pass outside of `remove-layout-conversion`, as it's probably enough to run it once. This code will go through further changes in the near future, so we'll assess this then.
makslevental
pushed a commit
to makslevental/triton
that referenced
this pull request
Feb 19, 2025
triton-lang#5776) This reverts PR triton-lang#5673 This broke the tests on A100, even though CI was green. The CI issue will be resolved by triton-lang#5775
makslevental
pushed a commit
to makslevental/triton
that referenced
this pull request
Feb 19, 2025
I had used `.ONESHELL` to allow `cd` to effect the other commands, but it seems this also prevents the error status from propagating from anything but the last command in a rule. e.g. see triton-lang#5673 (comment)
makslevental
pushed a commit
to makslevental/triton
that referenced
this pull request
Feb 19, 2025
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
We generalise
HoistLayoutConversionto lift a givenconvert_layout dot_operandabove any chain of operations that do not require data movement. We
could totally generalise this in the future to lift it over other ops. We do
this as a first step to keep the code somewhat similar to the previous
one.
Regarding the previous limitations of
canHoistDotOpEncV2I did a bit of archeology:UIToFpOp, this is now supported after [BACKEND] Get rid of unpack/pack I32 #5044convert_layoutrework.We also add proper support for
isPureforelementwise_inline_asmopsOn the location of the code, we just leave it in
RemoveLayoutConversion.cpptotake advantage of the rather generic implementation of
rewriteSlice. We could totallymove this pass outside of
remove-layout-conversion, as it's probably enough to runit once. This code will go through further changes in the near future, so we'll assess this
then.