[RELAND] Infer src/dst of allowReorder reshape#9997

Merged: neildhar merged 3 commits into main from neildhar/pr9997 on Apr 15, 2026

@neildhar neildhar commented Apr 10, 2026

Reland of #9926.

Always infer the src/dst of reshapes, even if allowReorder is set. The
result is valid for allowReorder reshapes, even if there isn't a single
canonical encoding. When the existing encoding is one of the possible
results, we prefer that to minimize changes.

This allows inference to always succeed on reshapes, and any heuristics
on whether to use the inferred value can be maintained by the caller.
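
The preference rule described above can be illustrated with a toy model. This is only a sketch: encodings are stood in for by plain strings, and `pick_encoding` and the idea of an explicit candidate list are hypothetical names for illustration, not part of Triton's API. The real inference works on layout objects and need not enumerate candidates.

```python
from typing import Optional, Sequence


def pick_encoding(candidates: Sequence[str], existing: Optional[str]) -> str:
    """Hypothetical helper: choose one encoding from a set of valid ones."""
    if not candidates:
        raise ValueError("expected at least one valid encoding")
    # Prefer the encoding already present when it is one of the valid
    # results, so that inference changes as little as possible.
    if existing is not None and existing in candidates:
        return existing
    # Otherwise fall back to a deterministic choice (here: the first one).
    return candidates[0]


# The existing encoding is kept when it is valid, replaced when it is not.
kept = pick_encoding(["enc_a", "enc_b"], existing="enc_b")
replaced = pick_encoding(["enc_a", "enc_b"], existing="enc_c")
```

Because the helper always returns some valid encoding, a caller that wants stricter behaviour (for example, rejecting any change) can compare the result against the existing encoding itself.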

One example I identified while looking at this was that allowReorder
reshapes will currently fail backward remat in RemoveLayoutConversions
if the reshape cannot be rematerialised with the same source encoding.
This PR instead changes RemoveLayoutConversions to check specifically
whether the reshape has been marked as efficient, and otherwise to go
ahead with the remat. (This is a potentially perf-sensitive change.)
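
As a rough illustration of the behaviour change for an allowReorder reshape, the two policies can be written as predicates. This is a sketch based only on the wording of the description: the function names are hypothetical, and how the pass treats a reshape marked as efficient in detail is not stated here, so the sketch simply treats that marking as blocking the remat.

```python
def remat_allowed_before(keeps_src_encoding: bool) -> bool:
    # Described old behaviour: backward remat failed unless the reshape
    # could be rematerialised with the same source encoding.
    return keeps_src_encoding


def remat_allowed_after(marked_efficient: bool) -> bool:
    # Described new behaviour: only the "efficient" marking is checked;
    # otherwise the remat goes ahead (assumption: the marking blocks it).
    return not marked_efficient


# Print both decisions for every combination of the two inputs.
for keeps_src in (True, False):
    for efficient in (True, False):
        print(
            f"keeps_src={keeps_src} efficient={efficient} "
            f"before={remat_allowed_before(keeps_src)} "
            f"after={remat_allowed_after(efficient)}"
        )
```

The case that changes under this model is a reshape that is not marked efficient and cannot keep its source encoding: it was rejected before and is rematerialised now.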


@neildhar neildhar changed the base branch from main to neildhar/pr9996 April 10, 2026 20:03
@neildhar neildhar marked this pull request as ready for review April 11, 2026 01:55
neildhar and others added 3 commits April 14, 2026 14:49
isExpensiveCat does not reflect the constraint we have in lowering,
which is that the number of unique result elements per thread must be
equal to the total number of unique operand elements per thread. This
means that we can sometimes fold `CatOp` into layout conversions that
have destination layouts that violate this requirement.

Rename it to `isLegalCatEncoding` to reflect that it is actually a
correctness requirement, and update it to reflect the actual constraint.
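
The constraint stated in this commit message can be written as a simple counting check. This is a minimal sketch under the assumption that the per-thread unique element counts are already known as integers; `cat_counts_match` is a hypothetical name and does not reproduce the actual `isLegalCatEncoding` implementation.

```python
from typing import Sequence


def cat_counts_match(result_unique_per_thread: int,
                     operand_unique_per_thread: Sequence[int]) -> bool:
    """Hypothetical check: the result must hold exactly as many unique
    elements per thread as all operands hold together."""
    return result_unique_per_thread == sum(operand_unique_per_thread)


# Two operands with 4 unique elements per thread each need a result
# layout with 8 unique elements per thread.
legal = cat_counts_match(8, [4, 4])
illegal = cat_counts_match(4, [4, 4])
```
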
Check that the shapes and encodings of CatOp are valid and can be
lowered.
Base automatically changed from neildhar/pr9996 to main April 15, 2026 13:38
@neildhar neildhar merged commit 441faac into main Apr 15, 2026
12 of 14 checks passed
@neildhar neildhar deleted the neildhar/pr9997 branch April 15, 2026 16:41
raymondtay pushed a commit to raymondtay/triton that referenced this pull request Apr 18, 2026
nurmukhametov added a commit to ROCm/xla that referenced this pull request Apr 20, 2026
…iton/commits/6ea516a6e4cb1878fb6969d222e23294712b3955)

[List of integrated
commits](triton-lang/triton@6a0b546...6ea516a)

Adapt to upstream API changes:
* `DialectInferLayoutInterface::inferReshapeOpEncoding` gained an
  `allowReorder` parameter in triton-lang/triton#9997.
* Rebase `mma_limit_pred.patch` and
  `no_accelerate_through_broadcast.patch`
  against the upstream rename of `DescriptorLoadOp` to
  `DescriptorLoadLikeOpInterface` in `AccelerateMatmul.cpp`. Patch
  semantics are unchanged.