[Backend] Refactor OptimizeDescriptorEncoding to common path #9709

Merged
antiagainst merged 6 commits into triton-lang:main from sriakrish:nfc-refactor-opt-desc-encoding on Mar 20, 2026

Conversation

@sriakrish
Contributor

  • Move the common utilities of this pass to DescriptorUtils
  • Provide functors to customize backend-specific decisions

New contributor declaration

  • I am not making a trivial change, such as fixing a typo in a comment.

  • I have written a PR description following these
    rules.

  • I have run pre-commit run --from-ref origin/main --to-ref HEAD.

  • Select one of the following.

    • I have added tests.
      • /test for lit tests
      • /unittest for C++ tests
      • /python/test for end-to-end tests
    • This PR does not need a test because it is an NFC refactoring.
  • Select one of the following.

    • I have not added any lit tests.
    • The lit tests I have added follow these best practices,
      including the "tests should be minimal" section. (Usually running Python code
      and using the instructions it generates is not minimal.)

@sriakrish sriakrish requested a review from ptillet as a code owner March 13, 2026 02:09
@sriakrish
Contributor Author

sriakrish commented Mar 13, 2026

We are working on improving our handling of tensor descriptors. The work is split into two parts:

  • Part 1 (this PR): NFC refactor moving the common utilities to DescriptorUtils and keeping backend-specific decisions, such as choosing a fallback layout and finding an encoding from uses, in the pass
  • Part 2: Add an OptimizeDescriptorEncoding pass for the AMD backend, reusing the common functions from DescriptorUtils

cc: @antiagainst

@ThomasRaoux
Collaborator

What part of the logic will be different for AMD? The logic in this pass seems fairly generic except for the part about NVMMASharedEncodingAttr, which is a layout specific to TMAs. I wonder if we really need a fully separate pass.

@sriakrish
Contributor Author

sriakrish commented Mar 13, 2026

> What part of the logic will be different for AMD? The logic in this pass seems fairly generic except for the part about NVMMASharedEncodingAttr, which is a layout specific to TMAs. I wonder if we really need a fully separate pass.

Our current implementation requires these four functions:

  1. getFallbackSharedLayout: we assign padded shared layouts as the fallback in most cases, determined by hardware features, and use swizzled layouts only under certain conditions.
  2. updateEncodingForShape: consequently, this function handles both padded and swizzled layouts.
  3. findEncodingFromUsers: here we walk the uses of the descriptor load, check whether they lead to dot operands, and derive padded layouts. It essentially moves the padding decisions into this pass.
  4. forcedToDefault: we need this one because of ReinterpretTensorDescOp; we force the default encoding only on call and return ops.
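The four hooks above could be grouped into a backend interface roughly like the following. This is a hypothetical sketch with stand-in types (the real pass works with MLIR attributes and ops, not these toy structs); names mirror the comment, signatures are illustrative only:

```cpp
#include <cstdint>
#include <vector>

// Stand-in types; the real code uses MLIR's Attribute, RankedTensorType, etc.
struct Encoding { bool padded = false; };
struct DescriptorLoad { std::vector<int64_t> shape; };

// Hypothetical interface grouping the four backend-specific decisions.
class DescriptorEncodingBackend {
public:
  virtual ~DescriptorEncodingBackend() = default;
  // 1. Pick a fallback shared layout (e.g. padded on AMD in most cases).
  virtual Encoding getFallbackSharedLayout(const DescriptorLoad &ld) = 0;
  // 2. Adjust an encoding when the descriptor's block shape changes.
  virtual Encoding updateEncodingForShape(Encoding enc,
                                          const std::vector<int64_t> &shape) = 0;
  // 3. Walk the load's users to derive an encoding (e.g. from dot operands).
  virtual Encoding findEncodingFromUsers(const DescriptorLoad &ld) = 0;
  // 4. Force the default encoding at boundaries such as call/return ops.
  virtual bool forcedToDefault(const DescriptorLoad &ld) = 0;
};

// A toy AMD-flavored implementation: prefers padded layouts as the fallback.
class ToyAmdBackend : public DescriptorEncodingBackend {
public:
  Encoding getFallbackSharedLayout(const DescriptorLoad &) override {
    return Encoding{/*padded=*/true};
  }
  Encoding updateEncodingForShape(Encoding enc,
                                  const std::vector<int64_t> &) override {
    return enc; // shape-independent in this toy version
  }
  Encoding findEncodingFromUsers(const DescriptorLoad &ld) override {
    return getFallbackSharedLayout(ld);
  }
  bool forcedToDefault(const DescriptorLoad &) override { return false; }
};
```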

@sriakrish sriakrish force-pushed the nfc-refactor-opt-desc-encoding branch from 323776d to c63d6db Compare March 14, 2026 14:58
@sriakrish sriakrish marked this pull request as draft March 14, 2026 22:42
@sriakrish sriakrish force-pushed the nfc-refactor-opt-desc-encoding branch from c63d6db to ca53687 Compare March 19, 2026 23:40
@sriakrish sriakrish marked this pull request as ready for review March 19, 2026 23:42
@antiagainst antiagainst changed the title [NFC] Refactor OptimizeDescriptorEncoding [Backend][NFC] Refactor OptimizeDescriptorEncoding to common path Mar 19, 2026
@antiagainst antiagainst changed the title [Backend][NFC] Refactor OptimizeDescriptorEncoding to common path [Backend] Refactor OptimizeDescriptorEncoding to common path Mar 19, 2026
@sriakrish
Contributor Author

@ThomasRaoux @antiagainst

I have updated this PR. We now require only two callbacks:

  1. buildFallbackSharedEncoding - builds the backend-specific fallback encoding
  2. isCompatibleEncoding - checks whether an encoding is compatible with the backend

The rest of the infrastructure is common.

updateEncodingForShape is now shared between both backends.
findLoadEncodingFromUsers is also common to both backends. It first checks for a discardable attribute tt.desired_encoding on descriptor loads; the rest of it is unchanged. On the AMD side, we populate this attribute with a padded encoding derived from its usage in dot, and let findLoadEncodingFromUsers simply pick it up when it is available.
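The lookup order described above (attribute hint first, then user walk) could be sketched as follows. This is a self-contained toy with stand-in types; the real findLoadEncodingFromUsers operates on MLIR ops and attributes, and the `tt.desired_encoding` name comes from the discussion:

```cpp
#include <map>
#include <optional>
#include <string>
#include <vector>

// Stand-in for an encoding attribute; the real code uses MLIR Attributes.
using Encoding = std::string;

struct LoadOp {
  // Discardable attributes on the op, keyed by name (stand-in for MLIR's
  // attribute dictionary).
  std::map<std::string, Encoding> attrs;
  // Encodings implied by the load's users (e.g. dot operands).
  std::vector<Encoding> userEncodings;
};

// Honor a pre-populated "tt.desired_encoding" hint first, then fall back
// to walking the users.
std::optional<Encoding> findLoadEncodingFromUsers(const LoadOp &load) {
  if (auto it = load.attrs.find("tt.desired_encoding"); it != load.attrs.end())
    return it->second;
  for (const Encoding &enc : load.userEncodings)
    return enc; // first user-derived encoding wins in this toy version
  return std::nullopt;
}
```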

```cpp
  assignMemoryLayouts(m);

  // Fallback shared encoding callback
  auto buildFallbackSharedEncoding =
```
Member

Rather than using a lambda, maybe just create it as a free function, like isTMACompatibleEncoding, to be symmetric.

```cpp
        ctx, swizEnc.getVec(), swizEnc.getPerPhase(), swizEnc.getMaxPhase(),
        order, newCgaEnc);
  }
  if (auto paddedEnc = dyn_cast<ttg::PaddedSharedEncodingAttr>(encoding)) {
```
Member

I guess we can remove this part and add it later together with the AMD logic.

```cpp
/// Utility class to assign memory layouts to tensor descriptors in a module.
class AssignDescriptorMemoryLayouts {
public:
  AssignDescriptorMemoryLayouts() = default;
```
Member

Is this constructor needed?

@sriakrish
Contributor Author

@antiagainst Thank you, PR updated with suggested revisions

Collaborator

@ThomasRaoux left a comment

Makes sense, but I still think it should be refactored a little bit; hopefully my comments make sense to you.

Comment on lines +16 to +25
```cpp
struct DescriptorAnalysisCallbacks {
  /// Callback to check for compatible shared encoding
  llvm::function_ref<bool(Attribute)> isCompatibleSharedEncoding;

  /// create a fallback encoding given the shape, order, cga layout and
  /// element type
  llvm::function_ref<Attribute(mlir::MLIRContext *, ArrayRef<int64_t>,
                               ArrayRef<unsigned>, CGAEncodingAttr, Type)>
      buildFallbackSharedEncoding;
};
```
Collaborator

A struct with a bunch of callbacks seems like a convoluted way to make virtual functions. How about we make those virtual functions in AssignDescriptorMemoryLayouts or DescriptorAnalysisCallbacks?

```cpp
#include "llvm/ADT/PriorityWorklist.h"
#include <unordered_set>

namespace ttg = mlir::triton::gpu;
```
Collaborator

Overall the idea to separate things out makes sense, but on the style side I think calling this file Utils is misleading. At this point it is really the whole implementation of the pass rather than a set of generic utility functions.

My suggestion is to make a generic class with some overridable functions and inherit from it in a target-specific way.

I know the result is very similar, but I think mixing up passes and utils is going to be confusing.

Contributor Author

Thank you for the feedback. Agreed, I have renamed the file.

I have a new revision which addresses the following:

  • A base class AssignDescriptorMemoryLayouts now implements the core logic for layout assignment and provides virtual methods for backends to override. Moved the core logic functions into the class.
  • Renamed the file from DescriptorUtils to DescriptorMemoryLayouts.
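The shape of the final design could be sketched like this. It is a hypothetical, self-contained toy with stand-in types, not the actual Triton code: the base class owns the core loop and exposes two virtual hooks (the two callbacks named earlier), which an AMD subclass overrides:

```cpp
#include <string>
#include <vector>

using Encoding = std::string; // stand-in for an MLIR encoding attribute

// Base class: owns the core layout-assignment logic and exposes two
// virtual hooks for backends. Names follow the discussion; the signatures
// are illustrative only.
class AssignDescriptorMemoryLayouts {
public:
  virtual ~AssignDescriptorMemoryLayouts() = default;

  // Core driver: keep a use's encoding when the backend accepts it,
  // otherwise substitute the backend's fallback.
  std::vector<Encoding> run(const std::vector<Encoding> &uses) {
    std::vector<Encoding> out;
    for (const Encoding &enc : uses)
      out.push_back(isCompatibleSharedEncoding(enc)
                        ? enc
                        : buildFallbackSharedEncoding());
    return out;
  }

protected:
  virtual bool isCompatibleSharedEncoding(const Encoding &enc) = 0;
  virtual Encoding buildFallbackSharedEncoding() = 0;
};

// Toy AMD specialization: accepts padded layouts and falls back to one.
class AmdAssignDescriptorMemoryLayouts : public AssignDescriptorMemoryLayouts {
protected:
  bool isCompatibleSharedEncoding(const Encoding &enc) override {
    return enc == "padded";
  }
  Encoding buildFallbackSharedEncoding() override { return "padded"; }
};
```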

```cpp
#include <unordered_set>

namespace mlir::triton::gpu {
// Forward declarations
```
Collaborator

nit: meaningless comment

Contributor Author

Fixed it.

- Move common utilities of this to DescriptorUtils
- Provide functors to customize backend-specific decisions
- Introduce only two callbacks
  - Callback for building the backend fallback layout
  - Callback for checking backend-compatible encoding
- Move updateEncodingForShape to a common place
- Adapt updateEncodingForShape to handle padded layouts
- Fix comment
@sriakrish sriakrish force-pushed the nfc-refactor-opt-desc-encoding branch from 92ff3a7 to e028faf Compare March 20, 2026 19:00
@antiagainst antiagainst merged commit d48908a into triton-lang:main Mar 20, 2026
9 checks passed
raymondtay pushed a commit to raymondtay/triton that referenced this pull request Mar 22, 2026
…lang#9709)

A base class `AssignDescriptorMemoryLayouts` now implements
the core logic for layout assignment and provides virtual methods
for backend overloads. Moved the core logic functions to the class.
jvican pushed a commit to jvican/triton that referenced this pull request Mar 27, 2026
plognjen pushed a commit to plognjen/triton that referenced this pull request Apr 14, 2026

3 participants