[matmul kernel] [nvfp4] Use flex ctx out scale - to support tensor scale with nvfp4 output by tristan-oai · Pull Request #9854 · triton-lang/triton

tristan-oai · 2026-03-26T01:11:55Z

nvfp4 has a tensor-wide scale. This PR adds support for this scale when the matmul output needs to be quantized to nvfp4. We use flex ctx to carry the scale.
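
To make the role of the tensor-wide scale concrete, here is a minimal pure-Python sketch of two-level NVFP4-style quantization. It is not the kernel code from this PR and does not use the flex ctx API. The function names, the 16-element block size, and the convention `tensor_scale = amax / (6 * 448)` are assumptions chosen for illustration, and block scales are kept as plain floats rather than rounded to FP8 E4M3.

```python
# Illustrative sketch only: not the Triton kernel code from this PR.
# All names below are hypothetical.
E2M1_VALUES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]  # FP4 E2M1 magnitudes
E2M1_MAX = 6.0
E4M3_MAX = 448.0  # largest finite FP8 E4M3 value (format used for block scales)
BLOCK = 16        # assumed NVFP4 block size


def round_to_e2m1(v):
    """Round v to the nearest signed E2M1 value (ties go to the smaller magnitude)."""
    mag = min(E2M1_VALUES, key=lambda c: abs(abs(v) - c))
    return -mag if v < 0 else mag


def quantize_nvfp4(x):
    """Return (codes, block_scales, tensor_scale) for a flat list of floats.

    Assumption: block scales stay as Python floats; real hardware would
    store them in FP8 E4M3, which adds a further rounding step.
    """
    amax = max(abs(v) for v in x)
    # Tensor-wide scale, chosen so the largest block scale fits the E4M3 range.
    tensor_scale = amax / (E2M1_MAX * E4M3_MAX) if amax > 0 else 1.0
    codes, block_scales = [], []
    for start in range(0, len(x), BLOCK):
        block = x[start:start + BLOCK]
        block_amax = max(abs(v) for v in block)
        # Per-block scale, expressed relative to the tensor-wide scale.
        scale = block_amax / E2M1_MAX / tensor_scale if block_amax > 0 else 1.0
        block_scales.append(scale)
        codes.extend(round_to_e2m1(v / (scale * tensor_scale)) for v in block)
    return codes, block_scales, tensor_scale


def dequantize_nvfp4(codes, block_scales, tensor_scale):
    """Reconstruct approximate values: code * block scale * tensor-wide scale."""
    return [c * block_scales[i // BLOCK] * tensor_scale for i, c in enumerate(codes)]
```

In this sketch the tensor-wide scale is a single value that must travel with the quantized output, because every element is reconstructed as `code * block_scale * tensor_scale`. That is the kind of per-tensor value the description says flex ctx carries.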

New contributor declaration

- I am not making a trivial change, such as fixing a typo in a comment.
- I have written a PR description following these [rules](https://cbea.ms/git-commit/#why-not-how).
- I have run `pre-commit run --from-ref origin/main --to-ref HEAD`.
- Select one of the following.
  - I have added tests.
    - `/test` for `lit` tests
    - `/unittest` for C++ tests
    - `/python/test` for end-to-end tests
  - This PR does not need a test because `FILL THIS IN`.
- Select one of the following.
  - I have not added any `lit` tests.
  - The `lit` tests I have added follow these [best practices](https://mlir.llvm.org/getting_started/TestingGuide/#filecheck-best-practices), including the "tests should be minimal" section. (Usually running Python code and using the instructions it generates is not minimal.)


tristan-oai force-pushed the tristan/mx-tensor-scale branch from 2f50e42 to 4131a8b on March 26, 2026 18:05

tristan-oai added 2 commits March 26, 2026 13:12

use flex ctx out scale - to support tensor scale with nvfp4

4edf037

fix output scale and add test coverage

27ec1e1

tristan-oai force-pushed the tristan/mx-tensor-scale branch from d8bf707 to 27ec1e1 on March 26, 2026 20:12

Merge remote-tracking branch 'upstream/main' into tristan/mx-tensor-scale

5c94d95

tristan-oai changed the title ~~use flex ctx out scale - to support tensor scale with nvfp4~~ [matmul kernel] [nvfp4] Use flex ctx out scale - to support tensor scale with nvfp4 output Apr 17, 2026

clean up

8ccc1ac

tristan-oai marked this pull request as ready for review April 17, 2026 19:05

tristan-oai requested a review from ptillet as a code owner April 17, 2026 19:05

ThomasRaoux requested a review from aeng-openai April 17, 2026 19:08

aeng-openai approved these changes Apr 17, 2026

tristan-oai added 2 commits April 17, 2026 15:57

fix h100 tests

3b70e97

Skip NVFP4 output matmul tests on A100

459b038

tristan-oai merged commit 3123400 into triton-lang:main Apr 18, 2026
9 checks passed

tristan-oai merged 6 commits into triton-lang:main from tristan-oai:tristan/mx-tensor-scale
