Merge OpenAI Triton commit 09a2aad #6660
Merged
We make the following changes:
- when bitcasting from a floating-point type to the same-width integer
type, we apply a pseudorandom invertible function (a composition of
multiplications and xorshifts) to mix the bits, and apply the inverse
function when bitcasting back;
- the function is chosen so that floating-point negation (of nonzero
operands) maps onto integer negation, the constants {-1.0, 0.0, 1.0} map
to {-1, 0, 1} respectively, and 'negative zero' maps to the integer
`0b1000...000`;
- our unary functions include multiplications before and after XORing
with the unary operation tag in order to prevent unwanted commutativity
of unary functions;
- reciprocal is now self-inverse, and implements modular inversion if
the operand is odd in the integer domain;
- scaled dot (tcgen05 and AMD) behaves like the constituent ops;
and we get nice consequences such as `exp(-x)` behaving identically
under fpsan to `1.0 / exp(x)`.
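The bit-mixing idea above can be sketched in a few lines of Python on 32-bit integers. This is only an illustration: the constant `MUL`, the shift amount, and the function names are hypothetical and are not the ones fpsan uses, and the sketch does not reproduce the special mappings for {-1.0, 0.0, 1.0} or negative zero. It shows that a composition of odd multiplications and xorshifts is invertible, that an odd multiplication maps integer negation onto integer negation, and that modular inversion of an odd integer is self-inverse.

```python
MASK = 0xFFFFFFFF                 # 32-bit unsigned arithmetic
MUL = 0x9E3779B1                  # hypothetical odd constant, hence invertible mod 2**32
MUL_INV = pow(MUL, -1, 1 << 32)   # modular inverse (Python 3.8+)
SHIFT = 16                        # shift of at least half the width makes the xorshift self-inverse


def xorshift(x):
    """x ^ (x >> 16) on 32 bits; applying it twice returns x."""
    return (x ^ (x >> SHIFT)) & MASK


def mix(x):
    """Hypothetical invertible mixer: multiply, xorshift, multiply."""
    x = (x * MUL) & MASK
    x = xorshift(x)
    return (x * MUL) & MASK


def unmix(y):
    """Inverse of mix: undo each step in reverse order."""
    y = (y * MUL_INV) & MASK
    y = xorshift(y)
    return (y * MUL_INV) & MASK


def mod_recip(x):
    """Modular inverse mod 2**32; defined only for odd x."""
    return pow(x, -1, 1 << 32)


samples = [0, 1, 2, 0x3F800000, 0x80000000, 0xDEADBEEF, MASK]
round_trip_ok = all(unmix(mix(x)) == x for x in samples)

# An odd multiplication alone commutes with integer negation.
negation_ok = all(((-x & MASK) * MUL) & MASK == (-((x * MUL) & MASK)) & MASK
                  for x in samples)

# Modular inversion is self-inverse on odd operands.
recip_ok = all(mod_recip(mod_recip(x)) == x and (x * mod_recip(x)) & MASK == 1
               for x in samples if x % 2 == 1)
```

Note that a plain xorshift does not commute with negation in general, so the function described in the list must be built more carefully than this sketch.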
Enable Triton+LLVM symbol visibility by default to allow extensions to be natively shipped/packaged with Triton. This is part of the co-design of shipping extensions with the PyTorch Triton release w/ @atalman.
This PR adds version checking. If `TRITON_PLUGIN_VERSION_CHECK` is unset, only the release version is checked (i.e., that both the plugin and Triton were built on the same version, say "3.7.0"). If `TRITON_PLUGIN_VERSION_CHECK=on`, both the release version and the git hash of the build must match. If `TRITON_PLUGIN_VERSION_CHECK=off`, nothing is checked beyond the plugin API version itself.
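The three modes can be summarized with a small decision function. This is a sketch of the policy as described, not the code from the PR: the function name, the dictionary keys, and the treatment of unrecognized values (handled like "unset") are assumptions made for illustration.

```python
def versions_compatible(mode, triton, plugin):
    """Decide whether a plugin may load, given the check mode.

    mode   -- value of TRITON_PLUGIN_VERSION_CHECK, or None if unset
    triton -- dict with hypothetical keys "api", "release", "git_hash"
    plugin -- dict with the same keys
    """
    # The plugin API version is checked in every mode.
    if triton["api"] != plugin["api"]:
        return False
    if mode == "off":
        return True
    # Unset (and, by assumption, any other value): release version must match.
    if triton["release"] != plugin["release"]:
        return False
    # "on": the git hash must match as well.
    if mode == "on":
        return triton["git_hash"] == plugin["git_hash"]
    return True


triton_build = {"api": 1, "release": "3.7.0", "git_hash": "09a2aad"}
plugin_build = {"api": 1, "release": "3.7.0", "git_hash": "43badaa"}

unset_result = versions_compatible(None, triton_build, plugin_build)
on_result = versions_compatible("on", triton_build, plugin_build)
off_result = versions_compatible("off", triton_build, plugin_build)
```

With the example builds above, the release versions agree but the git hashes differ, so only the strict `on` mode rejects the plugin.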
tl.rand returns values in the half-open interval [0, 1), not the closed interval [0, 1]. The test assertion used `x <= 1`, which would have accepted 1.0 as a valid output. Philox-based float generation masks the upper mantissa bits and sets the exponent to produce values strictly less than 1.0. A value of exactly 1.0 indicates a bug (e.g. broken seed delivery returning all-ones from the uint-to-float conversion).

# New contributor declaration
- [x] I am not making a trivial change, such as fixing a typo in a comment.
- [x] I have written a PR description following these [rules](https://cbea.ms/git-commit/#why-not-how).
- [x] I have run `pre-commit run --from-ref origin/main --to-ref HEAD`.
- Select one of the following.
  - [ ] I have added tests.
    - `/test` for `lit` tests
    - `/unittest` for C++ tests
    - `/python/test` for end-to-end tests
  - [x] This PR does not need a test because it's fixing a test.
- Select one of the following.
  - [x] I have not added any `lit` tests.
  - [ ] The `lit` tests I have added follow these [best practices](https://mlir.llvm.org/getting_started/TestingGuide/#filecheck-best-practices), including the "tests should be minimal" section. (Usually running Python code and using the instructions it generates is not minimal.)

Co-authored-by: Wes Turner <westurner@users.noreply.github.com>
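To see why a uniform float built from random bits stays strictly below 1.0, consider the following generic mantissa-filling conversion. It is a common textbook technique shown here under the assumption that it is a fair stand-in; it is not taken from Triton's source, and the function name is hypothetical. The point is that even an all-ones input yields a value below 1.0, so the tightened check `0 <= x < 1` holds.

```python
import struct


def bits_to_unit_float(u32):
    """Map a 32-bit unsigned integer to a float32 value in [0, 1).

    Keeps the top 23 bits as the mantissa, sets the exponent field to that
    of 1.0 (giving a value in [1, 2)), then subtracts 1.0.
    """
    mantissa = (u32 & 0xFFFFFFFF) >> 9   # top 23 bits
    bits = 0x3F800000 | mantissa         # exponent of 1.0, random mantissa
    (value,) = struct.unpack("<f", struct.pack("<I", bits))
    return value - 1.0


lowest = bits_to_unit_float(0x00000000)
highest = bits_to_unit_float(0xFFFFFFFF)

# The corrected form of the assertion: the upper bound is exclusive.
in_range = all(0.0 <= bits_to_unit_float(u) < 1.0
               for u in (0, 1, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF))
```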
We do so by following the PTX docs and our LLVM lowerings:
1. In PTX, a barrier flips when both `arrivals == 0` and `tx-count == 0`.
2. In PTX, an expect implies a commit of 1.
3. In Triton, we lower `ttng.expect` as an expect on the leader CTA and as a commit of 1.

We model all these points in consan.
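The first two points can be captured in a toy single-CTA model. This is a simplified sketch for intuition only, not consan's implementation: the class and method names are hypothetical, and it ignores the leader-CTA aspect of point 3.

```python
class Barrier:
    """Toy model of a barrier with a pending-arrival count and a tx-count."""

    def __init__(self, arrival_count):
        self.arrival_count = arrival_count  # arrivals required per phase
        self.pending = arrival_count        # arrivals still outstanding
        self.tx_count = 0                   # bytes still expected
        self.phase = 0

    def _maybe_flip(self):
        # Point 1: flip only when both counters have reached zero.
        if self.pending == 0 and self.tx_count == 0:
            self.phase ^= 1
            self.pending = self.arrival_count

    def commit(self, count=1):
        self.pending -= count
        self._maybe_flip()

    def expect(self, nbytes):
        # Point 2: an expect raises the tx-count and implies a commit of 1.
        self.tx_count += nbytes
        self.commit(1)

    def complete_tx(self, nbytes):
        self.tx_count -= nbytes
        self._maybe_flip()


barrier = Barrier(arrival_count=1)
barrier.expect(128)
phase_after_expect = barrier.phase   # arrivals are done, bytes still pending
barrier.complete_tx(128)
phase_after_tx = barrier.phase       # both counters are zero, so the phase flipped
```

In this model the expect alone does not flip the barrier, because the tx-count is still nonzero; the flip happens once the expected bytes have landed.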
(Coauthored with Thomas Raoux)
75cefbe to 0ab6fc0
Signed-off-by: Witold Dziurdz <witold.dziurdz@intel.com>
8994d35 to 22f1461
43badaa to 09a2aad
Signed-off-by: Witold Dziurdz <witold.dziurdz@intel.com>
56e590d to 130d417
whitneywhtsang approved these changes Apr 16, 2026
exolyr added a commit that referenced this pull request Apr 17, 2026
…kiplist Upstream commit 8d4c6cd ([KERNELS] Fix tmem overflow for fp32 matmul, triton-lang/triton#9967), merged via PR #6660, added 1 new fp32 test case to _build_test_op_cases() in test_matmul.py before the swiglu section. This shifted all swiglu parametrize indices by +1, causing pytest-skip's --select-fail-on-missing to reject the stale IDs. Increment each swiglu_optsN by 1 (89→90, 90→91, 93→94, 94→95). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
exolyr added a commit that referenced this pull request Apr 17, 2026
…kiplist (#6703) Upstream commit 8d4c6cd, merged via PR #6660, added 1 new fp32 test case to _build_test_op_cases() in test_matmul.py before the swiglu section. This shifted all swiglu parametrize indices by +1, causing pytest-skip's --select-fail-on-missing to reject the stale IDs. Increment each swiglu_optsN by 1 (89→90, 90→91, 93→94, 94→95).
This PR changes the Triton base from 43badaa to 09a2aad (Apr 9).
Pass rate: 99.55%->99.55%