[Triton] Triton a16w8 gemm preshuffle #1778

Merged
k50112113 merged 17 commits into main from
shaoclee/triton_a16w8_gemm_shuffle
Jan 9, 2026
Conversation

@k50112113
Contributor

This PR adds an a16w8 GEMM preshuffle kernel and also standardizes the return_y_pp flag for the skip_reduce option.
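For readers unfamiliar with the terms, here is a minimal plain-Python sketch of what a skip_reduce-style option typically means in a split-K GEMM with 16-bit activations and 8-bit weights. It is not the kernel from this PR: the function names, the `num_ksplit` parameter, the per-column scale layout, and the use of small integers as a stand-in for 8-bit weights are all assumptions made for illustration, as is the reading of y_pp as the per-split partial products.

```python
# Illustrative only: plain-Python split-K GEMM with low-precision weights.
# Every name below is hypothetical and is NOT the API added by this PR.

def dequant(w_q, w_scale):
    """Scale quantized weights (K x N) back to floats, one scale per column."""
    n = len(w_scale)
    return [[row[j] * w_scale[j] for j in range(n)] for row in w_q]


def gemm_split_k(x, w_q, w_scale, num_ksplit=2, skip_reduce=False):
    """Compute x (M x K) @ dequant(w_q) (K x N) by splitting the K dimension.

    With skip_reduce=True the per-split partial products are returned
    unreduced (assumed here to be what "y_pp" refers to); otherwise the
    partials are summed elementwise into the final M x N result.
    """
    m, k_dim, n = len(x), len(w_q), len(w_scale)
    w = dequant(w_q, w_scale)
    chunk = (k_dim + num_ksplit - 1) // num_ksplit  # ceil(K / num_ksplit)
    partials = []
    for s in range(num_ksplit):
        lo, hi = s * chunk, min((s + 1) * chunk, k_dim)
        partials.append(
            [[sum(x[i][k] * w[k][j] for k in range(lo, hi)) for j in range(n)]
             for i in range(m)]
        )
    if skip_reduce:
        return partials  # caller is responsible for the reduction
    return [[sum(p[i][j] for p in partials) for j in range(n)] for i in range(m)]
```

Returning the unreduced partials lets a caller fuse the reduction into a later step instead of paying for a separate pass, which is the usual motivation for exposing such a flag.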

@k50112113 k50112113 force-pushed the shaoclee/triton_a16w8_gemm_shuffle branch from 63a89ed to e5cc853 Compare January 6, 2026 21:20
@k50112113 k50112113 force-pushed the shaoclee/triton_a16w8_gemm_shuffle branch from e5cc853 to 7a0f35f Compare January 6, 2026 21:26
@k50112113 k50112113 marked this pull request as ready for review January 7, 2026 04:21
@k50112113 k50112113 requested a review from a team January 7, 2026 04:21
@k50112113 k50112113 requested a review from azaidy January 7, 2026 20:03
Contributor

@azaidy azaidy left a comment

LGTM!

@k50112113 k50112113 merged commit 5b4ee40 into main Jan 9, 2026
20 of 21 checks passed
@k50112113 k50112113 deleted the shaoclee/triton_a16w8_gemm_shuffle branch January 9, 2026 19:01
zhuyuhua-v pushed a commit that referenced this pull request Jan 14, 2026
* add weight preshuffling for triton fp8 blockscale gemm

* add config interface

* add x_scale shuffle

* import

* add default config for gfx942

* fix get_config return

* fix

* Added tuned configs for gemm a8w8 blockscale preshuffled

* Fixed tuned configs keys

* resolve comments

* resolve comments

* update config

* add kernel, add config, standard return_y_pp flag

* update config

* fix

* update config and UT

---------

Co-authored-by: Farel Lukas <farlukas@amd.com>
valarLip pushed a commit that referenced this pull request Mar 18, 2026
3 participants