[TRITON] [GLUON] GFX1250 Gluon MoE A4W4 Kernel#2513

Open
farlukas wants to merge 44 commits into
mainfrom
farlukas/moe-a4w4-gfx1250

@farlukas
Contributor

@farlukas farlukas commented Mar 27, 2026

Motivation

Port MoE A4W4 kernel from Triton to Gluon for GFX1250.

Technical Details

The Gluon kernel keeps the same function signature as the Triton version but uses the TDM features available in Gluon.

NOTE: Requires #2583.

Test Plan

Added a backend parameter to the Triton unit tests to switch between the Triton and Gluon kernels.

pytest op_tests/triton_tests/moe/test_moe_gemm_a4w4.py # runs both triton and gluon tests
pytest op_tests/triton_tests/moe/test_moe_gemm_a4w4.py -k "gluon" # runs just gluon tests
pytest op_tests/triton_tests/moe/test_moe_gemm_a4w4.py -k "triton" # runs just triton tests
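
For readers unfamiliar with the pattern, here is a minimal sketch of how a backend parameter can make the `-k` filters above work. It is not the code from this PR: `_moe_gemm_triton`, `_moe_gemm_gluon`, `run_moe_gemm`, and `test_backends_agree` are hypothetical stand-ins, and the toy arithmetic only illustrates the dispatch, assuming both backends share one signature.

```python
import pytest


# Hypothetical stand-ins for the two kernel entry points (not the PR's code).
def _moe_gemm_triton(x, w):
    return [xi * w for xi in x]


def _moe_gemm_gluon(x, w):
    return [xi * w for xi in x]


# Map each backend name to its implementation; both share one signature.
BACKENDS = {"triton": _moe_gemm_triton, "gluon": _moe_gemm_gluon}


def run_moe_gemm(x, w, backend):
    """Dispatch to the selected backend; raise on an unknown name."""
    try:
        kernel = BACKENDS[backend]
    except KeyError:
        raise ValueError(f"unknown backend: {backend!r}") from None
    return kernel(x, w)


# pytest puts the parameter value into the test ID, for example
# test_backends_agree[gluon], so a -k expression can select on it.
@pytest.mark.parametrize("backend", sorted(BACKENDS))
def test_backends_agree(backend):
    x, w = [1.0, 2.0, 3.0], 2.0
    assert run_moe_gemm(x, w, backend) == [2.0, 4.0, 6.0]
```

Because the backend name is a plain string parameter, covering another backend in such a setup would only need one more dictionary entry.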

python3 op_tests/op_benchmarks/triton/bench_moe_gemm_a4w4.py --M 32 --shape 1024 7168 --experts 1 1
python3 op_tests/op_benchmarks/triton/bench_moe_gemm_a4w4.py --M 2048 --shape 1024 7168 --experts 1 1
python3 op_tests/op_benchmarks/triton/bench_moe_gemm_a4w4.py --M 32 --shape 1024 7168 --experts 4 1
python3 op_tests/op_benchmarks/triton/bench_moe_gemm_a4w4.py --M 2048 --shape 1024 7168 --experts 4 1

Test Result

The following unit tests have passed:

  • Base (no HBM swizzling, no gather, no scatter, no gammas, no activations, no fused quant)
  • With HBM swizzling
  • With gather
  • With scatter
  • With gammas
  • With swiglu activation
  • With fused quant

Submission Checklist

@github-actions
Contributor

🏷️ CI Guide

Runs automatically on every PR:

  • ✅ Pre-checks (submodule verification, code formatting)
  • ✅ Aiter op tests (gfx942 + gfx950)
  • ✅ Triton tests (only when aiter/ops/triton/** or related paths are changed)

Extended tests (opt-in via labels):

Label            Tests
ci:triton-355    Run Triton tests on MI355 in addition to MI325
ci:sglang        SGLang integration tests
ci:atom          ATOM benchmark (DeepSeek-R1 + GPT-OSS)
ci:vllm          vLLM benchmark
ci:all           All of the above

Add labels via the sidebar or gh pr edit 2513 --add-label <label>

@farlukas farlukas marked this pull request as ready for review March 27, 2026 18:36
@farlukas farlukas requested a review from a team March 27, 2026 18:36
@farlukas farlukas changed the title GFX1250 Gluon MoE A4W4 Kernel [TRITON] [GLUON] GFX1250 Gluon MoE A4W4 Kernel Mar 27, 2026
Contributor

@azaidy azaidy left a comment

Look at individual comments

Comment thread aiter/ops/triton/_gluon_kernels/moe/moe_op_gemm_a4w4.py Outdated
Comment thread aiter/ops/triton/_gluon_kernels/moe/moe_op_gemm_a4w4.py Outdated
Comment thread aiter/ops/triton/_gluon_kernels/moe/moe_op_gemm_a4w4.py Outdated
Comment thread aiter/ops/triton/_gluon_kernels/moe/moe_op_gemm_a4w4.py Outdated
Comment thread aiter/ops/triton/_gluon_kernels/moe/moe_op_gemm_a4w4.py Outdated
Comment thread aiter/ops/triton/_gluon_kernels/moe/moe_op_gemm_a4w4.py Outdated
Comment thread aiter/ops/triton/_gluon_kernels/moe/moe_op_gemm_a4w4.py Outdated
Comment thread aiter/ops/triton/_gluon_kernels/moe/moe_op_gemm_a4w4.py Outdated
Comment thread aiter/ops/triton/_gluon_kernels/moe/moe_op_gemm_a4w4.py Outdated
Comment thread aiter/ops/triton/_gluon_kernels/moe/moe_op_gemm_a4w4.py Outdated
Comment thread op_tests/triton_tests/moe/test_moe_gemm_a4w4.py Outdated
Comment thread op_tests/triton_tests/moe/test_moe_gemm_a4w4.py Outdated
Comment thread aiter/ops/triton/_gluon_kernels/moe/moe_op_gemm_a4w4.py Outdated
Comment thread aiter/ops/triton/_gluon_kernels/moe/moe_op_gemm_a4w4.py Outdated
Comment thread aiter/ops/triton/_gluon_kernels/moe/moe_op_gemm_a4w4.py Outdated
Comment thread op_tests/op_benchmarks/triton/bench_moe_gemm_a4w4.py Outdated
Comment thread aiter/ops/triton/_gluon_kernels/moe/moe_op_gemm_a4w4.py Outdated
Comment thread aiter/ops/triton/_gluon_kernels/moe/moe_op_gemm_a4w4.py
Comment thread aiter/ops/triton/_gluon_kernels/moe/moe_op_gemm_a4w4.py
@farlukas farlukas requested a review from vgokhale April 30, 2026 19:39

3 participants