Add Grouped GEMM for Mixed Dtype #457
Conversation
examples/sycl/10_bmg_grouped_gemm_mixed_dtype/10_bmg_grouped_gemm_mixed_dtype.cpp
examples/sycl/10_bmg_grouped_gemm_mixed_dtype/bmg_grouped_gemm_mixed_dtype_runner.hpp
LGTM
Can we refactor include/cutlass/gemm/collective/xe_array_mma_mixed_input.hpp and include/cutlass/gemm/collective/xe_mma_mixed_input.hpp together? I found that they share a lot of common code.
I tried to factor out the differences and dispatch them to xe_mma_mixed_input.hpp. The biggest difference is in initializing the params: the array MMA must initialize the tiled copy with each individual tensor and update the group index, so it is not easy.
The quantization and operator() (the GEMM main loop) are the same, and those are the most important parts of the implementation. Can we make a base struct such as xe_mma_mixed_dtype_base that contains these common parts, and have your grouped mixed GEMM inherit from it?
True, but that approach would touch a lot of files. I will open a separate PR to deal with it.
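For illustration, a minimal sketch of the base-struct refactoring proposed above (all names here, e.g. XeMmaMixedDtypeBase and update_group, are hypothetical, not the PR's actual code): the shared dequantization and main-loop logic live in one base struct, and the grouped (array) collective adds only the per-group tiled-copy initialization and group-index tracking.

```cpp
// Hypothetical sketch of the suggested refactoring; not the actual CUTLASS API.
#include "cutlass/cutlass.h"

namespace cutlass::gemm::collective {

template <class TiledMma>
struct XeMmaMixedDtypeBase {
  // Common part: dequantize the low-precision fragment before the MMA.
  template <class Frag, class Scale>
  CUTLASS_DEVICE void dequantize(Frag& frag, Scale const& scale) {
    // ... shared quantization logic ...
  }

  // Common part: the GEMM main loop, identical for both collectives.
  template <class FragA, class FragB, class Accum>
  CUTLASS_DEVICE void operator()(FragA const& a, FragB const& b, Accum& c) {
    // ... shared main-loop logic ...
  }
};

// Non-grouped collective: only single-tensor params setup is specific.
template <class TiledMma>
struct XeMmaMixedInput : XeMmaMixedDtypeBase<TiledMma> {
  // ... initialize tiled copies from the single A/B tensors ...
};

// Grouped (array) collective: this is where the two files differ, so only
// the per-group re-initialization lives here.
template <class TiledMma>
struct XeArrayMmaMixedInput : XeMmaMixedDtypeBase<TiledMma> {
  int group_idx = 0;  // updated as the kernel moves between groups

  CUTLASS_DEVICE void update_group(int next_group) {
    group_idx = next_group;
    // ... rebuild tiled copies with the tensors of the new group ...
  }
};

}  // namespace cutlass::gemm::collective
```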
Force-pushed from 6c8c6a7 to e9d1004.
This PR adds grouped GEMM support for mixed-dtype GEMM.