
Conversation


@neurusL neurusL commented Sep 19, 2025

In tvm/python/tvm/relax/backend/cuda/flashinfer.py, added a gen_grouped_gemm_module function.
In tvm/tests/python/relax/test_group_gemm_flashinfer.py, added tests covering different combinations of:

  • input and output dtypes: ("float8_e4m3fn", "float8_e4m3fn", "bfloat16") and ("float8_e4m3fn", "float8_e4m3fn", "float16")
  • scale granularity of m, n, k: (1, 128, 128)
  • scale major mode: "MN", "K"
  • mma_sm: 1, 2
  • different batch sizes and m_sizes
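
Hypothetically, the test matrix above could be expressed with pytest parametrization along these lines (the parameter names and the test body are assumptions for illustration, not the PR's actual code):

```python
import pytest

# Dtype combinations listed in the PR description: (input, weight, output).
DTYPE_COMBOS = [
    ("float8_e4m3fn", "float8_e4m3fn", "bfloat16"),
    ("float8_e4m3fn", "float8_e4m3fn", "float16"),
]


@pytest.mark.parametrize("in_dtype,weight_dtype,out_dtype", DTYPE_COMBOS)
@pytest.mark.parametrize("scale_granularity", [(1, 128, 128)])
@pytest.mark.parametrize("scale_major_mode", ["MN", "K"])
@pytest.mark.parametrize("mma_sm", [1, 2])
def test_grouped_gemm(in_dtype, weight_dtype, out_dtype,
                      scale_granularity, scale_major_mode, mma_sm):
    # Each combination would build the FlashInfer grouped-GEMM module and
    # compare against a reference implementation (elided in this sketch).
    ...
```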

This PR replaces closed PR #18322
@MasterJH5574 please review the PR, thanks!

@MasterJH5574 MasterJH5574 self-assigned this Sep 20, 2025
import numpy as np
import pytest
import torch
from einops import rearrange, reduce, repeat

There is a CI error about ModuleNotFoundError: No module named 'einops', likely because the CI docker image doesn't have einops installed.

In this case, could you check whether we can replace the einops operations with torch operations (so we can avoid depending on einops)? I assume that is doable. If it isn't, we can instead move this import into the functions that use these operations, so that einops is only imported when those functions are executed.
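
The suggested replacement is usually mechanical. A minimal sketch of the general einops-to-torch translations for rearrange, reduce, and repeat (the specific patterns used in the test file are assumptions here):

```python
import torch

x = torch.arange(24, dtype=torch.float32).reshape(2, 3, 4)

# rearrange(x, "b m n -> b (m n)")  ->  plain reshape
flat = x.reshape(2, 3 * 4)

# reduce(x, "b m n -> b m", "max")  ->  amax over the reduced dim
row_max = x.amax(dim=-1)

# repeat(x, "b m n -> b m (n k)", k=2)  ->  repeat_interleave
# ("(n k)" puts the copies adjacent, matching repeat_interleave semantics)
tiled = x.repeat_interleave(2, dim=-1)

print(flat.shape, row_max.shape, tiled.shape)
```

Permutations (e.g. rearrange(x, "b m n -> b n m")) map to torch.permute, so the einops import can typically be dropped entirely.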

@MasterJH5574 MasterJH5574 left a comment


LGTM. Thank you @neurusL for landing this!

@MasterJH5574 MasterJH5574 merged commit 4c82c71 into apache:main Sep 22, 2025
10 checks passed