
Conversation


@neurusL neurusL commented Sep 19, 2025

In tvm/python/tvm/relax/backend/cuda/flashinfer.py, added a gen_grouped_gemm_module function.
In tvm/tests/python/relax/test_group_gemm_flashinfer.py, added tests covering different combinations of:

  • input and output dtypes: ("float8_e4m3fn", "float8_e4m3fn", "bfloat16") and ("float8_e4m3fn", "float8_e4m3fn", "float16")
  • scale granularity of m, n, k: (1, 128, 128)
  • scale major mode: "MN", "K"
  • mma_sm: 1, 2
  • different batch sizes and m_sizes
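
Hypothetically, the test matrix above could be expressed with pytest parametrization along these lines (the parameter names and the test body are assumptions for illustration, not the PR's actual code):

```python
import pytest

# Dtype combinations listed in the PR description: (input, weight, output).
DTYPE_COMBOS = [
    ("float8_e4m3fn", "float8_e4m3fn", "bfloat16"),
    ("float8_e4m3fn", "float8_e4m3fn", "float16"),
]


@pytest.mark.parametrize("in_dtype,weight_dtype,out_dtype", DTYPE_COMBOS)
@pytest.mark.parametrize("scale_granularity", [(1, 128, 128)])
@pytest.mark.parametrize("scale_major_mode", ["MN", "K"])
@pytest.mark.parametrize("mma_sm", [1, 2])
def test_grouped_gemm(in_dtype, weight_dtype, out_dtype,
                      scale_granularity, scale_major_mode, mma_sm):
    # Each combination would build the FlashInfer grouped-GEMM module and
    # compare against a reference implementation (elided in this sketch).
    ...
```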

This PR replaces closed PR #18322
@MasterJH5574 please review the PR, thanks!

@MasterJH5574 MasterJH5574 self-assigned this Sep 20, 2025
import numpy as np
import pytest
import torch
from einops import rearrange, reduce, repeat

There is a CI error about ModuleNotFoundError: No module named 'einops', likely because the CI docker image doesn't have einops installed.

In this case, could you check whether we can replace the einops operations with torch operations (so we can avoid depending on einops)? I assume that is doable. If it isn't, we can instead move this import into the functions that use these operations, so that einops is only imported when those functions are executed.
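
The suggested replacement is usually mechanical. A minimal sketch of the general einops-to-torch translations for rearrange, reduce, and repeat (the specific patterns used in the test file are assumptions here):

```python
import torch

x = torch.arange(24, dtype=torch.float32).reshape(2, 3, 4)

# rearrange(x, "b m n -> b (m n)")  ->  plain reshape
flat = x.reshape(2, 3 * 4)

# reduce(x, "b m n -> b m", "max")  ->  amax over the reduced dim
row_max = x.amax(dim=-1)

# repeat(x, "b m n -> b m (n k)", k=2)  ->  repeat_interleave
# ("(n k)" puts the copies adjacent, matching repeat_interleave semantics)
tiled = x.repeat_interleave(2, dim=-1)

print(flat.shape, row_max.shape, tiled.shape)
```

Permutations (e.g. rearrange(x, "b m n -> b n m")) map to torch.permute, so the einops import can typically be dropped entirely.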

@MasterJH5574 MasterJH5574 left a comment


LGTM. Thank you @neurusL for landing this!

@MasterJH5574 MasterJH5574 merged commit 4c82c71 into apache:main Sep 22, 2025
10 checks passed