
docs: Fix incorrect column-major scale layout in FP8 GEMM docstrings#2614

Open
bledden wants to merge 2 commits into flashinfer-ai:main from bledden:fix/fp8-scale-docstring-layout

Conversation


@bledden bledden commented Feb 21, 2026

Summary

Fixes the a_scale parameter docstrings in three FP8 GEMM functions, which incorrectly described the scale tensor layout as "Column-major"; the kernel actually expects standard contiguous (row-major) tensors.

Functions fixed:

  • gemm_fp8_nt_groupwise
  • group_gemm_fp8_nt_groupwise
  • group_gemm_mxfp8_mxfp4_nt_groupwise

Validation

I verified the correct layout by cross-referencing three sources:

  1. quantize_fp8 in flashinfer/testing/utils.py — produces standard contiguous PyTorch tensors (row-major) for the scales; the x_scale it returns is never transposed.

  2. Test suite (tests/gemm/test_groupwise_scaled_gemm_fp8.py) — creates a_scale via quantize_fp8() and passes the resulting contiguous tensor directly to the GEMM functions without any transposition. These tests pass, confirming row-major is the correct layout.

  3. Existing b_scale docs — already correctly say "Row-major scale tensor for b" in the same docstrings. The a_scale description was the only inconsistency.
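For readers unfamiliar with the distinction the PR corrects, the difference between row-major (C-contiguous) and column-major (Fortran-order) layout can be sketched in plain NumPy. The shapes and block size below are illustrative only and are not tied to flashinfer's actual API:

```python
import numpy as np

# Illustrative sizes only; not tied to flashinfer's actual API.
m, k, block_size = 128, 256, 32

# A blockwise scale tensor as a standard C-contiguous (row-major) array,
# the layout quantize_fp8-style helpers return and the kernels expect.
a_scale = np.ones((m, k // block_size), dtype=np.float32)
assert a_scale.flags["C_CONTIGUOUS"]  # row-major: the last axis is densest

# The same values in column-major (Fortran) order have different strides;
# this is the layout the old docstring incorrectly implied.
a_scale_col = np.asfortranarray(a_scale)
assert a_scale_col.flags["F_CONTIGUOUS"]
assert a_scale.strides != a_scale_col.strides  # (32, 4) vs (4, 512)
```

Passing the Fortran-ordered variant where a kernel assumes C-contiguous memory would silently read the wrong elements, which is why the docstring wording matters.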

Fixes #2147

Summary by CodeRabbit

  • Documentation
    • Clarified docstrings for FP8/FP4 groupwise GEMM routines: a_scale and b_scale are Row-major when scale_major_mode is "K"; tensor shape conventions otherwise remain unchanged. No functional behavior changed.

… docstrings

The a_scale parameter docstrings in gemm_fp8_nt_groupwise,
group_gemm_fp8_nt_groupwise, and group_gemm_mxfp8_mxfp4_nt_groupwise
incorrectly described the scale tensor as "Column-major". The kernel
actually expects standard contiguous (row-major) tensors, consistent
with what quantize_fp8 produces and the test suite passes.

Changed "Column-major" to "Row-major" in all three a_scale descriptions
to match the b_scale docs, which already correctly say "Row-major".

Fixes flashinfer-ai#2147

Signed-off-by: Blake Ledden <bledden@users.noreply.github.com>
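The behavior the commit message relies on (a quantizer returning contiguous, row-major scales) can be illustrated with a toy blockwise quantizer. This is a minimal sketch under stated assumptions, not flashinfer's actual quantize_fp8, whose signature, dtypes, and FP8 numerics differ:

```python
import numpy as np

def blockwise_quantize(x: np.ndarray, block_size: int = 32):
    """Toy blockwise quantizer: one scale per block along the K axis.

    Illustrative only; flashinfer's quantize_fp8 quantizes to real
    FP8 dtypes and has a different signature.
    """
    m, k = x.shape
    blocks = x.reshape(m, k // block_size, block_size)
    # Per-block max-abs scale (448 is the FP8 E4M3 max), floored to
    # avoid division by zero. Reductions over the trailing axis of a
    # C-contiguous array yield a C-contiguous (row-major) result.
    scale = np.maximum(np.abs(blocks).max(axis=-1) / 448.0, 1e-12)
    xq = blocks / scale[..., None]  # "quantized" values (toy)
    return xq.reshape(m, k), scale

x = np.random.default_rng(0).standard_normal((64, 128)).astype(np.float32)
xq, x_scale = blockwise_quantize(x)
# The scale comes out row-major, matching the corrected docstrings.
assert x_scale.flags["C_CONTIGUOUS"]
assert x_scale.shape == (64, 128 // 32)
```

The point of the sketch is that nothing in a straightforward quantization path produces a column-major scale; a caller would have to transpose explicitly, which neither the utility nor the tests do.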
@gemini-code-assist
Contributor

Summary of Changes

Hello @bledden, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses an inconsistency in the documentation for several FP8 GEMM functions. It rectifies the description of the a_scale parameter, clarifying that it expects a row-major tensor rather than a column-major one. This correction ensures that users correctly understand the required input format, preventing potential issues arising from misinterpretation of the docstrings.

Highlights

  • Docstring Correction for FP8 GEMM a_scale: Updated the a_scale parameter docstrings in gemm_fp8_nt_groupwise, group_gemm_fp8_nt_groupwise, and group_gemm_mxfp8_mxfp4_nt_groupwise functions from "Column-major" to "Row-major" to accurately reflect the expected tensor layout.


Changelog
  • flashinfer/gemm/gemm_base.py
    • Updated the a_scale docstring in gemm_fp8_nt_groupwise to specify "Row-major" instead of "Column-major".
    • Updated the a_scale docstring in group_gemm_fp8_nt_groupwise to specify "Row-major" instead of "Column-major".
    • Updated the a_scale docstring in group_gemm_mxfp8_mxfp4_nt_groupwise to specify "Row-major" instead of "Column-major".

@coderabbitai
Contributor

coderabbitai bot commented Feb 21, 2026

No actionable comments were generated in the recent review. 🎉


📝 Walkthrough

Walkthrough

Docstring corrections: updated scale-tensor memory-layout descriptions for several FP8/FP4 groupwise GEMM functions in flashinfer/gemm/gemm_base.py to specify Row-major layout when scale_major_mode is "K". No executable code changed.

Changes

  • flashinfer/gemm/gemm_base.py (FP8/FP4 groupwise GEMM docstrings): Updated docstrings for gemm_fp8_nt_groupwise, group_gemm_fp8_nt_groupwise, and group_gemm_mxfp8_mxfp4_nt_groupwise to describe a_scale/b_scale as Row-major when scale_major_mode is "K"; shape descriptions for other modes are preserved. No logic changes.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes


Suggested reviewers

  • nvmbreughe
  • cyx-6
  • yzh119

Poem

🐰 I hopped through docstrings, tidy and spry,
Row-major scales now catch the right eye.
Three functions aligned, no code to rewrite,
Just clearer directions — precise and polite. 🥕

🚥 Pre-merge checks: ✅ 5 passed
  • Title check: ✅ Passed. The title accurately describes the primary change: fixing incorrect docstring documentation for column-major scale layouts in FP8 GEMM functions.
  • Description check: ✅ Passed. The PR description comprehensively explains the fix, the validation methodology across three sources, the affected functions, and the link to the related issue #2147.
  • Linked Issues check: ✅ Passed. The PR addresses issue #2147 by correcting a_scale docstrings from column-major to row-major, matching the implementation and test behavior verified against the quantize_fp8 utility and test suite.
  • Out of Scope Changes check: ✅ Passed. All changes are scoped to docstring corrections in FP8 GEMM functions; no unrelated code modifications or functional changes are present.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 100.00%, which exceeds the required threshold of 80.00%.



Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

The pull request correctly addresses a documentation inaccuracy in the FP8 GEMM functions. The a_scale parameter was previously described as having a column-major layout, which is incorrect as the underlying kernels expect standard row-major (contiguous) PyTorch tensors. This fix ensures that the documentation is consistent with the implementation and the b_scale parameter description. The changes are applied to gemm_fp8_nt_groupwise, group_gemm_fp8_nt_groupwise, and group_gemm_mxfp8_mxfp4_nt_groupwise.

Contributor

@coderabbitai coderabbitai bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
flashinfer/gemm/gemm_base.py (1)

4213-4215: ⚠️ Potential issue | 🟡 Minor

Pre-existing typo in adjacent b_scale docstring: scale_major_k should be scale_major_mode.

While not introduced by this PR, fixing it here keeps the docstring fully consistent since scale_major_k is not a valid parameter name.

📝 Proposed fix
-        Row-major scale tensor for b, shape ``(n // block_size, k // block_size)`` if scale_major_k is ``K``
+        Row-major scale tensor for b, shape ``(n // block_size, k // block_size)`` if scale_major_mode is ``K``

Per CodeRabbit feedback, the adjacent b_scale docstring had
scale_major_k instead of scale_major_mode — fixing while I'm
already editing these docstrings.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>


Development

Successfully merging this pull request may close these issues.

[DOC] FlashInfer Blockwise FP8 Scale Layout Issue

1 participant