
Conversation

yewentao256 (Member) commented Oct 17, 2025

Purpose

Thanks for the previous effort from @bwasti!

Support DeepGEMM and the Blackwell architecture for batch-invariant mode.

Test

Tested on B200

export VLLM_USE_DEEP_GEMM=1
VLLM_TEST_MODEL=Qwen/Qwen3-30B-A3B-FP8 pytest test_batch_invariance.py -x

==================== 7 passed, 11 warnings in 296.96s (0:04:56) ====================
(vllm-user-6) vllm-user-6@centralia:~/vllm-source/tests/v1/generation$ pytest test_rms_norm_batch_invariant.py 
test_rms_norm_batch_invariant.py ........................................... [ 48%]
..............................................                               [100%]

================================ 89 passed in 8.59s ================================
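
For context, here is a minimal sketch of what the batch-invariance test checks (this is not test_batch_invariance.py itself; the prompt is a placeholder, and the model and DeepGEMM env var are taken from the commands above): with batch-invariant mode enabled, greedy decoding of a prompt must produce identical tokens whether it runs alone or inside a larger batch.

    # Minimal sketch, not the actual test. Assumes batch-invariant mode is
    # enabled the same way test_batch_invariance.py enables it; DeepGEMM is
    # turned on via the env var shown above.
    import os

    os.environ["VLLM_USE_DEEP_GEMM"] = "1"  # route FP8 GEMMs through DeepGEMM

    from vllm import LLM, SamplingParams

    llm = LLM(model="Qwen/Qwen3-30B-A3B-FP8")
    greedy = SamplingParams(temperature=0.0, max_tokens=32)

    prompt = "Explain batch invariance in one sentence."
    fillers = [f"Unrelated filler prompt {i}" for i in range(7)]

    # Decode the prompt alone (batch size 1) and mixed into a batch of 8.
    solo = llm.generate([prompt], greedy)[0].outputs[0].token_ids
    batched = llm.generate([prompt] + fillers, greedy)[0].outputs[0].token_ids

    # Batch invariance: the completion must match token for token.
    assert list(solo) == list(batched)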

Signed-off-by: yewentao256 <[email protected]>
@mergify mergify bot added the v1 label Oct 17, 2025
@yewentao256 yewentao256 added the ready ONLY add when PR is ready to merge/full CI is needed label Oct 17, 2025
@chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.


@yewentao256 yewentao256 marked this pull request as draft October 17, 2025 21:41
Signed-off-by: yewentao256 <[email protected]>
Signed-off-by: yewentao256 <[email protected]>
@yewentao256 yewentao256 self-assigned this Oct 17, 2025
@yewentao256 yewentao256 marked this pull request as ready for review October 17, 2025 22:33
@chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.


@yewentao256 yewentao256 merged commit 245e4f2 into main Oct 18, 2025
55 checks passed
@yewentao256 yewentao256 deleted the wentao-blackwell-deepgemm-support-for-batch-invariant branch October 18, 2025 13:28
Comment on lines +553 to +557
q_input, input_scale = QuantFP8(
    False,
    self.act_q_group_shape,
    column_major_scales=True,
)(x)
A reviewer (Member) commented:
I don't think this is the intended way to use QuantFP8. Is there a reason why this cannot be put in the shared block w8a8 utils so it can be reused for other backends like compressed tensors?
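
For reference, a rough sketch of the refactor this comment points at (the helper name, module path, and caching strategy are assumptions for illustration, not existing vLLM code): hoist the inline QuantFP8 construction into a shared w8a8 utility so backends such as compressed-tensors could reuse the same activation-quant path.

    # Hypothetical shared w8a8 helper, sketched for illustration only.
    from functools import lru_cache

    import torch

    # Import path assumed from where QuantFP8 currently lives.
    from vllm.model_executor.layers.quantization.input_quant_fp8 import QuantFP8


    @lru_cache(maxsize=None)
    def _get_act_quant_fp8(group_shape, column_major_scales: bool) -> QuantFP8:
        # Build the dynamic (static=False) activation quantizer once per
        # (group_shape, scale-layout) pair instead of on every forward pass.
        return QuantFP8(False, group_shape, column_major_scales=column_major_scales)


    def quantize_w8a8_activation(
        x: torch.Tensor, group_shape, column_major_scales: bool = True
    ) -> tuple[torch.Tensor, torch.Tensor]:
        # Returns (q_input, input_scale), matching the call site quoted above.
        return _get_act_quant_fp8(group_shape, column_major_scales)(x)

With such a helper, the call site in the diff would reduce to something like q_input, input_scale = quantize_w8a8_activation(x, self.act_q_group_shape).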

yewentao256 (Member, Author) replied:
Let's talk offline

lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025
adabeyta pushed a commit to adabeyta/vllm that referenced this pull request Oct 20, 2025
albertoperdomo2 pushed a commit to albertoperdomo2/vllm that referenced this pull request Oct 23, 2025
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025
0xrushi pushed a commit to 0xrushi/vllm that referenced this pull request Oct 26, 2025
0xrushi pushed a commit to 0xrushi/vllm that referenced this pull request Oct 26, 2025
ilmarkov pushed a commit to neuralmagic/vllm that referenced this pull request Nov 7, 2025
rtourgeman pushed a commit to rtourgeman/vllm that referenced this pull request Nov 10, 2025
Zhathw pushed a commit to Zhathw/vllm that referenced this pull request Nov 12, 2025
devpatelio pushed a commit to SumanthRH/vllm that referenced this pull request Nov 29, 2025
