
Conversation

yewentao256 (Member) commented Oct 17, 2025

Purpose

Thanks for the previous effort from @bwasti!

Support DeepGEMM and the Blackwell architecture for batch-invariant mode.

Test

Tested on B200

export VLLM_USE_DEEP_GEMM=1
VLLM_TEST_MODEL=Qwen/Qwen3-30B-A3B-FP8 pytest test_batch_invariance.py -x

==================== 7 passed, 11 warnings in 296.96s (0:04:56) ====================
(vllm-user-6) vllm-user-6@centralia:~/vllm-source/tests/v1/generation$ pytest test_rms_norm_batch_invariant.py 
test_rms_norm_batch_invariant.py ........................................... [ 48%]
..............................................                               [100%]

================================ 89 passed in 8.59s ================================
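
For context, here is a minimal sketch of what the batch-invariance test checks (this is not test_batch_invariance.py itself; the prompt is a placeholder, and the model and DeepGEMM env var are taken from the commands above): with batch-invariant mode enabled, greedy decoding of a prompt must produce identical tokens whether it runs alone or inside a larger batch.

    # Minimal sketch, not the actual test. Assumes batch-invariant mode is
    # enabled the same way test_batch_invariance.py enables it; DeepGEMM is
    # turned on via the env var shown above.
    import os

    os.environ["VLLM_USE_DEEP_GEMM"] = "1"  # route FP8 GEMMs through DeepGEMM

    from vllm import LLM, SamplingParams

    llm = LLM(model="Qwen/Qwen3-30B-A3B-FP8")
    greedy = SamplingParams(temperature=0.0, max_tokens=32)

    prompt = "Explain batch invariance in one sentence."
    fillers = [f"Unrelated filler prompt {i}" for i in range(7)]

    # Decode the prompt alone (batch size 1) and mixed into a batch of 8.
    solo = llm.generate([prompt], greedy)[0].outputs[0].token_ids
    batched = llm.generate([prompt] + fillers, greedy)[0].outputs[0].token_ids

    # Batch invariance: the completion must match token for token.
    assert list(solo) == list(batched)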

Signed-off-by: yewentao256 <[email protected]>
@mergify mergify bot added the v1 label Oct 17, 2025
@yewentao256 yewentao256 added the ready ONLY add when PR is ready to merge/full CI is needed label Oct 17, 2025
@chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.


@yewentao256 yewentao256 marked this pull request as draft October 17, 2025 21:41
Signed-off-by: yewentao256 <[email protected]>
Signed-off-by: yewentao256 <[email protected]>
@yewentao256 yewentao256 self-assigned this Oct 17, 2025
@yewentao256 yewentao256 marked this pull request as ready for review October 17, 2025 22:33
@chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.


@yewentao256 yewentao256 merged commit 245e4f2 into main Oct 18, 2025
55 checks passed
@yewentao256 yewentao256 deleted the wentao-blackwell-deepgemm-support-for-batch-invariant branch October 18, 2025 13:28
Comment on lines +553 to +557
q_input, input_scale = QuantFP8(
    False,
    self.act_q_group_shape,
    column_major_scales=True,
)(x)
A reviewer (Member) commented:
I don't think this is the intended way to use QuantFP8. Is there a reason why this cannot be put in the shared block w8a8 utils so it can be reused for other backends like compressed tensors?
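
For reference, a rough sketch of the refactor this comment points at (the helper name, module path, and caching strategy are assumptions for illustration, not existing vLLM code): hoist the inline QuantFP8 construction into a shared w8a8 utility so backends such as compressed-tensors could reuse the same activation-quant path.

    # Hypothetical shared w8a8 helper, sketched for illustration only.
    from functools import lru_cache

    import torch

    # Import path assumed from where QuantFP8 currently lives.
    from vllm.model_executor.layers.quantization.input_quant_fp8 import QuantFP8


    @lru_cache(maxsize=None)
    def _get_act_quant_fp8(group_shape, column_major_scales: bool) -> QuantFP8:
        # Build the dynamic (static=False) activation quantizer once per
        # (group_shape, scale-layout) pair instead of on every forward pass.
        return QuantFP8(False, group_shape, column_major_scales=column_major_scales)


    def quantize_w8a8_activation(
        x: torch.Tensor, group_shape, column_major_scales: bool = True
    ) -> tuple[torch.Tensor, torch.Tensor]:
        # Returns (q_input, input_scale), matching the call site quoted above.
        return _get_act_quant_fp8(group_shape, column_major_scales)(x)

With such a helper, the call site in the diff would reduce to something like q_input, input_scale = quantize_w8a8_activation(x, self.act_q_group_shape).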

yewentao256 (Member, Author) replied:
Let's talk offline

lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025
adabeyta pushed a commit to adabeyta/vllm that referenced this pull request Oct 20, 2025
albertoperdomo2 pushed a commit to albertoperdomo2/vllm that referenced this pull request Oct 23, 2025
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025
0xrushi pushed a commit to 0xrushi/vllm that referenced this pull request Oct 26, 2025
0xrushi pushed a commit to 0xrushi/vllm that referenced this pull request Oct 26, 2025
ilmarkov pushed a commit to neuralmagic/vllm that referenced this pull request Nov 7, 2025
rtourgeman pushed a commit to rtourgeman/vllm that referenced this pull request Nov 10, 2025
Zhathw pushed a commit to Zhathw/vllm that referenced this pull request Nov 12, 2025
devpatelio pushed a commit to SumanthRH/vllm that referenced this pull request Nov 29, 2025
