[Bugfix] Fix Quant Type Descriptor for Weights by tjtanaa · Pull Request #32702 · vllm-project/vllm

tjtanaa · 2026-01-20T16:56:37Z

Purpose

The weight descriptor assigned in PR #27814 is not semantically correct. This PR introduces the GroupShape.CHANNEL and kFp8StaticChannelSym descriptors, following PR #32414.
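For readers unfamiliar with group-shape descriptors, here is a minimal sketch of how a (row, col) group shape could map to the shape of the scale tensor, assuming -1 means "span the whole dimension". The GroupShape class below is a hypothetical stand-in, not the vLLM class, and the names scale_shape, ONE_ROW and ONE_COLUMN are made up for illustration.

```python
from typing import NamedTuple


class GroupShape(NamedTuple):
    """Hypothetical stand-in: how many rows and columns share one scale."""
    row: int
    col: int


def scale_shape(input_shape: tuple[int, int], group: GroupShape) -> tuple[int, int]:
    """Return the shape of the scale tensor for a 2-D input.

    A group extent of -1 is assumed to mean "span the whole dimension".
    """
    rows, cols = input_shape
    group_rows = rows if group.row == -1 else group.row
    group_cols = cols if group.col == -1 else group.col
    if rows % group_rows or cols % group_cols:
        raise ValueError("input shape must be divisible by the group shape")
    return rows // group_rows, cols // group_cols


# Illustrative descriptors (names are assumptions, not vLLM constants).
PER_TENSOR = GroupShape(-1, -1)  # a single scale for the whole tensor
ONE_ROW = GroupShape(1, -1)      # one scale per row
ONE_COLUMN = GroupShape(-1, 1)   # one scale per column

print(scale_shape((4, 8), PER_TENSOR))
print(scale_shape((4, 8), ONE_ROW))
print(scale_shape((4, 8), ONE_COLUMN))
```

Under these assumptions, whether a given descriptor reads as "per token" or "per channel" depends on what the rows of the tensor represent, which is the semantic question this PR is about.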

Test Plan

Pass CI

Test Result

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan, such as providing test command.
The test results, such as pasting the results comparison before and after, or e2e results
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
(Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

gemini-code-assist

Code Review

This pull request effectively addresses the bug related to the incorrect quantization type descriptor for weights. The introduction of GroupShape.PER_CHANNEL and its corresponding constant kFp8StaticChannelSym correctly aligns the weight descriptor with channel-wise quantization. The refactoring of the ScaleDesc.__str__ method also improves code readability and maintainability. The changes are well-contained and directly resolve the identified semantic issue.

mergify · 2026-01-21T15:25:13Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @tjtanaa.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

ProExpertProg

This looks good but maybe we should unify these into per-row?

ProExpertProg · 2026-01-22T12:45:44Z

vllm/model_executor/layers/quantization/utils/quant_utils.py

+# Input shape is in (M, K)
+# Descriptor for weights that are quantized per token
 GroupShape.PER_TOKEN = GroupShape(1, -1)
-GroupShape.PER_CHANNEL = GroupShape(-1, 1)


Maybe we need to just call this one per-row for both? Wdyt @LucasWilkinson @mgoin
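To illustrate the per-row suggestion, a rough sketch under two assumptions: activations are laid out as (M, K) with one token per row, and weights are laid out as (N, K) with one output channel per row. The helper per_row_scales is hypothetical and not part of vLLM; it only shows that the same row-wise rule would yield per-token scales for activations and per-channel scales for weights.

```python
FP8_E4M3_MAX = 448.0  # largest finite float8_e4m3fn value


def per_row_scales(matrix: list[list[float]]) -> list[float]:
    """One symmetric scale per row: max(|x|) / FP8_E4M3_MAX."""
    return [max(abs(v) for v in row) / FP8_E4M3_MAX for row in matrix]


# Activation of shape (M=2, K=3): each row is a token.
activation = [[1.0, -2.0, 0.5], [4.0, 0.0, -8.0]]

# Weight assumed to be stored as (N=2, K=3): each row is an output channel.
weight = [[0.25, -0.5, 0.125], [-448.0, 224.0, 112.0]]

token_scales = per_row_scales(activation)   # one scale per token
channel_scales = per_row_scales(weight)     # one scale per output channel
```

If the weight were stored transposed as (K, N), the same per-channel scales would instead require a column-wise group, so a single "per-row" name only works when the layout assumption holds.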

tjtanaa added 2 commits January 20, 2026 16:49

fix the semantics, weight descriptor should use the term channel rather than token

34bbc4a

Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>

add is_per_channel method

7ead378

Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>

tjtanaa requested review from mgoin, pavanimajety, robertgshaw2-redhat, tlrmchlsmth and yewentao256 as code owners January 20, 2026 16:56

tjtanaa added the rocm (Related to AMD ROCm) and ready (ONLY add when PR is ready to merge/full CI is needed) labels Jan 20, 2026

mergify bot added the bug (Something isn't working) label Jan 20, 2026

gemini-code-assist bot reviewed Jan 20, 2026

mergify bot added the needs-rebase label Jan 21, 2026

rebase

215f1e5

Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>

mergify bot removed the needs-rebase label Jan 21, 2026

tjtanaa added 3 commits January 22, 2026 07:05

Merge remote-tracking branch 'origin/main' into fix-weight-descriptor

307c196

fix is_per_channel check condition and add documentation

8783e9d

Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>

add back quant type

834a467

Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>

ProExpertProg reviewed Jan 22, 2026

tjtanaa wants to merge 6 commits into vllm-project:main from EmbeddedLLM:fix-weight-descriptor

2 participants
