[Bugfix] Accept canonicalized modelopt_* quant_method in _extract_modelopt_quant_algo#42181

Merged
yewentao256 merged 1 commit into vllm-project:main from vadiklyutiy:modelopt-prefix-fix on May 11, 2026

Conversation

@vadiklyutiy

Member

ModelArchConfigConvertorBase._normalize_quantization_config rewrites quant_method to the family-specific name (e.g. "modelopt_fp4" for the legacy hf_quant_config.json shape with quant_algo: "NVFP4"). _extract_modelopt_quant_algo then strict-equals against "modelopt" and returns None, so the override loop in ModelConfig._verify_quantization finds no match and validation raises:

Quantization method specified in the model config (modelopt_fp4) does not match the quantization method specified in the quantization argument (modelopt).
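
The failure path described above can be sketched in a few lines. This is a simplified illustration, not vLLM's code: `normalize_quant_method`, `extract_quant_algo_strict`, and `verify` are hypothetical stand-ins for `_normalize_quantization_config`, the pre-fix `_extract_modelopt_quant_algo`, and the check in `ModelConfig._verify_quantization`. The flat config dict and the single-entry mapping are assumptions made for the example.

```python
from typing import Optional

# Assumed mapping for illustration; only the NVFP4 pairing is taken from the PR text.
ASSUMED_FAMILY_NAMES = {"NVFP4": "modelopt_fp4"}


def normalize_quant_method(cfg: dict) -> dict:
    """Hypothetical stand-in: rewrite quant_method to a family-specific name."""
    out = dict(cfg)
    family = ASSUMED_FAMILY_NAMES.get(out.get("quant_algo"))
    if family is not None:
        out["quant_method"] = family
    return out


def extract_quant_algo_strict(cfg: dict) -> Optional[str]:
    """Pre-fix behavior as the PR describes it: strict equality on 'modelopt'."""
    if cfg.get("quant_method") != "modelopt":
        return None
    return cfg.get("quant_algo")


def verify(cfg: dict, requested: str) -> str:
    """Hypothetical stand-in: raise when nothing matched and the names differ."""
    method = cfg["quant_method"]
    if extract_quant_algo_strict(cfg) is None and method != requested:
        raise ValueError(
            f"Quantization method specified in the model config ({method}) "
            "does not match the quantization method specified in the "
            f"quantization argument ({requested})."
        )
    return method


# Simplified legacy-style config: normalization yields "modelopt_fp4".
legacy = normalize_quant_method({"quant_algo": "NVFP4"})
try:
    verify(legacy, "modelopt")
    error = None
except ValueError as exc:
    error = str(exc)
```

In this sketch the normalized name no longer equals "modelopt", so the strict extractor returns None and the mismatch error is raised.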

This was hitting nvidia/Qwen3.5-397B-A17B-NVFP4 with --quantization modelopt intermittently: one of several spawn-context API-server processes resolved the legacy hf_quant_config.json instead of the modern config.json, and the strict check then failed validation.

Replace the equality with startswith("modelopt"). Covers all four registered variants (modelopt, modelopt_fp4, modelopt_mxfp8, modelopt_mixed) and matches what humming.py (config["quant_method"] in [..., "modelopt"]) and utils/torch_utils.py:315 (quant_method.startswith("modelopt")) already accept.
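
A minimal sketch of the prefix check, assuming a flat config dict with `quant_method` and `quant_algo` keys. The function name `extract_quant_algo_prefix` and the dict shape are hypothetical; the real `_extract_modelopt_quant_algo` in vllm/model_executor/layers/quantization/modelopt.py may read its input differently.

```python
from typing import Optional

# The four variants listed in the PR description.
REGISTERED_VARIANTS = ("modelopt", "modelopt_fp4", "modelopt_mxfp8", "modelopt_mixed")


def extract_quant_algo_prefix(cfg: dict) -> Optional[str]:
    """Return quant_algo when quant_method belongs to the modelopt family.

    Hypothetical helper: only the startswith("modelopt") test comes from the PR.
    """
    quant_method = cfg.get("quant_method")
    if not isinstance(quant_method, str) or not quant_method.startswith("modelopt"):
        return None
    return cfg.get("quant_algo")


# Every registered variant now passes the family check.
results = {
    method: extract_quant_algo_prefix({"quant_method": method, "quant_algo": "NVFP4"})
    for method in REGISTERED_VARIANTS
}

# A method outside the family is still rejected.
other = extract_quant_algo_prefix({"quant_method": "fp8", "quant_algo": "FP8"})
```

A prefix test also accepts any other string that begins with "modelopt", so it trades some strictness for tolerance of canonicalized names.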

Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com>

@claude claude Bot left a comment

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@mergify mergify Bot added the "bug" label (Something isn't working) on May 9, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

The pull request updates the _extract_modelopt_quant_algo function in vllm/model_executor/layers/quantization/modelopt.py to check if the quant_method starts with "modelopt" instead of requiring an exact match. This change allows for more flexible quantization method naming. I have no feedback to provide as no review comments were present.

@yewentao256 yewentao256 left a comment

Member

LGTM, thanks for the work!

@yewentao256 yewentao256 added the "ready" label (ONLY add when PR is ready to merge/full CI is needed) on May 9, 2026
@yewentao256 yewentao256 merged commit a2e776d into vllm-project:main May 11, 2026
76 checks passed
weifang231 pushed a commit to weifang231/eb-vllm that referenced this pull request May 13, 2026
…modelopt_quant_algo` (vllm-project#42181)

Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com>
mfylcek pushed a commit to mfylcek/vllm that referenced this pull request May 19, 2026
…modelopt_quant_algo` (vllm-project#42181)

Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com>
jhu960213 pushed a commit to jhu960213/vllm that referenced this pull request May 20, 2026
…modelopt_quant_algo` (vllm-project#42181)

Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com>
h1t35h pushed a commit to h1t35h/vllm that referenced this pull request May 21, 2026
…modelopt_quant_algo` (vllm-project#42181)

Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com>
mvanhorn pushed a commit to mvanhorn/vllm that referenced this pull request Jun 4, 2026
…modelopt_quant_algo` (vllm-project#42181)

Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com>
Signed-off-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>
knight0528 pushed a commit to knight0528/vllm that referenced this pull request Jun 8, 2026
…modelopt_quant_algo` (vllm-project#42181)

Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com>