[ROCm][Bugfix] fix exception related to trust_remote_code for MiniMax-M2.1-MXFP4#37698
Conversation
Code Review
This pull request effectively addresses the bug related to trust_remote_code for Quark models by propagating the hf_config from ModelConfig down to maybe_update_config. This avoids re-fetching the configuration with a hardcoded trust_remote_code=False and also adds a performance improvement by skipping logic for non-applicable models. The signature changes across various quantization configs are correctly handled to maintain compatibility. I've added one comment regarding improving the robustness of dictionary access to prevent potential KeyError exceptions.
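The reviewer's point about dictionary access can be illustrated with a minimal, hypothetical snippet; the `quant_cfg` dict and its keys below are stand-ins for illustration, not the actual Quark config schema:

```python
# Stand-in config dict; the keys are illustrative, not the real
# Quark quantization config fields.
quant_cfg = {"quant_method": "quark"}

# Fragile: raises KeyError when "export" is absent.
# kv_group = quant_cfg["export"]["kv_cache_group"]

# Robust: chained .get() with defaults degrades gracefully to None.
kv_group = quant_cfg.get("export", {}).get("kv_cache_group")
print(kv_group)  # None
```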
Hi @hongxiayang, the pre-commit checks have failed. Please run:

uv pip install pre-commit>=4.5.1
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.
cc @dllehr-amd
… amd quark models Signed-off-by: Hongxia Yang <hongxiay.yang@amd.com>
Force-pushed from d86a7a3 to 30d7c71.
cc @tjtanaa
BowenBao left a comment:
LGTM, thanks for the fix, and restricting the dynamic quant for DS.
    revision: The revision of the model
Returns:
"""
# TODO: revision is never passed currently in vllm.py,
Yeah, it should be okay to drop revision.
cc @dllehr-amd
Will do in a follow-up PR.
cc @DarkLight1337 to help review and merge
It seems the failures are not related to this PR.
…-M2.1-MXFP4 (vllm-project#37698) Signed-off-by: Hongxia Yang <hongxiay.yang@amd.com> Co-authored-by: Hongxia Yang <hongxiay.yang@amd.com> Signed-off-by: neweyes <328719365@qq.com>
Purpose
Fix: #38307
Bug Fix: QuarkConfig.maybe_update_config
Problem: The original code called get_config() with a hardcoded trust_remote_code=False for every Quark model. For models whose configuration requires remote code (for example, amd/MiniMax-M2.1-MXFP4), this raised an exception instead of honoring the user's trust_remote_code setting.
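A minimal sketch of the failure mode and of the fix, using a stand-in loader rather than the real transformers/vLLM internals (all names below, including the "minimax_m2" model type, are illustrative assumptions):

```python
def load_config(model_id: str, trust_remote_code: bool) -> dict:
    """Stand-in for fetching an HF config; raises when the repo defines
    its config class in remote code but trust_remote_code is False."""
    requires_remote_code = {"amd/MiniMax-M2.1-MXFP4"}
    if model_id in requires_remote_code and not trust_remote_code:
        raise ValueError(f"{model_id} requires trust_remote_code=True")
    return {"model_type": "minimax_m2"}


def maybe_update_config_buggy(model_id: str) -> dict:
    # Buggy pattern: re-fetches with trust_remote_code hardcoded to
    # False, ignoring whatever the user passed on the command line.
    return load_config(model_id, trust_remote_code=False)


def maybe_update_config_fixed(hf_config: dict) -> dict:
    # Fixed pattern: reuse the hf_config that ModelConfig already
    # loaded with the user's trust_remote_code setting; no re-fetch.
    return hf_config
```

With this shape, the buggy path raises for a remote-code model while the fixed path simply reuses the already-loaded config.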
File Changes
vllm/model_executor/layers/quantization/quark/quark.py: Replaced the get_config() call with the pre-loaded hf_config from ModelConfig, so the configuration is not re-fetched. This also lets the user override trust_remote_code from the command line.
Added an early return for model types outside the _DEEPSEEK_V3_FAMILY_MODEL_TYPES frozenset.
vllm/model_executor/layers/quantization/base_config.py: Extended the base maybe_update_config signature to accept revision and **kwargs.
vllm.py: Passes hf_config, revision, and trust_remote_code from ModelConfig to maybe_update_config, allowing the user to specify trust_remote_code.
Other call sites were updated to match the signature change.
New Test
tests/quantization/test_quark_maybe_update_config.py: three tests using real HF configs, verifying that amd/MiniMax-M2.1-MXFP4 keeps trust_remote_code as False, that amd/DeepSeek-R1-MXFP4-ASQ enables it (True), and that a missing hf_config does not crash.
Test Result
root@node:/home/vllm/tests/quantization# pytest test_quark_maybe_update_config.py
==================================================== test session starts ====================================================
platform linux -- Python 3.12.12, pytest-9.0.2, pluggy-1.6.0
rootdir: /dockerx/vllm
configfile: pyproject.toml
plugins: asyncio-1.3.0, anyio-4.12.1
asyncio: mode=Mode.STRICT, debug=False, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function
collected 3 items
test_quark_maybe_update_config.py ... [100%]
=============================================== 3 passed, 2 warnings in 4.72s ===============================================
sys:1: DeprecationWarning: builtin type swigvarlink has no module attribute
Essential Elements of an Effective PR Description Checklist
supported_models.md and examples for a new model.