[BUGFIX] Pixtral cannot be loaded with --limit-mm-per-prompt 0 (#33406)

Merged
vllm-bot merged 1 commit into vllm-project:main from
juliendenize:juliendenize/fix_pixtral_staged_vision_loading
Jan 30, 2026

Conversation

@juliendenize
Contributor

@juliendenize juliendenize commented Jan 30, 2026

Purpose

This PR fixes loading of the Pixtral model when --limit-mm-per-prompt is set to 0 for images.

In that case, the vision part is no longer None but a StageMissingLayer. However, since the vision weights still exist in the checkpoint, the weight-loading functions try to load them into a non-existent module.

Fixes #32959 in place of #33006 and #33008.

Thanks @dbary for informing me that the error was still present. I believe this PR should also be used for #33174. LMK if it works out for you 😄

Test Plan

I verified that mistralai/Mistral-Large-3-675B-Instruct-2512 and mistralai/Devstral-Small-2-24B-Instruct-2512 load correctly with --limit-mm-per-prompt set to 0 and to 1 for images.

Test Result

Model is successfully loaded.


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results.

Signed-off-by: juliendenize <julien.denize@mistral.ai>
@mergify mergify bot added the bug Something isn't working label Jan 30, 2026
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request addresses a bug where the Pixtral model fails to load when multi-modal features are disabled via --limit-mm-per-prompt 0. The root cause is that vision-related modules are replaced by StageMissingLayer placeholders, but the weight loading logic only checked for None, leading to attempts to load weights into non-existent modules.

The fix introduces a helper function _is_layer_none_or_staged to correctly check if a layer is either None or a StageMissingLayer placeholder. This check is then applied in the load_weights method for all vision-related components (vision_encoder, patch_merger, pre_mm_projector_norm, vision_language_adapter), ensuring that weight loading is skipped for these components when they are not active.

The changes are correct, well-targeted, and effectively resolve the described bug. The use of a helper function for the check is good practice and keeps the code clean.

@vllm-bot vllm-bot merged commit 8e2ad97 into vllm-project:main Jan 30, 2026
11 of 12 checks passed
@dbari
Contributor

dbari commented Jan 30, 2026

@juliendenize good catch, it works for me, thanks!

varun-sundar-rabindranath pushed a commit to tlrmchlsmth/vllm that referenced this pull request Feb 2, 2026
PiratePai pushed a commit to PiratePai/epd_shm that referenced this pull request Feb 3, 2026
…project#33406)

Signed-off-by: juliendenize <julien.denize@mistral.ai>
Signed-off-by: Pai <416932041@qq.com>
varun-sundar-rabindranath added a commit to tlrmchlsmth/vllm that referenced this pull request Feb 6, 2026
…project#33406) (#28)

Signed-off-by: juliendenize <julien.denize@mistral.ai>
Co-authored-by: Julien Denize <40604584+juliendenize@users.noreply.github.com>
ItzDEXX pushed a commit to ItzDEXX/vllm that referenced this pull request Feb 19, 2026
Labels

bug Something isn't working

Development

Successfully merging this pull request may close these issues.

[Bug]: KeyError: merging_layer.weight when loading Mistral/vision-enabled checkpoints after PR #32780 refactor

4 participants