[Model] Register Qwen3_5ForCausalLM and Qwen3_5MoeForCausalLM for text-only checkpoints#36289
aminsamir45 wants to merge 4 commits into vllm-project:main
Conversation
Force-pushed from d02d404 to 69c39ca
Code Review
This pull request correctly registers Qwen3_5ForCausalLM and Qwen3_5MoeForCausalLM to enable loading text-only Qwen3.5 checkpoints. The change is straightforward and addresses the described issue. However, it appears to be missing a corresponding update to the test registry, which is a required step for adding new model architectures to ensure proper test coverage.
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default; only a small subset of CI tests runs automatically, and you can ask your reviewers to trigger select CI tests on top of that. Once the PR is approved and ready to go, your PR reviewer(s) can run the full CI to test the changes comprehensively before merging. If you have any questions, please reach out to us on Slack at https://slack.vllm.ai. 🚀
…l registry

The Qwen3_5ForCausalLM and Qwen3_5MoeForCausalLM classes already exist in qwen3_5.py but are not registered in the model registry, so vLLM cannot load text-only Qwen3.5 checkpoints. When a text-only checkpoint (e.g. Qwen3.5-4B) specifies architectures: ["Qwen3_5ForCausalLM"], the registry lookup fails and vLLM falls through to the VLM class Qwen3_5ForConditionalGeneration, which expects weights under language_model.model.layers.* instead of model.layers.*, causing a hard weight mismatch.

Changes:
- Add Qwen3_5ForCausalLM and Qwen3_5MoeForCausalLM to _TEXT_GENERATION_MODELS in registry.py
- Add the IsHybrid mixin and get_mamba_state_dtype_from_config, get_mamba_state_shape_from_config, and get_mamba_state_copy_func to Qwen3_5ForCausalLMBase so standalone text-only usage correctly configures the Gated DeltaNet state cache
- Add both architectures to MODELS_CONFIG_MAP in config.py so mamba_ssm_cache_dtype is auto-configured from the HF config
- Add test registry entries for CI coverage

Fixes vllm-project#36275

Signed-off-by: Samir Amin <aminsamir45@gmail.com>
Made-with: Cursor
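The weight mismatch described in the commit message can be illustrated in isolation. The sketch below uses a couple of representative parameter names (not an exhaustive list, and not vLLM's actual loader code) to show why every key the VLM class expects comes up missing when it is handed a text-only checkpoint:

```python
# Illustrative only: why a text-only Qwen3.5 checkpoint cannot be loaded
# through the VLM class. The VLM wrapper expects language-model weights
# under a "language_model." prefix, while a text-only checkpoint stores
# them at the top level. Key names are representative examples.

VLM_EXPECTED_KEYS = {
    "language_model.model.layers.0.self_attn.q_proj.weight",
    "language_model.model.embed_tokens.weight",
}

TEXT_ONLY_CHECKPOINT_KEYS = {
    "model.layers.0.self_attn.q_proj.weight",
    "model.embed_tokens.weight",
}

def missing_keys(expected: set, checkpoint: set) -> set:
    """Keys the model expects but the checkpoint does not provide."""
    return expected - checkpoint

# Every expected key is missing: a hard weight mismatch at load time.
print(sorted(missing_keys(VLM_EXPECTED_KEYS, TEXT_ONLY_CHECKPOINT_KEYS)))
```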
Force-pushed from 69c39ca to efb2d60
This is not true - afaik all Qwen3.5 checkpoints are natively multimodal: https://huggingface.co/Qwen/Qwen3.5-4B/blob/main/config.json#L3 Do you have a text-only checkpoint that's available on huggingface? Adding them into the list of model architectures will also require public checkpoints with
tests/models/registry.py (Outdated)

    "Qwen3_5ForCausalLM": _HfExamplesInfo(
        "Qwen/Qwen3.5-0.8B",
        max_model_len=4096,
    ),
    "Qwen3_5MoeForCausalLM": _HfExamplesInfo(
        "Qwen/Qwen3.5-35B-A3B",
        max_model_len=4096,
    ),
Cursor is just wrong here... both models are registered as XXXForConditionalGeneration
The example models (Qwen/Qwen3.5-0.8B, Qwen/Qwen3.5-35B-A3B) are VLMs that use Qwen3_5ForConditionalGeneration, not Qwen3_5ForCausalLM. The test entries never exercised the text-only code path.

For text-only checkpoints produced by fine-tuning a Qwen3.5 VLM with AutoModelForCausalLM, the officially supported path is to load via the VLM class with language_model_only=True, which skips the vision pipeline and loads only the LM backbone.

Signed-off-by: Samir Amin <aminsamir45@gmail.com>
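Conceptually, a language-model-only load keeps just the LM-backbone weights and drops the "language_model." prefix so the keys line up with the text-only module. The helper and key names below are hypothetical illustrations of that idea, not vLLM's actual internals:

```python
# Hypothetical sketch of the language-model-only idea: keep only keys
# under the "language_model." prefix, strip that prefix, and discard
# everything else (e.g. vision-tower weights). Key names are invented
# for illustration.

def lm_backbone_state_dict(state_dict: dict) -> dict:
    prefix = "language_model."
    return {
        key[len(prefix):]: value
        for key, value in state_dict.items()
        if key.startswith(prefix)
    }

ckpt = {
    "visual.blocks.0.attn.qkv.weight": "vision-weight",
    "language_model.model.layers.0.self_attn.q_proj.weight": "lm-weight",
}
# Only the LM weight survives, re-keyed without the prefix.
print(lm_backbone_state_dict(ckpt))
```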
There are no changes from this PR - do you still intend to keep it? If the purpose is to provide guidance for people who run into the issue when they cannot initialize the model as a text-only model, we already have
Summary
The Qwen3_5ForCausalLM and Qwen3_5MoeForCausalLM classes already exist in qwen3_5.py but are not registered in _TEXT_GENERATION_MODELS in the model registry. This means vLLM cannot load text-only Qwen3.5 checkpoints (e.g. Qwen/Qwen3.5-4B).

When a text-only checkpoint specifies architectures: ["Qwen3_5ForCausalLM"], the registry lookup fails and vLLM falls through to the VLM class Qwen3_5ForConditionalGeneration, which expects weights under language_model.model.layers.* instead of model.layers.*, causing a hard weight mismatch. This completely blocks using vLLM with TRL's GRPOTrainer (colocate mode) for RL training on text-only Qwen3.5 models.

Fix

Two lines added to _TEXT_GENERATION_MODELS in registry.py. The implementation classes are already fully functional; they just weren't wired into the registry.
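The two added entries and the lookup they enable can be sketched as follows, assuming vLLM's convention that _TEXT_GENERATION_MODELS maps an architecture string (as it appears in config.json's "architectures" field) to a (module_name, class_name) tuple; the module name and the standalone resolve helper are illustrative, not vLLM's actual code:

```python
# Sketch of the registry change, under the assumptions stated above.
_TEXT_GENERATION_MODELS = {
    # ... existing entries elided ...
    "Qwen3_5ForCausalLM": ("qwen3_5", "Qwen3_5ForCausalLM"),
    "Qwen3_5MoeForCausalLM": ("qwen3_5", "Qwen3_5MoeForCausalLM"),
}

def resolve(architectures: list) :
    """Return the first registered (module, class) pair, or None."""
    for arch in architectures:
        if arch in _TEXT_GENERATION_MODELS:
            return _TEXT_GENERATION_MODELS[arch]
    return None  # registry miss: vLLM would fall through to other maps

print(resolve(["Qwen3_5ForCausalLM"]))
```

With the entries present, a text-only checkpoint declaring "Qwen3_5ForCausalLM" resolves to the text-only class instead of falling through to the VLM path.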
Fixes #36275
Related: #36236
Test plan
- Load a text-only checkpoint (e.g. Qwen/Qwen3.5-4B) and verify it resolves to Qwen3_5ForCausalLM instead of Qwen3_5ForConditionalGeneration
- Verify weights load under model.layers.* without the language_model. prefix mismatch
- Verify the VLM path through Qwen3_5ForConditionalGeneration is unaffected

Made with Cursor