fix(gguf): Auto-select compatible dtype for GGUF models on Blackwell #30365
kitaekatt wants to merge 1 commit
Code Review
This pull request effectively resolves a dtype conflict for GGUF models on Blackwell GPUs, particularly for models like Gemma3 with specific dtype restrictions. The changes are well-implemented. In vllm/model_executor/layers/quantization/gguf.py, bfloat16 is correctly excluded on SM 120+ devices. The new logic in vllm/config/vllm.py for automatic dtype conflict resolution is robust; it finds a compatible dtype by intersecting model and quantization-supported types, selects the most performant option, and warns the user. This is a solid fix that also handles future similar conflicts. I have not identified any high or critical severity issues.
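The resolution step described in the review (intersect the two supported lists, prefer the faster type, warn) can be sketched roughly as follows. This is not the code from the PR: the function name, the preference order, and the use of plain strings in place of torch.dtype values are all assumptions made for illustration.

```python
import logging

logger = logging.getLogger(__name__)

# Hypothetical preference order: 16-bit types first, float32 as last resort.
_PREFERENCE = ("bfloat16", "float16", "float32")


def _rank(dtype: str) -> int:
    """Lower rank means more preferred; unknown dtypes sort last."""
    return _PREFERENCE.index(dtype) if dtype in _PREFERENCE else len(_PREFERENCE)


def resolve_dtype_conflict(requested: str,
                           model_dtypes: list[str],
                           quant_dtypes: list[str]) -> str:
    """Return a dtype accepted by both the model and the quantization method.

    Dtypes are plain strings here; real code would compare torch.dtype values.
    """
    common = set(model_dtypes) & set(quant_dtypes)
    if not common:
        raise ValueError(
            f"No dtype satisfies both model {model_dtypes} "
            f"and quantization {quant_dtypes}")
    if requested in common:
        return requested  # no conflict, keep the caller's choice
    # sorted() keeps the result deterministic when ranks tie.
    chosen = min(sorted(common), key=_rank)
    logger.warning("dtype %s is not supported by both model and quantization; "
                   "using %s instead", requested, chosen)
    return chosen


# Gemma3 GGUF on Blackwell, with the lists given in this PR:
print(resolve_dtype_conflict("bfloat16",
                             ["bfloat16", "float32"],
                             ["float16", "float32"]))  # float32
```

Raising when the intersection is empty keeps the failure explicit for combinations that genuinely have no valid dtype, rather than silently picking something unsupported.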
Fixes Gemma3 GGUF models failing on Blackwell GPUs with --dtype auto.

Problem:
- Gemma3 blocks float16 (numerical instability)
- GGUF on Blackwell blocks bfloat16 (precision issues)
- Only float32 works, but dtype=auto picks bfloat16 → fails

Changes:
1. gguf.py: Block bfloat16 on SM 120+ (Blackwell) devices
2. vllm.py: Auto-select compatible dtype when model and quantization restrictions conflict, instead of failing with an error

This allows --dtype auto to work correctly with Gemma3 GGUF on Blackwell by automatically falling back to float32.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
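To make the failure mode in the commit message concrete, here is a small sketch of a capability-gated dtype list and the membership check that rejects bfloat16. The helper names and the baseline list for older GPUs are assumptions, and dtypes are plain strings rather than torch.dtype values; the real check in gguf.py goes through current_platform.has_device_capability(120).

```python
BLACKWELL_CAPABILITY = 120  # SM 120, the threshold named in this PR

# Per the commit message: Gemma3 blocks float16.
GEMMA3_DTYPES = ["bfloat16", "float32"]


def gguf_supported_dtypes(device_capability: int) -> list[str]:
    """Hypothetical stand-in for the GGUF activation dtype list."""
    dtypes = ["float16", "bfloat16", "float32"]  # assumed baseline list
    if device_capability >= BLACKWELL_CAPABILITY:
        dtypes.remove("bfloat16")  # blocked on SM 120+
    return dtypes


def auto_dtype_is_accepted(device_capability: int,
                           auto_choice: str = "bfloat16") -> bool:
    """True if the dtype picked by 'auto' passes both restrictions."""
    return (auto_choice in GEMMA3_DTYPES
            and auto_choice in gguf_supported_dtypes(device_capability))


print(gguf_supported_dtypes(120))   # ['float16', 'float32']
print(auto_dtype_is_accepted(120))  # False
print(auto_dtype_is_accepted(90))   # True
```

With both restrictions active, float32 is the only dtype present in both lists, which is why the fallback lands there.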
41f57b4 to 9b115ed
Summary
Fixes dtype conflict for Gemma3 GGUF models on Blackwell GPUs (SM 120+) where --dtype auto fails because:
- Gemma3 blocks float16, leaving [bfloat16, float32]
- GGUF on Blackwell blocks bfloat16, leaving [float16, float32]
- Only float32 works
Error before fix:
Changes
- gguf.py: Block bfloat16 on Blackwell (SM 120+) via current_platform.has_device_capability(120)
- vllm.py: Add _resolve_dtype_conflict() to find a compatible dtype when model restrictions and quantization restrictions conflict. Falls back to float32 when no other option exists.
Test Plan
- google/gemma-3-1b-it GGUF on RTX 5090 (Blackwell)
- --dtype auto
Related