[Bugfix] Accept file-type-only quant types (IQ2_M, IQ3_XS, ...) in remote GGUF model IDs by Sunt-ing · Pull Request #19 · vllm-project/vllm-gguf-plugin

Sunt-ing · 2026-06-03T06:13:11Z

Purpose

Port the fix approved at vllm-project/vllm#44218 to the plugin, as @Isotr0py requested there (GGUF support is migrating here per #39612). Fixes vllm-project/vllm#42734.

vllm serve <repo>:UD-IQ2_M (and any repo_id:quant_type reference whose quant is IQ2_M, IQ3_M, IQ3_XS, or MXFP4_MOE) fails with "Repo id must use alphanumeric chars..." because the whole string is treated as a plain repo id instead of a remote GGUF reference.

Root cause. is_valid_gguf_quant_type() only checks GGMLQuantizationType (the tensor quantization enum), but the quant_type in a repo_id:quant_type reference is a GGUF file type (LlamaFileType) used to select the .gguf file. These are two distinct gguf enums, and IQ2_M/IQ3_M/IQ3_XS/MXFP4_MOE exist only in LlamaFileType — they have no GGMLQuantizationType member, and no valid base after suffix stripping (IQ2_M → IQ2, which doesn't exist). So is_remote_gguf() returns False and the reference is rejected. UD-IQ1_S/UD-IQ1_M work only because IQ1_S/IQ1_M happen to exist in both enums.
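A minimal sketch of the root cause, not the plugin's actual code. The two enums below are small stand-ins for gguf's GGMLQuantizationType and LlamaFileType (member names come from this description; numeric values other than the two 29s are illustrative), and tensor_only_check is a hypothetical reconstruction of the pre-fix check, assuming the suffix is stripped at the last underscore.

```python
from enum import IntEnum


# Stand-in subset of gguf's tensor quantization enum (illustrative values).
class GGMLQuantizationType(IntEnum):
    Q4_K = 12
    IQ1_S = 19
    IQ1_M = 29


# Stand-in subset of gguf's file type enum (illustrative values).
class LlamaFileType(IntEnum):
    MOSTLY_Q4_K_M = 15
    MOSTLY_IQ1_S = 24
    MOSTLY_IQ2_M = 29
    MOSTLY_IQ1_M = 31


def tensor_only_check(quant_type: str) -> bool:
    """Hypothetical pre-fix check: consults the tensor enum only."""
    names = GGMLQuantizationType.__members__
    if quant_type in names:
        return True
    # Extended names: drop the last suffix, e.g. Q4_K_M -> Q4_K.
    base, sep, _suffix = quant_type.rpartition("_")
    return bool(sep) and base in names


print(tensor_only_check("IQ1_M"))   # True: exists in both enums
print(tensor_only_check("Q4_K_M"))  # True: Q4_K is a tensor type
print(tensor_only_check("IQ2_M"))   # False: no IQ2_M and no IQ2 tensor type
```

IQ2_M is rejected even though MOSTLY_IQ2_M is a valid file type, which is the gap this PR closes.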

This matches the upstream conclusion in ggml-org/llama.cpp#23085: the gguf maintainer confirms this is a downstream issue and that general.file_type must be parsed with LlamaFileType. It also cannot be addressed by adding IQ2_M to GGMLQuantizationType upstream, because GGMLQuantizationType.IQ1_M and LlamaFileType.MOSTLY_IQ2_M both equal 29 in the shared int space.
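The collision can be illustrated with two one-member stand-in enums. This is a sketch only: the enums mirror the names and the value 29 quoted above, and file_type_field is a hypothetical integer as it might be read from general.file_type.

```python
from enum import IntEnum


class GGMLQuantizationType(IntEnum):
    IQ1_M = 29  # value quoted in the description above


class LlamaFileType(IntEnum):
    MOSTLY_IQ2_M = 29  # value quoted in the description above


file_type_field = 29  # hypothetical value of general.file_type

# The same integer decodes to different quant names depending on the enum.
as_tensor_type = GGMLQuantizationType(file_type_field)
as_file_type = LlamaFileType(file_type_field)
print(as_tensor_type.name)  # IQ1_M
print(as_file_type.name)    # MOSTLY_IQ2_M
```

Because both members are plain integers, adding an IQ2_M member to the tensor enum with value 29 would only alias IQ1_M, so the name has to be resolved against the right enum instead.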

Fix. Accept either enum in is_valid_gguf_quant_type(): a LlamaFileType file type (members are prefixed MOSTLY_) or a GGMLQuantizationType tensor type. The existing suffix handling for extended names (e.g. Q4_K_M → Q4_K) is preserved. Minimal change, no special-case table, so newly added file types are picked up automatically.
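A rough sketch of the shape of the fix, under stated assumptions: the enums are stand-in subsets with illustrative values rather than imports from gguf, the function body is a hypothetical reconstruction rather than the merged diff, and vendor-prefix handling (UD-) is not modeled here.

```python
from enum import IntEnum


# Stand-in subset of gguf's tensor quantization enum (illustrative values).
class GGMLQuantizationType(IntEnum):
    Q4_K = 12
    IQ1_M = 29


# Stand-in subset of gguf's file type enum (illustrative values).
class LlamaFileType(IntEnum):
    MOSTLY_Q4_K_M = 15
    MOSTLY_IQ3_XS = 22
    MOSTLY_IQ3_M = 27
    MOSTLY_IQ2_M = 29
    MOSTLY_MXFP4_MOE = 38


def is_valid_gguf_quant_type(quant_type: str) -> bool:
    """Sketch: accept a file type or a tensor type."""
    # File types: members are prefixed MOSTLY_.
    if f"MOSTLY_{quant_type}" in LlamaFileType.__members__:
        return True
    tensor_names = GGMLQuantizationType.__members__
    if quant_type in tensor_names:
        return True
    # Preserved handling of extended names, e.g. Q4_K_M -> Q4_K.
    base, sep, _suffix = quant_type.rpartition("_")
    return bool(sep) and base in tensor_names


for name in ("IQ2_M", "IQ3_M", "IQ3_XS", "MXFP4_MOE", "Q4_K_M"):
    assert is_valid_gguf_quant_type(name), name
for name in ("IQ9_M", "NOTATYPE"):
    assert not is_valid_gguf_quant_type(name), name
```

Looking names up in the enum's members, rather than in a hand-written table, is what lets newly added file types be accepted without further changes.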

Test Plan

Unit: extend tests/test_gguf_utils.py::TestIsRemoteGGUF to cover file-type-only quants (IQ2_M/IQ3_M/IQ3_XS/MXFP4_MOE), with and without a vendor prefix (UD-), plus negative cases (IQ9_M, NOTATYPE). Each new assertion was confirmed to fail on the unpatched code and pass after the fix.

Test Result

$ pytest tests/test_gguf_utils.py -q
25 passed

Before the fix, the new cases fail (e.g. is_remote_gguf("unsloth/Qwen3.6-35B-A3B-GGUF:UD-IQ2_M") returns False, reproducing the reported "Repo id must use alphanumeric chars..." error); after the fix, all 25 pass. ruff check, ruff format, and typos are clean on the changed files.

AI assistance was used to investigate, reproduce, and draft this change; the author reviewed the diff and validation output.

… in remote GGUF model IDs

The quant_type in a repo_id:quant_type reference is a GGUF file type (LlamaFileType), not a tensor type (GGMLQuantizationType). File types such as IQ2_M, IQ3_M, IQ3_XS and MXFP4_MOE have no GGMLQuantizationType member, so is_valid_gguf_quant_type() rejected them and the whole reference was treated as a plain repo id, failing with "Repo id must use alphanumeric chars...".

Accept either enum (LlamaFileType members are prefixed MOSTLY_) so these file-type-only quants are recognized; the existing extended-suffix handling (e.g. Q4_K_M -> Q4_K) is preserved.

Ports the fix approved at vllm-project/vllm#44218 to the plugin, as requested by the maintainer since GGUF support is migrating here (vllm-project/vllm#39612).

Fixes vllm-project/vllm#42734

Signed-off-by: Ting Sun <suntcrick@gmail.com>

Isotr0py

Thanks!

Sunt-ing mentioned this pull request Jun 3, 2026

[Bugfix][GGUF] Accept file-type-only quant types (IQ2_M, IQ3_XS, ...) in remote GGUF model IDs vllm-project/vllm#44218

Open

4 tasks

Isotr0py approved these changes Jun 3, 2026


Isotr0py merged commit 69dae43 into vllm-project:main Jun 3, 2026
2 checks passed


Isotr0py merged 1 commit into vllm-project:main from Sunt-ing:fix/accept-file-type-only-quants
