fix: handle vendor-prefixed GGUF quant types (e.g., UD-Q4_K_XL) #39470
ianliuy wants to merge 2 commits into
GGUF model publishers like Unsloth use vendor-prefixed quant type names (e.g., UD-Q4_K_XL for Unsloth Dynamic quantization). These were not recognized by is_valid_gguf_quant_type(), causing the entire GGUF detection chain to fail. The colon separator was never stripped from the model name, and the full string (with ':') was passed to HuggingFace APIs, resulting in HFValidationError.

Fix: Since GGML quant type names never contain hyphens, any hyphen in the quant string reliably indicates a vendor prefix. The validation function now strips the prefix (splitting on the first hyphen) before checking the base type against the GGMLQuantizationType enum.

Fixes vllm-project#39198

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
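To make the failure mode concrete, here is a minimal sketch of how a `repo:quant` model string might be split, and why an unrecognized quant type leaves the colon in place. The helper name `split_repo_and_quant`, the validator callbacks, and the repository name are hypothetical and are not taken from the vLLM code base.

```python
from typing import Callable, Optional, Tuple


def split_repo_and_quant(
    model: str, is_valid_quant: Callable[[str], bool]
) -> Tuple[str, Optional[str]]:
    """Split 'repo_id:quant' only when the quant part is recognized.

    Hypothetical helper for illustration; not vLLM's actual function.
    """
    repo_id, sep, quant = model.rpartition(":")
    if sep and repo_id and is_valid_quant(quant):
        return repo_id, quant
    # Quant type not recognized: the string is returned unchanged,
    # colon included, which is what a hub API would then reject.
    return model, None


# A validator that only knows plain GGML names (illustrative subset).
strict = lambda q: q in {"Q4_K_M", "Q8_0", "F16"}
# A validator that also accepts a vendor prefix before the last hyphen.
lenient = lambda q: q.rsplit("-", 1)[-1] in {"Q4_K_M", "Q4_K_XL", "Q8_0", "F16"}

name = "example-org/example-model-GGUF:UD-Q4_K_XL"  # hypothetical repo
before_fix = split_repo_and_quant(name, strict)
after_fix = split_repo_and_quant(name, lenient)
```

With the strict validator the colon survives in the returned model string; with the lenient one the repository id and quant type are separated cleanly.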
Code Review
This pull request introduces support for vendor-prefixed GGUF quantization types, such as Unsloth Dynamic (UD-), by updating the validation logic and adding comprehensive unit tests. The is_valid_gguf_quant_type function was refactored to handle hyphenated prefixes, and error messages were updated to reflect this new support. A review comment suggests using rsplit instead of split when isolating the vendor prefix to ensure robustness if a vendor name itself contains a hyphen.
    # GGML quant type names never contain hyphens, so a hyphen indicates
    # a vendor prefix.
    if "-" in gguf_quant_type:
        prefix, remainder = gguf_quant_type.split("-", 1)
Using split("-", 1) only supports vendor prefixes that do not contain hyphens. If a vendor prefix itself contains a hyphen (e.g., MY-VENDOR-Q4_K), this logic will fail to validate the remainder. Since GGML quantization types are guaranteed not to contain hyphens, using rsplit("-", 1) is a more robust way to isolate the quantization type from any vendor prefix.
    - prefix, remainder = gguf_quant_type.split("-", 1)
    + prefix, remainder = gguf_quant_type.rsplit("-", 1)
Good catch! Applied in db8856f. Since GGML quant types never contain hyphens, rsplit correctly isolates the quant type from any multi-part vendor prefix (e.g., MY-VENDOR-Q4_K_XL -> remainder Q4_K_XL).
rsplit('-', 1) is more robust than split('-', 1) for multi-part vendor
prefixes (e.g., MY-VENDOR-Q4_K_XL). Since GGML quant type names never
contain hyphens, splitting from the right correctly isolates the quant
type regardless of how many hyphens the vendor prefix contains.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
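The difference between the two calls discussed in this thread can be checked with plain Python string methods. The snippet below is only an illustration; `isolate_quant_type` is a hypothetical name, not a function from the PR.

```python
def isolate_quant_type(quant: str, from_right: bool) -> str:
    """Return the part of `quant` that would be treated as the GGML type."""
    if "-" not in quant:
        return quant
    if from_right:
        # rsplit: everything before the LAST hyphen is the vendor prefix.
        _prefix, remainder = quant.rsplit("-", 1)
    else:
        # split: everything before the FIRST hyphen is the vendor prefix.
        _prefix, remainder = quant.split("-", 1)
    return remainder


for name in ("UD-Q4_K_XL", "MY-VENDOR-Q4_K_XL"):
    print(name, isolate_quant_type(name, False), isolate_quant_type(name, True))
```

For a single-part prefix both variants agree; for `MY-VENDOR-Q4_K_XL`, splitting from the left leaves `VENDOR-Q4_K_XL`, which still contains a hyphen and would fail validation, while splitting from the right yields `Q4_K_XL`.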
This pull request has merge conflicts that must be resolved before it can be
Closing: this was superseded by #39471, which merged a similar fix (is_nonstandard_gguf_quant_type with rsplit). Same root cause analysis, same approach. Thanks!
Summary
Fixes #39198
GGUF model publishers like Unsloth use vendor-prefixed quant type names (e.g., UD-Q4_K_XL for Unsloth Dynamic quantization). These were not recognized by is_valid_gguf_quant_type(), causing the entire GGUF detection chain to fail. The colon separator was never stripped from the model name, and the full string (with :) was passed to HuggingFace APIs, resulting in HFValidationError.
Root Cause
is_valid_gguf_quant_type() only checks exact GGMLQuantizationType enum members + standard size suffixes (_M, _S, _L, _XL, _XS, _XXS). Vendor-prefixed types like UD-Q4_K_XL are not in the enum, so the full detection chain returns False. The : is never stripped, and the full model string is passed as-is to HF's hf_hub_download, which rejects the : character.
Fix
GGML quant type names never contain hyphens - they only use underscores and alphanumerics. A hyphen is therefore a reliable signal of a vendor prefix.
is_valid_gguf_quant_type() now strips the vendor prefix before checking the base type against the GGMLQuantizationType enum.
Example:
UD-Q4_K_XL -> strip UD- -> Q4_K_XL -> strip suffix _XL -> Q4_K (valid enum member)
Changes
vllm/transformers_utils/gguf_utils.py: Extract _is_base_gguf_quant_type() helper; add vendor-prefix stripping; update error message
tests/transformers_utils/test_utils.py: Add tests for vendor-prefixed quant types across all GGUF utility functions
Testing
UD-Q4_K_XL, UD-F16, XX-Q4_K_M; invalid: UD-INVALID, UD-, -Q4_K)
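The validation chain described in this summary can be sketched roughly as follows. This is not the PR's code: `BASE_TYPES` is a small hand-written stand-in for the GGMLQuantizationType enum members, and the function names are hypothetical. It uses rsplit, as adopted later in the review thread.

```python
# Illustrative stand-in for GGMLQuantizationType member names (assumption).
BASE_TYPES = frozenset({"F32", "F16", "Q4_0", "Q4_K", "Q5_K", "Q6_K", "Q8_0"})
# Size suffixes listed in the Root Cause section.
SIZE_SUFFIXES = ("_XXS", "_XS", "_XL", "_M", "_S", "_L")


def is_base_quant_type(name: str) -> bool:
    """True if `name` is a base type, optionally followed by a size suffix."""
    if name in BASE_TYPES:
        return True
    return any(
        name.endswith(suffix) and name[: -len(suffix)] in BASE_TYPES
        for suffix in SIZE_SUFFIXES
    )


def is_valid_quant_type(quant: str) -> bool:
    """Accept plain GGML names and vendor-prefixed ones such as UD-Q4_K_XL."""
    if "-" in quant:
        # GGML names contain no hyphens, so the last hyphen ends the prefix.
        prefix, remainder = quant.rsplit("-", 1)
        if not prefix or not remainder:
            return False  # rejects "UD-" and "-Q4_K"
        return is_base_quant_type(remainder)
    return is_base_quant_type(quant)


valid = ["UD-Q4_K_XL", "UD-F16", "XX-Q4_K_M"]
invalid = ["UD-INVALID", "UD-", "-Q4_K"]
results = {name: is_valid_quant_type(name) for name in valid + invalid}
```

Rejecting an empty prefix or empty remainder is what keeps `UD-` and `-Q4_K` invalid even though a hyphen is present.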