fix(studio): custom folder scan fails to find GGUF variants when pointing directly at a model directory#4860
…ights
When a custom scan folder points directly at a model directory (e.g.
gemma-4-e2b-it-gguf/ containing config.json and .gguf files),
_scan_models_dir previously skipped the directory itself and listed
individual .gguf files as standalone models. The gguf-variants endpoint
then received file paths instead of directory paths, causing
list_local_gguf_variants to return an empty list ("No GGUF variants
found").
Three fixes:
1. _scan_models_dir: detect when the scanned directory itself is a model
(has BOTH a config file AND weight files) and return it as a single
entry. Both conditions are required to avoid false positives on bare
.gguf collections or config-only directories.
2. _scan_lmstudio_dir: early-return when the directory has config files
(not a publisher structure), and skip child directories that are
model directories rather than treating them as publisher folders.
3. list_local_gguf_variants: fall back to the parent directory when a
.gguf file path is passed instead of a directory.
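The first fix can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the PR: the name `looks_like_model_dir` and the two module-level constants are hypothetical, and only the "config AND weights" rule comes from the commit message above.

```python
from pathlib import Path

# Hypothetical constants for illustration; the PR names config.json and
# adapter_config.json as config files and .gguf/.safetensors/.bin as weights.
CONFIG_NAMES = ("config.json", "adapter_config.json")
WEIGHT_SUFFIXES = (".gguf", ".safetensors", ".bin")


def looks_like_model_dir(directory: Path) -> bool:
    """Return True only when the directory holds a config file AND a weight file."""
    if not directory.is_dir():
        return False
    has_config = any((directory / name).is_file() for name in CONFIG_NAMES)
    # Single pass over the directory instead of one glob per extension.
    has_weights = any(
        entry.is_file() and entry.suffix.lower() in WEIGHT_SUFFIXES
        for entry in directory.iterdir()
    )
    return has_config and has_weights
```

Requiring both conditions means a folder of loose .gguf files, or a folder with only a config.json, is not collapsed into a single model entry.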
for more information, see https://pre-commit.ci
Code Review
This pull request enhances model directory detection by checking for both configuration files and weights, and it updates GGUF variant listing to support individual file paths. Feedback identifies a missing .gguf check in the LM Studio directory scanner that could lead to duplicate entries and suggests optimizing multiple directory scans into a single pass for better performance.
# handles it. Only treat it as a publisher directory otherwise.
_child_is_model = (
    (child / "config.json").exists()
    or (child / "adapter_config.json").exists()
    or any(child.glob("*.safetensors"))
    or any(child.glob("*.bin"))
The _child_is_model check in _scan_lmstudio_dir is missing a check for .gguf files. Since LM Studio primarily uses GGUF models, omitting this check means that model directories containing only GGUF files and a config will not be correctly identified and skipped, leading to the duplicate/broken entries this PR aims to fix.
_child_is_model = (
    (child / "config.json").exists()
    or (child / "adapter_config.json").exists()
    or any(child.glob("*.safetensors"))
    or any(child.glob("*.bin"))
    or any(child.glob("*.gguf"))
)

    any(models_dir.glob("*.gguf"))
    or any(models_dir.glob("*.safetensors"))
    or any(models_dir.glob("*.bin"))
)
_is_self_model = _has_config and _has_weights
The current implementation performs three separate directory scans using glob. This can be optimized into a single pass over the directory contents to improve performance, especially when scanning multiple custom folders.
    any(models_dir.glob("*.gguf"))
    or any(models_dir.glob("*.safetensors"))
    or any(models_dir.glob("*.bin"))
)
_is_self_model = _has_config and _has_weights

_has_weights = any(
    f.suffix.lower() in (".gguf", ".safetensors", ".bin")
    for f in models_dir.iterdir()
    if f.is_file()
)
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d955ee22c7
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
if p.is_file() and p.suffix.lower() == ".gguf":
    p = p.parent
Normalize local GGUF file paths before variant lookup
This change returns variants when repo_id is a .gguf file by scanning the parent directory, but downstream loading still treats that same repo_id as a file path. When a user selects a quant, ModelConfig.from_identifier(..., gguf_variant=...) calls _find_local_gguf_by_variant(path, variant), which only works for directories, so GGUF detection is skipped and the model load path becomes invalid for standalone file entries. In short, variant selection now appears to work for file rows but can fail at load time unless path normalization is made consistent across both code paths.
if (lm_dir / "config.json").exists() or (
    lm_dir / "adapter_config.json"
).exists():
    return []
Avoid short-circuiting LM Studio scans on root config files
This early return assumes _scan_models_dir will handle model-directory detection, but for LM Studio roots list_local_models invokes only _scan_lmstudio_dir. If a discovered LM Studio directory (for example a downloadsFolder override) points directly to a model directory with config.json, this branch now drops it entirely and no model is listed. The LM scanner should either emit that model itself or the caller must also run _scan_models_dir for LM Studio roots.
…com/JYYYYYT/unsloth into fix/local-folder-scan-gguf-variants # Conflicts: # studio/backend/routes/models.py
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a4e5673ec4
if p.is_file() and p.suffix.lower() == ".gguf":
    p = p.parent
Keep GGUF file entries scoped to their own variants
When a standalone .gguf file path is passed, this fallback now scans the entire parent directory, which breaks the “loose GGUF files are separate entries” behavior from _scan_models_dir. In a folder containing multiple unrelated GGUF models, variant listing for one file can include sibling-model quants, and load-time resolution can pick a different file than the one the user selected (silent model switch). This is especially risky because the UI recommends a default variant from the merged set, not from the selected file.
This fallback is a defensive safety net that only triggers for standalone .gguf file entries (Phase 2 of _scan_models_dir), i.e. directories without config.json
that contain loose GGUF files. The primary fix — the _is_self_model check — ensures that any proper model directory (with both config and weights) is returned as a
single entry with a directory path, so this fallback is never reached in that case.
For the loose-file scenario, the previous behavior was returning an empty array ("No GGUF variants found"), which was completely non-functional. Scanning the
parent directory is imperfect when unrelated models coexist, but it's strictly better than returning nothing. This is an edge case of an edge case — a directory with
multiple unrelated .gguf files and no config.json — and not a regression from this PR.
…model scan
- list_local_gguf_variants / _find_local_gguf_by_variant: only fall back to parent directory when parent has model metadata (config.json / adapter_config.json), preventing cross-wiring of unrelated sibling GGUFs in loose directories.
- _scan_lmstudio_dir root check: require BOTH config AND weight files (consistent with _scan_models_dir) so a stray config.json alone does not suppress scanning of nested LM Studio models.
- _scan_lmstudio_dir child check: only skip children with config metadata, not weight files -- publisher directories may legitimately contain direct .gguf files that the inner scan handles.
- _scan_models_dir: do not early-return when directory is a self-model; instead prepend the self-model entry and continue scanning children. Skip the loose .gguf scan when root is already a model to avoid duplicate entries.
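The guarded parent fallback described in the first bullet can be sketched like this. The function name `resolve_gguf_dir_sketch` is hypothetical and the return convention (`None` when nothing should be scanned) is an assumption; only the rule "fall back to the parent when it has config.json / adapter_config.json" comes from the commit message.

```python
from pathlib import Path
from typing import Optional


def resolve_gguf_dir_sketch(path: Path) -> Optional[Path]:
    """Return the directory whose GGUF variants should be listed, or None."""
    if path.is_dir():
        return path
    if path.is_file() and path.suffix.lower() == ".gguf":
        parent = path.parent
        # Only trust the parent when it carries model metadata; a loose
        # folder of unrelated .gguf files must not be merged into one model.
        if (parent / "config.json").is_file() or (
            parent / "adapter_config.json"
        ).is_file():
            return parent
    return None
```

Returning `None` for a loose .gguf file keeps that entry scoped to itself, which addresses the "silent model switch" concern raised in the review.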
- Extract _is_model_directory() helper in models.py to deduplicate the config+weights detection logic used by both _scan_models_dir and _scan_lmstudio_dir. Single source of truth for model directory heuristic.
- Extract _resolve_gguf_dir() helper in model_config.py to deduplicate the GGUF file-to-parent resolution logic used by both list_local_gguf_variants and _find_local_gguf_by_variant.
- Restore early return in _scan_models_dir when directory is a self-model to prevent surfacing bogus entries from nested helper directories like 1_Pooling/ or tokenizer/ that also contain config.json.
- Use _is_model_directory() for child skip in _scan_lmstudio_dir, requiring BOTH config AND weights consistently.
_scan_lmstudio_dir is the only scanner called for default LM Studio roots. When a model directory was detected, the previous fix returned [] or continued, causing those models to silently disappear. Now _scan_lmstudio_dir surfaces model directories directly as lmstudio entries instead of skipping them:
- Root-level model dir: return it as a single entry
- Child model dir: append it to found and continue (don't descend into it as a publisher directory)
This preserves the fix for duplicate broken entries while ensuring no models are dropped from LM Studio scans.
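A rough sketch of the control flow described above, assuming an LM Studio root laid out as publisher/model/ directories. Both function names are hypothetical, the model-directory check is a simplified stand-in for the PR's helper, and real entries would carry more than a path.

```python
from pathlib import Path
from typing import List


def is_model_dir_sketch(directory: Path) -> bool:
    """Simplified stand-in: a config file plus at least one weight-like file."""
    has_config = (directory / "config.json").is_file() or (
        directory / "adapter_config.json"
    ).is_file()
    has_weights = any(
        f.is_file() and f.suffix.lower() in (".gguf", ".safetensors", ".bin")
        for f in directory.iterdir()
    )
    return has_config and has_weights


def scan_lmstudio_sketch(root: Path) -> List[Path]:
    """Return model directories found under an LM Studio style root."""
    if not root.is_dir():
        return []
    # Root-level model dir: surface it as a single entry.
    if is_model_dir_sketch(root):
        return [root]
    found: List[Path] = []
    for child in sorted(root.iterdir()):
        if not child.is_dir():
            continue
        if is_model_dir_sketch(child):
            # Child model dir: record it, do not treat it as a publisher.
            found.append(child)
            continue
        # Otherwise assume a publisher folder whose subfolders are models.
        found.extend(d for d in sorted(child.iterdir()) if d.is_dir())
    return found
```

The point of the sketch is that neither branch drops a detected model: the root case returns it and the child case appends it before moving on.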
The previous heuristic treated any .gguf/.safetensors/.bin file as a model weight, causing false positives:
- config.json + mmproj-BF16.gguf (vision projector, not a model weight)
- config.json + tokenizer.bin (tokenizer file, not a model weight)
Now _is_weight_file excludes:
- .gguf files containing "mmproj" in the name
- .bin files that are not pytorch_model*, model*, adapter_model*, or consolidated* (excludes tokenizer.bin, vocab.bin, etc.)
.safetensors files are always treated as weights (safe assumption).
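The exclusion rules above could look roughly like the following. This is an illustrative sketch, not the PR's `_is_weight_file`: the name `is_weight_file_sketch` is hypothetical, and matching the prefixes case-insensitively against the file name is an assumption.

```python
import fnmatch
from pathlib import Path

# Prefix patterns taken from the commit message; applied to the lowercased
# file name (an assumption about how the real helper matches them).
BIN_WEIGHT_PATTERNS = (
    "pytorch_model*",
    "model*",
    "adapter_model*",
    "consolidated*",
)


def is_weight_file_sketch(path: Path) -> bool:
    """Decide from the file name alone whether this looks like a model weight."""
    name = path.name.lower()
    suffix = path.suffix.lower()
    if suffix == ".safetensors":
        return True  # always treated as weights
    if suffix == ".gguf":
        return "mmproj" not in name  # vision projectors are not model weights
    if suffix == ".bin":
        return any(fnmatch.fnmatchcase(name, p) for p in BIN_WEIGHT_PATTERNS)
    return False
```

With these rules, a directory containing only config.json and mmproj-BF16.gguf, or only config.json and tokenizer.bin, no longer counts as having weights.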
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 44ac1f74ce
if _is_self_model:
    try:
        updated_at = models_dir.stat().st_mtime
    except OSError:
        updated_at = None
    return [
Continue scanning after identifying root model directory
This early return drops all nested models whenever the scan root itself has config.json/weights. In mixed layouts (a root model plus additional model subfolders), the scanner now returns only the root entry, so valid child models disappear from custom-folder and LM Studio discovery results. Instead of returning immediately, add the root model to found and continue scanning children so both root and nested models are discoverable.
…ting directly at a model directory (unslothai#4860)
Fix custom folder scanning when pointing directly at a model directory.
When a user adds a custom scan folder that points directly at a model directory (e.g. /path/to/gemma-4-e2b-it-gguf/ containing config.json and gemma-4-E2B-it-BF16.gguf), the model list previously showed individual .gguf files as separate entries instead of recognizing the directory as a single model. Clicking any entry showed "No GGUF variants found" because list_local_gguf_variants received a file path and immediately returned empty.
Changes:
- Add _is_model_directory() helper that detects directories with both config metadata and actual model weight files (excludes mmproj GGUFs and non-weight .bin files like tokenizer.bin)
- _scan_models_dir: detect self-model and return single directory entry
- _scan_lmstudio_dir: surface model directories directly instead of descending into them as publisher folders; handle both root and child model directories
- Add _resolve_gguf_dir() helper for GGUF path resolution that only falls back to parent directory when parent has model metadata
- list_local_gguf_variants / _find_local_gguf_by_variant: use resolver so .gguf file paths inside model directories work correctly
Problem
When adding a custom scan folder that points directly at a model directory (e.g. /path/to/gemma-4-e2b-it-gguf/ containing config.json and gemma-4-E2B-it-BF16.gguf), the model list shows individual .gguf files as separate entries instead of recognizing the directory as a single model.
Clicking any of these entries shows "No GGUF variants found" because list_local_gguf_variants receives a file path instead of a directory path and is_dir() returns False.
Additionally, _scan_lmstudio_dir misidentifies the model directory as an LM Studio publisher folder, creating duplicate broken entries.
Steps to reproduce
- … (gemma-4-e2b-it-gguf/ with config.json, gemma-4-E2B-it-BF16.gguf, mmproj-BF16.gguf)
- … pointing directly at the model directory
- … (gemma-4-E2B-it-BF16 and mmproj-BF16) instead of one (gemma-4-e2b-it-gguf)
- … produces duplicate broken entries from the LM Studio scanner
For example
/Users/shisheng/Documents/llm-workspace/models/gemma-4-E2B-it-gguf/
├── config.json
├── configuration.json
├── gemma-4-E2B-it-BF16.gguf (9.3 GB)
├── imatrix_unsloth.gguf_file
└── mmproj-BF16.gguf (987 MB)
When I add the directory above as a custom model folder, I see two entries:
And when I add the /Users/shisheng/Documents/llm-workspace/models/ directory as a custom model folder, I see these entries:
Related Issues
After searching the issue tracker, I found no existing reports for this bug. This PR directly addresses the issue with a minimal fix.
Fix
1. _scan_models_dir: detect self-as-model (routes/models.py)
Before scanning subdirectories, check whether the directory itself is a model: it must have both a config file (config.json or adapter_config.json) and weight files (.gguf, .safetensors, or .bin). Both conditions are required to avoid false positives:
- .gguf files (no config) may be a mixed collection → should list files individually (existing behavior)
- config.json alone (no weights) is not a model directory
2. _scan_lmstudio_dir: skip model directories (routes/models.py)
- … model, not a publisher structure)
- … directories, not publisher directories)
3. list_local_gguf_variants: file path fallback (model_config.py)
If a .gguf file path is passed instead of a directory, fall back to the parent directory. This is a defensive safety net for any remaining code paths that may pass file paths.
Test plan