fix(studio): custom folder scan fails to find GGUF variants when pointing directly at a model directory by JYYYYYT · Pull Request #4860 · unslothai/unsloth

JYYYYYT · 2026-04-05T08:23:41Z

Problem

When adding a custom scan folder that points directly at a model directory
(e.g. /path/to/gemma-4-e2b-it-gguf/ containing config.json and
gemma-4-E2B-it-BF16.gguf), the model list shows individual .gguf files
as separate entries instead of recognizing the directory as a single model.

Clicking any of these entries shows "No GGUF variants found" because
list_local_gguf_variants receives a file path instead of a directory path
and is_dir() returns False.

Additionally, _scan_lmstudio_dir misidentifies the model directory as an
LM Studio publisher folder, creating duplicate broken entries.

Steps to reproduce

1. Download a GGUF model repo locally (e.g. gemma-4-e2b-it-gguf/ with
   config.json, gemma-4-E2B-it-BF16.gguf, mmproj-BF16.gguf)
2. In Studio, click Select Model → Custom Folders and add the path
   pointing directly at the model directory
3. Two entries appear (gemma-4-E2B-it-BF16 and mmproj-BF16) instead
   of one (gemma-4-e2b-it-gguf)
4. Clicking either entry shows "No GGUF variants found"
5. Scanning the parent directory works for the top-level entry but still
   produces duplicate broken entries from the LM Studio scanner

For example

/Users/shisheng/Documents/llm-workspace/models/gemma-4-E2B-it-gguf/
├── config.json
├── configuration.json
├── gemma-4-E2B-it-BF16.gguf (9.3 GB)
├── imatrix_unsloth.gguf_file
└── mmproj-BF16.gguf (987 MB)

When I add the directory above as a custom model folder, I see two entries:

And when I add the /Users/shisheng/Documents/llm-workspace/models/ directory as a custom model folder, I see these entries:

Related Issues

After searching the issue tracker, I found no existing reports for this bug. This PR directly addresses the issue with a minimal fix.

Fix

1. `_scan_models_dir` — detect self-as-model (routes/models.py)

Before scanning subdirectories, check whether the directory itself is a
model: it must have both a config file (config.json or
adapter_config.json) and weight files (.gguf, .safetensors, or
.bin). Both conditions are required to avoid false positives:

- A bare directory with loose .gguf files (no config) may be a mixed
  collection → should list files individually (existing behavior)
- A config.json alone (no weights) is not a model directory
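The rule in step 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in routes/models.py: the helper name `looks_like_model_dir` and the two constants are hypothetical and exist only for this example.

```python
from pathlib import Path

# Hypothetical names, chosen for this sketch only.
CONFIG_NAMES = ("config.json", "adapter_config.json")
WEIGHT_SUFFIXES = (".gguf", ".safetensors", ".bin")


def looks_like_model_dir(directory: Path) -> bool:
    """Return True only if the directory has BOTH a config file and weights."""
    if not directory.is_dir():
        return False
    has_config = any((directory / name).is_file() for name in CONFIG_NAMES)
    has_weights = any(
        child.is_file() and child.suffix.lower() in WEIGHT_SUFFIXES
        for child in directory.iterdir()
    )
    return has_config and has_weights
```

With this check, a folder of loose .gguf files and a folder that holds only config.json both return False, which matches the two exclusions listed above.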

2. `_scan_lmstudio_dir` — skip model directories (routes/models.py)

- Early-return when the scanned directory itself has config files (it's a
  model, not a publisher structure)
- Skip child directories that contain config/weight files (they are model
  directories, not publisher directories)

3. `list_local_gguf_variants` — file path fallback (model_config.py)

If a .gguf file path is passed instead of a directory, fall back to the
parent directory. This is a defensive safety net for any remaining code
paths that may pass file paths.
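A minimal sketch of the fallback described in step 3, assuming variants are simply the .gguf files found in one directory. The function names `directory_for_variants` and `list_gguf_names` are hypothetical; the real `list_local_gguf_variants` in model_config.py may return richer data.

```python
from pathlib import Path
from typing import Optional


def directory_for_variants(path_str: str) -> Optional[Path]:
    """Pick the directory to scan for .gguf variants (unguarded fallback)."""
    p = Path(path_str)
    if p.is_file() and p.suffix.lower() == ".gguf":
        p = p.parent  # a file path was passed: fall back to its directory
    return p if p.is_dir() else None


def list_gguf_names(path_str: str) -> list[str]:
    """Return the sorted .gguf file names for a directory or a .gguf file."""
    directory = directory_for_variants(path_str)
    if directory is None:
        return []  # previously, a file path always ended up here
    return sorted(f.name for f in directory.glob("*.gguf"))
```

Passing either the model directory or one of its .gguf files now yields the same list; a path that does not exist still yields an empty list.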

Test plan

- Add custom folder pointing directly at a model dir (with config.json + .gguf) → shows one entry, GGUF variants load correctly
- Add custom folder pointing at a parent dir containing model subdirs → each subdir appears once, no duplicates
- Directory with only loose .gguf files (no config.json) → files still listed individually (no regression)
- Standard HF cache and LM Studio directories still scan correctly

…ights

When a custom scan folder points directly at a model directory (e.g. gemma-4-e2b-it-gguf/ containing config.json and .gguf files), _scan_models_dir previously skipped the directory itself and listed individual .gguf files as standalone models. The gguf-variants endpoint then received file paths instead of directory paths, causing list_local_gguf_variants to return an empty list ("No GGUF variants found").

Three fixes:

1. _scan_models_dir: detect when the scanned directory itself is a model (has BOTH a config file AND weight files) and return it as a single entry. Both conditions are required to avoid false positives on bare .gguf collections or config-only directories.
2. _scan_lmstudio_dir: early-return when the directory has config files (not a publisher structure), and skip child directories that are model directories rather than treating them as publisher folders.
3. list_local_gguf_variants: fall back to the parent directory when a .gguf file path is passed instead of a directory.

gemini-code-assist

Code Review

This pull request enhances model directory detection by checking for both configuration files and weights, and it updates GGUF variant listing to support individual file paths. Feedback identifies a missing .gguf check in the LM Studio directory scanner that could lead to duplicate entries and suggests optimizing multiple directory scans into a single pass for better performance.

gemini-code-assist · 2026-04-05T08:29:54Z

+            # handles it.  Only treat it as a publisher directory otherwise.
+            _child_is_model = (
+                (child / "config.json").exists()
+                or (child / "adapter_config.json").exists()
+                or any(child.glob("*.safetensors"))
+                or any(child.glob("*.bin"))


The _child_is_model check in _scan_lmstudio_dir is missing a check for .gguf files. Since LM Studio primarily uses GGUF models, omitting this check means that model directories containing only GGUF files and a config will not be correctly identified and skipped, leading to the duplicate/broken entries this PR aims to fix.

            _child_is_model = (
                (child / "config.json").exists()
                or (child / "adapter_config.json").exists()
                or any(child.glob("*.safetensors"))
                or any(child.glob("*.bin"))
                or any(child.glob("*.gguf"))
            )

gemini-code-assist · 2026-04-05T08:29:54Z

+            any(models_dir.glob("*.gguf"))
+            or any(models_dir.glob("*.safetensors"))
+            or any(models_dir.glob("*.bin"))
+        )
+        _is_self_model = _has_config and _has_weights


The current implementation performs three separate directory scans using glob. This can be optimized into a single pass over the directory contents to improve performance, especially when scanning multiple custom folders.

Suggested change

-            any(models_dir.glob("*.gguf"))
-            or any(models_dir.glob("*.safetensors"))
-            or any(models_dir.glob("*.bin"))
-        )
-        _is_self_model = _has_config and _has_weights
+        _has_weights = any(
+            f.suffix.lower() in (".gguf", ".safetensors", ".bin")
+            for f in models_dir.iterdir()
+            if f.is_file()
+        )
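The two forms are close but not identical, which is worth knowing before swapping one for the other. The sketch below places them side by side; the function names are hypothetical. One difference that holds on every platform: `glob("*.bin")` also matches a subdirectory whose name ends in `.bin`, while the single-pass version filters on `is_file()`. The single-pass version also lowercases the suffix, so it accepts names such as `MODEL.GGUF` regardless of how the platform's glob matching treats case.

```python
from pathlib import Path

WEIGHT_SUFFIXES = (".gguf", ".safetensors", ".bin")


def has_weights_three_globs(models_dir: Path) -> bool:
    """Original form: up to three separate directory scans."""
    return (
        any(models_dir.glob("*.gguf"))
        or any(models_dir.glob("*.safetensors"))
        or any(models_dir.glob("*.bin"))
    )


def has_weights_single_pass(models_dir: Path) -> bool:
    """Suggested form: one pass, regular files only, case-insensitive suffix."""
    return any(
        f.suffix.lower() in WEIGHT_SUFFIXES
        for f in models_dir.iterdir()
        if f.is_file()
    )
```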

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d955ee22c7

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-04-05T08:29:54Z

+        if p.is_file() and p.suffix.lower() == ".gguf":
+            p = p.parent


Normalize local GGUF file paths before variant lookup

This change returns variants when repo_id is a .gguf file by scanning the parent directory, but downstream loading still treats that same repo_id as a file path. When a user selects a quant, ModelConfig.from_identifier(..., gguf_variant=...) calls _find_local_gguf_by_variant(path, variant), which only works for directories, so GGUF detection is skipped and the model load path becomes invalid for standalone file entries. In short, variant selection now appears to work for file rows but can fail at load time unless path normalization is made consistent across both code paths.

chatgpt-codex-connector · 2026-04-05T08:29:54Z

+        if (lm_dir / "config.json").exists() or (
+            lm_dir / "adapter_config.json"
+        ).exists():
+            return []


Avoid short-circuiting LM Studio scans on root config files

This early return assumes _scan_models_dir will handle model-directory detection, but for LM Studio roots list_local_models invokes only _scan_lmstudio_dir. If a discovered LM Studio directory (for example a downloadsFolder override) points directly to a model directory with config.json, this branch now drops it entirely and no model is listed. The LM scanner should either emit that model itself or the caller must also run _scan_models_dir for LM Studio roots.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a4e5673ec4

chatgpt-codex-connector · 2026-04-05T10:07:23Z

+        if p.is_file() and p.suffix.lower() == ".gguf":
+            p = p.parent


Keep GGUF file entries scoped to their own variants

When a standalone .gguf file path is passed, this fallback now scans the entire parent directory, which breaks the “loose GGUF files are separate entries” behavior from _scan_models_dir. In a folder containing multiple unrelated GGUF models, variant listing for one file can include sibling-model quants, and load-time resolution can pick a different file than the one the user selected (silent model switch). This is especially risky because the UI recommends a default variant from the merged set, not from the selected file.

JYYYYYT · Apr 5, 2026

This fallback is a defensive safety net that only triggers for standalone .gguf file entries (Phase 2 of _scan_models_dir), i.e. directories without config.json that contain loose GGUF files. The primary fix, the _is_self_model check, ensures that any proper model directory (with both config and weights) is returned as a single entry with a directory path, so this fallback is never reached in that case.

For the loose-file scenario, the previous behavior was to return an empty array ("No GGUF variants found"), which was completely non-functional. Scanning the parent directory is imperfect when unrelated models coexist, but it is strictly better than returning nothing. This is an edge case of an edge case (a directory with multiple unrelated .gguf files and no config.json) and not a regression from this PR.
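The disagreement in this thread can be made concrete with a small sketch. It assumes that "variants" are just the .gguf file names in one directory; both function names are hypothetical and neither is taken from model_config.py.

```python
from pathlib import Path


def variants_via_parent(gguf_file: Path) -> list[str]:
    """Unguarded fallback: every .gguf file that sits next to the selected one."""
    return sorted(f.name for f in gguf_file.parent.glob("*.gguf"))


def variants_scoped_to_file(gguf_file: Path) -> list[str]:
    """Scoped alternative: a loose file only ever offers itself."""
    return [gguf_file.name] if gguf_file.is_file() else []
```

In a folder that holds two unrelated loose files, the first function lists both files for either selection, which is the cross-wiring the review warns about. The second never mixes models, at the cost of showing a single variant per loose file.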

…model scan

- list_local_gguf_variants / _find_local_gguf_by_variant: only fall back to parent directory when parent has model metadata (config.json / adapter_config.json), preventing cross-wiring of unrelated sibling GGUFs in loose directories.
- _scan_lmstudio_dir root check: require BOTH config AND weight files (consistent with _scan_models_dir) so a stray config.json alone does not suppress scanning of nested LM Studio models.
- _scan_lmstudio_dir child check: only skip children with config metadata, not weight files -- publisher directories may legitimately contain direct .gguf files that the inner scan handles.
- _scan_models_dir: do not early-return when directory is a self-model; instead prepend the self-model entry and continue scanning children. Skip the loose .gguf scan when root is already a model to avoid duplicate entries.

- Extract _is_model_directory() helper in models.py to deduplicate the config+weights detection logic used by both _scan_models_dir and _scan_lmstudio_dir. Single source of truth for model directory heuristic.
- Extract _resolve_gguf_dir() helper in model_config.py to deduplicate the GGUF file-to-parent resolution logic used by both list_local_gguf_variants and _find_local_gguf_by_variant.
- Restore early return in _scan_models_dir when directory is a self-model to prevent surfacing bogus entries from nested helper directories like 1_Pooling/ or tokenizer/ that also contain config.json.
- Use _is_model_directory() for child skip in _scan_lmstudio_dir, requiring BOTH config AND weights consistently.
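A guarded resolver of the kind these commit messages describe might look like the sketch below. Only the behavior is taken from the text (fall back to the parent solely when it carries config.json or adapter_config.json); the signature, the `None` return for loose files, and the name `resolve_gguf_dir` are assumptions rather than the actual `_resolve_gguf_dir()`.

```python
from pathlib import Path
from typing import Optional

CONFIG_NAMES = ("config.json", "adapter_config.json")


def resolve_gguf_dir(path: Path) -> Optional[Path]:
    """Map a path to the directory whose .gguf files form one model.

    Directories pass through unchanged. A .gguf file resolves to its parent
    only when the parent carries config metadata; loose files resolve to None.
    """
    if path.is_dir():
        return path
    if path.is_file() and path.suffix.lower() == ".gguf":
        parent = path.parent
        if any((parent / name).is_file() for name in CONFIG_NAMES):
            return parent
    return None
```

Sharing one resolver between variant listing and load-time lookup keeps the two code paths from disagreeing about which directory a given identifier refers to.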

_scan_lmstudio_dir is the only scanner called for default LM Studio roots. When a model directory was detected, the previous fix returned [] or continued, causing those models to silently disappear. Now _scan_lmstudio_dir surfaces model directories directly as lmstudio entries instead of skipping them:

- Root-level model dir: return it as a single entry
- Child model dir: append it to found and continue (don't descend into it as a publisher directory)

This preserves the fix for duplicate broken entries while ensuring no models are dropped from LM Studio scans.
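The "surface instead of skip" behavior can be sketched like this. It assumes a two-level publisher/model layout under the scan root and returns plain strings; the real scanner presumably returns richer entries, and every name here (`is_model_dir`, `scan_lmstudio_like`) is hypothetical.

```python
from pathlib import Path

CONFIG_NAMES = ("config.json", "adapter_config.json")
WEIGHT_SUFFIXES = (".gguf", ".safetensors", ".bin")  # simplified weight test


def is_model_dir(directory: Path) -> bool:
    """Config AND weights, as in the shared heuristic described above."""
    has_config = any((directory / n).is_file() for n in CONFIG_NAMES)
    has_weights = any(
        f.is_file() and f.suffix.lower() in WEIGHT_SUFFIXES
        for f in directory.iterdir()
    )
    return has_config and has_weights


def scan_lmstudio_like(root: Path) -> list[str]:
    """List models under root, surfacing model dirs instead of dropping them."""
    if is_model_dir(root):
        return [root.name]  # root-level model dir: a single entry
    found = []
    for child in sorted(c for c in root.iterdir() if c.is_dir()):
        if is_model_dir(child):
            found.append(child.name)  # surface it, do not treat as publisher
            continue
        for model in sorted(m for m in child.iterdir() if m.is_dir()):
            found.append(f"{child.name}/{model.name}")
    return found
```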

The previous heuristic treated any .gguf/.safetensors/.bin file as a model weight, causing false positives:

- config.json + mmproj-BF16.gguf (vision projector, not a model weight)
- config.json + tokenizer.bin (tokenizer file, not a model weight)

Now _is_weight_file excludes:

- .gguf files containing "mmproj" in the name
- .bin files that are not pytorch_model*, model*, adapter_model*, or consolidated* (excludes tokenizer.bin, vocab.bin, etc.)
- .safetensors files are always treated as weights (safe assumption)
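A sketch of the tightened weight-file test. The prefixes and the "mmproj" rule come from the commit message above; matching with `str.startswith` on a lowercased name is an assumption about how the patterns are applied, and `is_weight_file` is a stand-in name, not the real `_is_weight_file`.

```python
from pathlib import Path

# Prefixes listed in the commit message; prefix matching is an assumption.
BIN_WEIGHT_PREFIXES = ("pytorch_model", "model", "adapter_model", "consolidated")


def is_weight_file(path: Path) -> bool:
    """Decide from the file name alone whether it looks like model weights."""
    name = path.name.lower()
    suffix = path.suffix.lower()
    if suffix == ".safetensors":
        return True  # always treated as weights
    if suffix == ".gguf":
        return "mmproj" not in name  # vision projector, not a model weight
    if suffix == ".bin":
        return name.startswith(BIN_WEIGHT_PREFIXES)  # excludes tokenizer.bin
    return False
```

Applied to the example directory from the PR description, only gemma-4-E2B-it-BF16.gguf counts as a weight: mmproj-BF16.gguf is excluded by name, and imatrix_unsloth.gguf_file has the suffix `.gguf_file`, so it never matches.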

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 44ac1f74ce

chatgpt-codex-connector · 2026-04-06T15:35:59Z

+    if _is_self_model:
+        try:
+            updated_at = models_dir.stat().st_mtime
+        except OSError:
+            updated_at = None
+        return [


Continue scanning after identifying root model directory

This early return drops all nested models whenever the scan root itself has config.json/weights. In mixed layouts (a root model plus additional model subfolders), the scanner now returns only the root entry, so valid child models disappear from custom-folder and LM Studio discovery results. Instead of returning immediately, add the root model to found and continue scanning children so both root and nested models are discoverable.
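The trade-off behind this suggestion can be illustrated with a sketch. It uses a deliberately simplified child test (config.json only) and a hypothetical `scan` function with a flag for the two behaviors; the directory name 1_Pooling is taken from the commit message that restored the early return.

```python
from pathlib import Path


def has_config(directory: Path) -> bool:
    return (directory / "config.json").is_file()


def scan(root: Path, stop_at_root_model: bool) -> list[str]:
    """List model-like directory names, optionally stopping at a root model."""
    found = []
    if has_config(root):
        found.append(root.name)
        if stop_at_root_model:
            return found  # early return: nested directories are never visited
    for child in sorted(c for c in root.iterdir() if c.is_dir()):
        if has_config(child):
            found.append(child.name)  # may be a real model or a helper dir
    return found
```

With the early return, a root model hides any nested models. Without it, helper subdirectories that carry their own config.json show up as extra entries under this simplified test, which is the concern the earlier commit message raises.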

…ting directly at a model directory (unslothai#4860)

Fix custom folder scanning when pointing directly at a model directory.

When a user adds a custom scan folder that points directly at a model directory (e.g. /path/to/gemma-4-e2b-it-gguf/ containing config.json and gemma-4-E2B-it-BF16.gguf), the model list previously showed individual .gguf files as separate entries instead of recognizing the directory as a single model. Clicking any entry showed "No GGUF variants found" because list_local_gguf_variants received a file path and immediately returned empty.

Changes:

- Add _is_model_directory() helper that detects directories with both config metadata and actual model weight files (excludes mmproj GGUFs and non-weight .bin files like tokenizer.bin)
- _scan_models_dir: detect self-model and return single directory entry
- _scan_lmstudio_dir: surface model directories directly instead of descending into them as publisher folders; handle both root and child model directories
- Add _resolve_gguf_dir() helper for GGUF path resolution that only falls back to parent directory when parent has model metadata
- list_local_gguf_variants / _find_local_gguf_by_variant: use resolver so .gguf file paths inside model directories work correctly

JYYYYYT requested a review from rolandtannous as a code owner April 5, 2026 08:23

[pre-commit.ci] auto fixes from pre-commit.com hooks

d955ee2

for more information, see https://pre-commit.ci

gemini-code-assist Bot reviewed Apr 5, 2026

chatgpt-codex-connector Bot reviewed Apr 5, 2026

JYYYYYT and others added 3 commits April 5, 2026 18:00

address review: optimize glob, add .gguf to child check, add variant …

19f06ff

…lookup fallback

Merge branch 'fix/local-folder-scan-gguf-variants' of https://github.…

f5ebf10

…com/JYYYYYT/unsloth into fix/local-folder-scan-gguf-variants # Conflicts: # studio/backend/routes/models.py

[pre-commit.ci] auto fixes from pre-commit.com hooks

a4e5673

chatgpt-codex-connector Bot reviewed Apr 5, 2026

danielhanchen mentioned this pull request Apr 6, 2026

fix(studio): custom folder scan fails to find GGUF variants when pointing directly at a model directory unslothai/unsloth-staging-1#5

Closed

JYYYYYT and others added 2 commits April 6, 2026 15:29

address review: add explanatory comment to empty except block

7d4391f

Merge branch 'main' into fix/local-folder-scan-gguf-variants

57dfd81

danielhanchen and others added 5 commits April 6, 2026 14:13

[pre-commit.ci] auto fixes from pre-commit.com hooks

44ac1f7

danielhanchen merged commit aa4c601 into unslothai:main Apr 6, 2026
1 check passed

chatgpt-codex-connector Bot reviewed Apr 6, 2026

danielhanchen merged 12 commits into unslothai:main from JYYYYYT:fix/local-folder-scan-gguf-variants
