Combine studio setup fixes: frontend caching, venv isolation, Windows CPU support#4413
Conversation
setup.ps1 previously hard-exited if nvidia-smi was not found, blocking setup entirely on CPU-only or non-NVIDIA machines. The backend already supports CPU and MLX (Apple Silicon) in chat-only GGUF mode, and the Linux/Mac setup.sh handles missing GPUs gracefully. Changes: - Convert the GPU check from a hard exit to a warning - Guard CUDA toolkit installation behind $HasNvidiaSmi - Install CPU-only PyTorch when no GPU is detected - Build llama.cpp without CUDA flags when no GPU is present - Update doc comment to reflect CPU support
Skip the frontend npm install + build if frontend/dist already exists. Previously setup.ps1 nuked node_modules and package-lock.json on every run, and both scripts always rebuilt even when dist/ was already present. On a git clone editable install, the first setup run still builds the frontend as before. Subsequent runs skip it, saving several minutes. To force a rebuild, delete frontend/dist and re-run setup.
The torch CUDA wheel is ~2.8 GB and the CPU wheel is ~300 MB. With | Out-Null suppressing all output, the install appeared completely frozen with no feedback. Remove | Out-Null for the torch install lines so pip's download progress bar is visible. Add a size hint so users know the download is expected to take a while. Also moves the Triton success message inside the GPU branch so it only prints when Triton was actually installed.
The CUDA_PATH re-sanitization block (lines 1020-1033) references $CudaToolkitRoot which is only set when $HasNvidiaSmi is true and the CUDA Toolkit section runs. On CPU-only machines, $CudaToolkitRoot is null, causing Split-Path to throw: Split-Path : Cannot bind argument to parameter 'Path' because it is null. Wrap the entire block in `if ($HasNvidiaSmi -and $CudaToolkitRoot)`.
Instead of only checking if dist/ exists, compare source file timestamps against the dist/ directory. If any file in frontend/src/ is newer than dist/, trigger a rebuild. This handles the case where a developer pulls new frontend changes and re-runs setup -- stale assets get rebuilt automatically.
Two issues fixed: 1. After winget installs cmake, Refresh-Environment may not pick up the new PATH entry (MSI PATH changes sometimes need a new shell). Added a fallback that probes cmake's default install locations (Program Files, LocalAppData) and adds the directory to PATH explicitly if found. 2. If cmake is still unavailable when the llama.cpp build starts (e.g. winget failed silently or PATH was not updated), the build now skips gracefully with a [SKIP] warning instead of crashing with "cmake : The term 'cmake' is not recognized".
Address review feedback: - Check entire frontend/ directory for changes, not just src/. The build also depends on package.json, vite.config.ts, tailwind.config.ts, public/, and other config files. A change to any of these now triggers a rebuild. - Move oxc-validator npm install outside the frontend build gate in setup.sh so it always runs on setup, matching setup.ps1 which already had it outside the gate.
…tion Two fixes for issue #4405 (Windows setup fails at cmake configure): 1. cmake configure: capture output and display it on failure instead of piping to Out-Null. When the error mentions "No CUDA toolset found", print a hint about the CUDA VS integration files. 2. CUDA VS integration copy: when the direct Copy-Item fails (needs admin access to write to Program Files), retry with Start-Process -Verb RunAs to prompt for elevation. This is the root cause of the "No CUDA toolset found" cmake failure -- the .targets files that let MSBuild compile .cu files are missing from the VS BuildCustomizations directory.
… error check 1. Persist cmake PATH to user registry so Refresh-Environment cannot drop it later in the same setup run. Previously the process-only PATH addition at phase 1 could vanish when Refresh-Environment rebuilt PATH from registry during phase 2/3 installs. 2. Clean stale CMake cache before configure. If a previous run built with CUDA and the user reruns without a GPU (or vice versa), the cached GGML_CUDA value would persist. Now the build dir is removed before configure. 3. Explicitly set -DGGML_CUDA=OFF for CPU-only builds instead of just omitting CUDA flags. This prevents cmake from auto-detecting a partial CUDA installation. 4. Fix CUDA cmake flag indentation -- was misaligned from the original PR, now consistently indented inside the if/else block. 5. Fail hard if pip install torch returns a non-zero exit code instead of silently continuing with a broken environment.
Drop GGML_CUDA_FA_ALL_QUANTS, GGML_CUDA_F16, GGML_CUDA_GRAPHS, GGML_CUDA_FORCE_CUBLAS, and GGML_CUDA_PEER_MAX_BATCH_SIZE flags. The Linux build in setup.sh only sets GGML_CUDA=ON and lets llama.cpp use its defaults for everything else. Keep Windows consistent.
…ary rebuild 1. GPU detection: fallback to default nvidia-smi install locations (Program Files\NVIDIA Corporation\NVSMI, System32) when nvidia-smi is not on PATH. Prevents silent CPU-only provisioning on machines that have a GPU but a broken PATH. 2. Triton: check $LASTEXITCODE after pip install and print [WARN] on failure instead of unconditional [OK]. 3. Stale llama-server: check CMakeCache.txt for GGML_CUDA setting and rebuild if the existing binary does not match the current GPU mode (e.g. CUDA binary on a now-CPU-only rerun, or vice versa).
Addresses reviewer feedback on the frontend caching logic: 1. setup.sh: Fix broken find command that caused exit under pipefail. The piped `find | xargs find -newer` had paths after the expression which GNU find rejects. Replaced with a simpler `find -maxdepth 1 -type f -newer dist/` that checks ALL top-level files (catches index.html, bun.lock, etc. that the extension allowlist missed). 2. setup.sh: Guard oxc-validator npm install behind `command -v npm` check. When the frontend build is skipped (dist/ is cached), Node bootstrap is also skipped, so npm may not be available. 3. setup.ps1: Replace Get-ChildItem -Include with explicit path probing for src/ and public/. PowerShell's -Include without a trailing wildcard silently returns nothing, so src/public changes were never detected. Also check ALL top-level files instead of just .json/.ts/.js/.mjs extensions.
- All platforms (including Colab) now create ~/.unsloth/studio/.venv with --without-pip fallback for broken ensurepip environments - Add --python sys.executable to uv pip install in install_python_stack.py so uv targets the correct venv instead of system Python - Centralize .venv_t5 bootstrap in transformers_version.py with proper validation (checks required packages exist, not just non-empty dir) - Replace ~150 lines of duplicated install code across 3 worker files with calls to the shared _ensure_venv_t5_exists() helper - Use uv-if-present with pip fallback; do not install uv at runtime - Add site.addsitedir() shim in colab.py so notebook cells can import studio packages from the venv without system-Python double-install - Update .venv_t5 packages: huggingface_hub 1.3.0->1.7.1, add hf_xet - Bump transformers pin 4.57.1->4.57.6 in requirements + constraints - Add Fast-Install helper to setup.ps1 with uv+pip fallback - Keep Colab-specific completion banner in setup.sh
for more information, see https://pre-commit.ci
1. Store nvidia-smi as an absolute path ($NvidiaSmiExe) on first detection. All later calls (Get-CudaComputeCapability, Get-PytorchCudaTag, CUDA toolkit detection) use this absolute path instead of relying on PATH. This survives Refresh-Environment which rebuilds PATH from the registry and drops process-only additions. 2. Make cmake fatal for CPU-only installs. CPU-only machines depend entirely on llama-server for GGUF chat mode, so reporting "Setup Complete!" without it is misleading. GPU machines can still skip the llama-server build since they have other inference paths.
- setup.sh: Replace broken `find | xargs find -newer` pipeline with single `find ... -newer` call. The old pipeline produced "paths must precede expression" errors (silently suppressed by 2>/dev/null), causing top-level config changes to never trigger a rebuild. - setup.sh: Add `command -v npm` guard to oxc-validator block so it does not fail when Node was not installed (build-skip path). - setup.ps1: Replace `Get-ChildItem -Include` (unreliable without -Recurse on PS 5.1) with explicit directory paths for src/ and public/ scanning. - Both: Add *.html to tracked file patterns so index.html (Vite entry point) changes trigger a rebuild. - Both: Use -print -quit instead of piping to head -1 for efficiency.
…ombined/studio-setup-fixes # Conflicts: # studio/setup.ps1
- setup.sh: Add || true guard to find command that checks frontend/src and frontend/public dirs, preventing script abort under set -euo pipefail when either directory is missing - colab.py: Use sys.path.insert(0, ...) instead of site.addsitedir() so Studio venv packages take priority over system copies. Add warning when venv is missing instead of silently failing. - transformers_version.py: _venv_t5_is_valid() now checks installed package versions via .dist-info metadata, not just directory presence. Prevents false positives from stale or wrong-version packages. - transformers_version.py: _install_to_venv_t5() now passes --upgrade so pip replaces existing stale packages in the target directory. - setup.ps1: CPU-only PyTorch install uses --index-url for cpu wheel and all install commands use Fast-Install (uv with pip fallback).
for more information, see https://pre-commit.ci
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request consolidates several improvements to the Unsloth Studio setup process, focusing on enhancing efficiency, reliability, and platform compatibility. It introduces intelligent caching for frontend builds to accelerate subsequent setups, strengthens virtual environment isolation to prevent conflicts with system Python installations, and extends support to Windows users without NVIDIA GPUs, enabling CPU-only operations. These changes collectively streamline the initial setup experience and broaden the accessibility of Unsloth Studio across diverse development environments. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request combines several significant improvements to the studio setup process, including frontend build caching, better venv isolation for Colab, and new CPU-only support for Windows. The setup scripts for both PowerShell and bash have been refactored for robustness and performance, notably by using uv for faster package installation. The validation logic for the transformers v5 environment is also much more thorough. I have one suggestion to further improve the robustness of this validation logic.
| pkg_version, | ||
| ) | ||
| return False | ||
| dist_info_found = True |
There was a problem hiding this comment.
The break statement here will cause the loop over dist-info directories to exit after the first iteration, regardless of whether a valid METADATA file was found. If glob returns multiple directories and the first one doesn't contain a METADATA file (e.g., a leftover from a failed installation), the check will incorrectly fail for this package because dist_info_found remains False.
Removing this break will make the validation more robust by allowing it to check all found dist-info directories until a valid one is processed.
Remove premature break that caused the loop over .dist-info directories to exit after the first match even if it had no METADATA file. Now continues iterating until a valid METADATA is found or all dirs are exhausted.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 15c93e68e8
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| dist_info_found = True | ||
| break | ||
| break | ||
| if not dist_info_found: | ||
| return False |
There was a problem hiding this comment.
Validate all dist-info matches before failing version check
pip install --target --upgrade can leave multiple *.dist-info directories for the same package, and this loop unconditionally breaks after the first match. That makes _venv_t5_is_valid() depend on filesystem iteration order: if an older metadata directory is visited first, the function reports the venv invalid even when the required version is present, causing repeated runtime reinstalls (or activation failures in offline environments). Keep scanning until a matching version is found instead of exiting after the first entry.
Useful? React with 👍 / 👎.
| _VENV_T5_PACKAGES = ( | ||
| f"transformers=={TRANSFORMERS_5_VERSION}", | ||
| "huggingface_hub==1.7.1", | ||
| "hf_xet==1.4.2", | ||
| ) |
There was a problem hiding this comment.
Make hf_xet optional outside supported architectures
Adding hf_xet==1.4.2 to _VENV_T5_PACKAGES makes it a hard requirement for every platform because _ensure_venv_t5_exists() fails if any package in this tuple is unavailable. On platforms where hf_xet is not distributed, transformers 5.x activation will now fail even though transformers and huggingface_hub would otherwise work, so this should be conditional or best-effort rather than mandatory.
Useful? React with 👍 / 👎.
setup.ps1: 6 locations changed from `| Out-Null` to `| Out-String` with output shown on failure -- PyTorch GPU/CPU install, Triton install, venv_t5 package loop, cmake llama-server and llama-quantize builds. transformers_version.py: clean stale .venv_t5 directory before reinstall when validation detects missing or version-mismatched packages.
for more information, see https://pre-commit.ci
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request is a substantial and well-executed effort to improve the setup process for Unsloth Studio. It successfully combines several important fixes, including frontend caching, better virtual environment isolation, and crucial CPU support for Windows users. The refactoring of environment management logic into centralized utility functions is a significant improvement for maintainability. The setup scripts are now more robust, faster (thanks to uv integration), and more user-friendly. The suggestions to further improve maintainability by reducing duplication of package versions across the setup scripts remain valid.
| foreach ($pkg in @("transformers==5.3.0", "huggingface_hub==1.7.1", "hf_xet==1.4.2")) { | ||
| $output = Fast-Install --target $VenvT5Dir --no-deps $pkg | Out-String | ||
| if ($LASTEXITCODE -ne 0) { | ||
| Write-Host "[FAIL] Could not install $pkg into .venv_t5/" -ForegroundColor Red | ||
| Write-Host $output -ForegroundColor Red | ||
| $ErrorActionPreference = $prevEAP_t5 | ||
| exit 1 | ||
| } | ||
| } |
There was a problem hiding this comment.
To improve maintainability and avoid version drift, consider sourcing the list of transformers 5.x packages directly from the Python source of truth (_VENV_T5_PACKAGES in transformers_version.py) instead of hardcoding them here. This ensures that any updates to the package list are automatically reflected in this script.
$packages_str = python -c "import sys; sys.path.insert(0, 'backend'); from utils.transformers_version import _VENV_T5_PACKAGES; print(' '.join(_VENV_T5_PACKAGES))"
$packages = $packages_str.Split(' ')
foreach ($pkg in $packages) {
if (-not $pkg) { continue }
$output = Fast-Install --target $VenvT5Dir --no-deps $pkg | Out-String
if ($LASTEXITCODE -ne 0) {
Write-Host "[FAIL] Could not install $pkg into .venv_t5/" -ForegroundColor Red
Write-Host $output -ForegroundColor Red
$ErrorActionPreference = $prevEAP_t5
exit 1
}
}
| run_quiet "install transformers 5.x" fast_install --target "$VENV_T5_DIR" --no-deps "transformers==5.3.0" | ||
| run_quiet "install huggingface_hub for t5" fast_install --target "$VENV_T5_DIR" --no-deps "huggingface_hub==1.7.1" | ||
| run_quiet "install hf_xet for t5" fast_install --target "$VENV_T5_DIR" --no-deps "hf_xet==1.4.2" |
There was a problem hiding this comment.
To improve maintainability and avoid version drift, consider sourcing the list of transformers 5.x packages directly from the Python source of truth (_VENV_T5_PACKAGES in transformers_version.py) instead of hardcoding them here. This ensures that any updates to the package list are automatically reflected in this script.
| run_quiet "install transformers 5.x" fast_install --target "$VENV_T5_DIR" --no-deps "transformers==5.3.0" | |
| run_quiet "install huggingface_hub for t5" fast_install --target "$VENV_T5_DIR" --no-deps "huggingface_hub==1.7.1" | |
| run_quiet "install hf_xet for t5" fast_install --target "$VENV_T5_DIR" --no-deps "hf_xet==1.4.2" | |
| VENV_T5_PACKAGES_STR=$(python -c "import sys; sys.path.insert(0, 'backend'); from utils.transformers_version import _VENV_T5_PACKAGES; print('\n'.join(_VENV_T5_PACKAGES))") | |
| readarray -t VENV_T5_PACKAGES <<< "$VENV_T5_PACKAGES_STR" | |
| run_quiet "install transformers 5.x" fast_install --target "$VENV_T5_DIR" --no-deps "${VENV_T5_PACKAGES[0]}" | |
| run_quiet "install huggingface_hub for t5" fast_install --target "$VENV_T5_DIR" --no-deps "${VENV_T5_PACKAGES[1]}" | |
| run_quiet "install hf_xet for t5" fast_install --target "$VENV_T5_DIR" --no-deps "${VENV_T5_PACKAGES[2]}" |
The backend uses bare "from utils.*" imports everywhere, relying on backend/ being on sys.path. Workers and routes add it at startup, but the CLI imports studio.backend.core as a package -- backend/ was never added. Add sys.path setup at the top of core/__init__.py so lazy imports resolve correctly regardless of entry point. Fixes: unsloth inference unsloth/Qwen3-8B "who are you" crashing with "No module named 'utils'"
The extension allowlist (*.json, *.ts, *.js, *.mjs, *.html) missed files like bun.lock, so lockfile-only dependency changes could skip the frontend rebuild. Check all top-level files instead.
Qwen models use tiktoken-based tokenizers which fail when routed through the transformers 5.x overlay without tiktoken installed. Add it to the setup scripts (with deps for Windows) and runtime fallback list. Integrates PR #4418.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a comprehensive set of improvements to the studio setup process, focusing on frontend build caching, better virtual environment isolation, and adding CPU-only support for Windows. The changes are well-structured, particularly the refactoring of dependency management to use uv for faster installations and centralizing the transformers version switching logic.
I've found two critical issues that need to be addressed:
- A bug in
studio/backend/utils/transformers_version.pythat will cause a crash when validating thetransformers 5.xvenv due to an unhandled package without a version specifier. - A syntax error in
studio/setup.ps1due to an extra closing brace.
Once these issues are fixed, this PR will significantly improve the robustness and user experience of the setup process.
| def _venv_t5_is_valid() -> bool: | ||
| """Return True if .venv_t5/ has all required packages at the correct versions.""" | ||
| if not os.path.isdir(_VENV_T5_DIR) or not os.listdir(_VENV_T5_DIR): | ||
| return False | ||
| # Check that the key package directories exist AND match the required version | ||
| for pkg_spec in _VENV_T5_PACKAGES: | ||
| pkg_name, pkg_version = pkg_spec.split("==") | ||
| pkg_name_norm = pkg_name.replace("-", "_") | ||
| # Check directory exists | ||
| if not any( | ||
| (Path(_VENV_T5_DIR) / d).is_dir() | ||
| for d in (pkg_name_norm, pkg_name_norm.replace("_", "-")) | ||
| ): | ||
| return False | ||
| # Check version via .dist-info metadata | ||
| dist_info_found = False | ||
| for di in Path(_VENV_T5_DIR).glob(f"{pkg_name_norm}-*.dist-info"): | ||
| metadata = di / "METADATA" | ||
| if not metadata.is_file(): | ||
| continue | ||
| for line in metadata.read_text(errors = "replace").splitlines(): | ||
| if line.startswith("Version:"): | ||
| installed_ver = line.split(":", 1)[1].strip() | ||
| if installed_ver != pkg_version: | ||
| logger.info( | ||
| ".venv_t5 has %s==%s but need %s", | ||
| pkg_name, | ||
| installed_ver, | ||
| pkg_version, | ||
| ) | ||
| return False | ||
| dist_info_found = True | ||
| break | ||
| if dist_info_found: | ||
| break | ||
| if not dist_info_found: | ||
| return False | ||
| return True |
There was a problem hiding this comment.
The function _venv_t5_is_valid will raise a ValueError when processing _VENV_T5_PACKAGES. The package tiktoken does not have a version specified with ==, which will cause pkg_spec.split('==') to fail.
The logic should be updated to handle packages without a version specifier. For such packages, only existence should be checked, not the version.
def _venv_t5_is_valid() -> bool:
"""Return True if .venv_t5/ has all required packages at the correct versions."""
if not os.path.isdir(_VENV_T5_DIR) or not os.listdir(_VENV_T5_DIR):
return False
# Check that the key package directories exist AND match the required version
for pkg_spec in _VENV_T5_PACKAGES:
pkg_parts = pkg_spec.split("==")
pkg_name = pkg_parts[0]
pkg_version = pkg_parts[1] if len(pkg_parts) > 1 else None
pkg_name_norm = pkg_name.replace("-", "_")
# Check directory exists
if not any(
(Path(_VENV_T5_DIR) / d).is_dir()
for d in (pkg_name_norm, pkg_name_norm.replace("_", "-"))
):
return False
# If no version is specified, we only check for existence.
if pkg_version is None:
continue
# Check version via .dist-info metadata
dist_info_found = False
for di in Path(_VENV_T5_DIR).glob(f"{pkg_name_norm}-*.dist-info"):
metadata = di / "METADATA"
if not metadata.is_file():
continue
for line in metadata.read_text(errors = "replace").splitlines():
if line.startswith("Version:"):
installed_ver = line.split(":", 1)[1].strip()
if installed_ver != pkg_version:
logger.info(
".venv_t5 has %s==%s but need %s",
pkg_name,
installed_ver,
pkg_version,
)
return False
dist_info_found = True
break
if dist_info_found:
break
if not dist_info_found:
return False
return True| exit 1 | ||
| Write-Host "[WARN] Could not install tiktoken into .venv_t5/ -- Qwen tokenizers may fail" -ForegroundColor Yellow | ||
| } | ||
| } |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 27df4666e6
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| return False | ||
| # Check that the key package directories exist AND match the required version | ||
| for pkg_spec in _VENV_T5_PACKAGES: | ||
| pkg_name, pkg_version = pkg_spec.split("==") |
There was a problem hiding this comment.
Handle unpinned package specs in venv validation
_venv_t5_is_valid() assumes every entry in _VENV_T5_PACKAGES contains ==, but this commit adds "tiktoken" without a pinned version. The unpack at pkg_name, pkg_version = pkg_spec.split("==") raises ValueError, so any call path that checks transformers 5.x readiness (e.g. _ensure_venv_t5_exists() from worker activation) will crash instead of validating/installing the environment.
Useful? React with 👍 / 👎.
| exit 1 | ||
| Write-Host "[WARN] Could not install tiktoken into .venv_t5/ -- Qwen tokenizers may fail" -ForegroundColor Yellow | ||
| } | ||
| } |
There was a problem hiding this comment.
Remove stray closing brace that breaks setup.ps1 parsing
There is an extra standalone } after the tiktoken warning block, which leaves the script with unbalanced braces. PowerShell parses the whole script before execution, so this prevents setup.ps1 from running at all on Windows (including CPU-only setup paths introduced by this change).
Useful? React with 👍 / 👎.
_venv_t5_is_valid() crashed with ValueError on unpinned packages like "tiktoken" (no ==version). Handle by splitting safely and skipping version check for unpinned packages (existence check only). Also remove stray closing brace in setup.ps1 tiktoken install block.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request is a significant and well-executed effort to improve the Unsloth Studio setup process. It successfully combines three major improvements: frontend build caching, robust virtual environment isolation (especially for Colab), and support for CPU-only Windows setups. The additional bug fixes and reliability enhancements, such as the improved nvidia-smi and cmake discovery in the PowerShell script and the more robust .venv_t5 validation, demonstrate great attention to detail. The introduction of uv for faster package installation is also a welcome performance boost.
My review focuses on further improving the debuggability of the new Fast-Install helper functions in both setup.ps1 and setup.sh. By logging the error output from uv when it fails, users will have a clearer understanding of why the setup process is falling back to the slower pip installer.
Overall, these changes make the setup process faster, more reliable, and more accessible to a wider range of users.
| if ($UseUv) { | ||
| $VenvPy = (Get-Command python).Source | ||
| $result = & uv pip install --python $VenvPy @Args_ 2>&1 | ||
| if ($LASTEXITCODE -eq 0) { return } | ||
| } |
There was a problem hiding this comment.
The Fast-Install function provides a great performance improvement by using uv. However, if the uv command fails, it silently falls back to pip without logging the error from uv. This can make it difficult to diagnose issues with the uv installation or environment. It would be beneficial to log the output from uv upon failure.
if ($UseUv) {
$VenvPy = (Get-Command python).Source
$result = & uv pip install --python $VenvPy @Args_ 2>&1
if ($LASTEXITCODE -eq 0) { return }
Write-Host " [WARN] uv install failed, falling back to pip. uv output:" -ForegroundColor Yellow
Write-Host $result -ForegroundColor Yellow
}
| if [ "$USE_UV" = true ]; then | ||
| uv pip install --python "$(command -v python)" "$@" && return 0 | ||
| fi |
There was a problem hiding this comment.
The fast_install helper function improves installation speed by prioritizing uv. However, when the uv command fails, it silently falls back to pip. For better debuggability, it would be helpful to print the error output from uv so the user understands why the fallback to pip occurred.
| if [ "$USE_UV" = true ]; then | |
| uv pip install --python "$(command -v python)" "$@" && return 0 | |
| fi | |
| if [ "$USE_UV" = true ]; then | |
| if output=$(uv pip install --python "$(command -v python)" "$@" 2>&1); then | |
| return 0 | |
| else | |
| echo " ⚠️ uv install failed, falling back to pip. uv output:" | |
| echo "$output" | |
| fi | |
| fi |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 721577b65b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| f"transformers=={TRANSFORMERS_5_VERSION}", | ||
| "huggingface_hub==1.7.1", | ||
| "hf_xet==1.4.2", | ||
| "tiktoken", |
There was a problem hiding this comment.
Do not require tiktoken for all transformers 5 environments
_VENV_T5_PACKAGES now makes tiktoken mandatory for .venv_t5 validity, but setup intentionally treats tiktoken as best-effort on Windows (setup.ps1 warns and continues if install fails). In that scenario, _venv_t5_is_valid() will always return false and _ensure_venv_t5_exists() will rebuild/fail, so loading any model that needs transformers 5.x fails entirely (not just Qwen tokenizers). This turns an optional dependency failure into a hard activation failure.
Useful? React with 👍 / 👎.
…_BUILD=true (#4427) * revert: remove frontend build caching from setup scripts The mtime-based caching introduced in #4404/#4413 can incorrectly skip frontend builds -- e.g. after git pull when filesystem timestamps are not preserved, or after our Tailwind v4 discovery that the site-packages .gitignore must be hidden before vite build (which the cached path doesn't handle). Always rebuild the frontend on setup. The build takes ~15s and is safer than risking a stale dist/. * revert: disable frontend build caching, keep code commented out Caching disabled by always setting _NEED_FRONTEND_BUILD=true. The mtime-based logic is preserved in comments for future re-enabling. Reasons for disabling: - Git does not preserve file timestamps, so cached dist/ can appear newer than freshly checked-out source after a pull - Tailwind v4 requires hiding site-packages/.gitignore before vite build; the cache path bypasses this, producing broken CSS * revert: always rebuild frontend, remove mtime caching * revert: always rebuild frontend, override caching with _NEED_FRONTEND_BUILD=true
… CPU support (unslothai#4413) * Allow Windows setup to complete without NVIDIA GPU setup.ps1 previously hard-exited if nvidia-smi was not found, blocking setup entirely on CPU-only or non-NVIDIA machines. The backend already supports CPU and MLX (Apple Silicon) in chat-only GGUF mode, and the Linux/Mac setup.sh handles missing GPUs gracefully. Changes: - Convert the GPU check from a hard exit to a warning - Guard CUDA toolkit installation behind $HasNvidiaSmi - Install CPU-only PyTorch when no GPU is detected - Build llama.cpp without CUDA flags when no GPU is present - Update doc comment to reflect CPU support * Cache frontend build across setup runs Skip the frontend npm install + build if frontend/dist already exists. Previously setup.ps1 nuked node_modules and package-lock.json on every run, and both scripts always rebuilt even when dist/ was already present. On a git clone editable install, the first setup run still builds the frontend as before. Subsequent runs skip it, saving several minutes. To force a rebuild, delete frontend/dist and re-run setup. * Show pip progress for PyTorch download on Windows The torch CUDA wheel is ~2.8 GB and the CPU wheel is ~300 MB. With | Out-Null suppressing all output, the install appeared completely frozen with no feedback. Remove | Out-Null for the torch install lines so pip's download progress bar is visible. Add a size hint so users know the download is expected to take a while. Also moves the Triton success message inside the GPU branch so it only prints when Triton was actually installed. * Guard CUDA env re-sanitization behind GPU check in llama.cpp build The CUDA_PATH re-sanitization block (lines 1020-1033) references $CudaToolkitRoot which is only set when $HasNvidiaSmi is true and the CUDA Toolkit section runs. On CPU-only machines, $CudaToolkitRoot is null, causing Split-Path to throw: Split-Path : Cannot bind argument to parameter 'Path' because it is null. Wrap the entire block in `if ($HasNvidiaSmi -and $CudaToolkitRoot)`. * Rebuild frontend when source files are newer than dist/ Instead of only checking if dist/ exists, compare source file timestamps against the dist/ directory. If any file in frontend/src/ is newer than dist/, trigger a rebuild. This handles the case where a developer pulls new frontend changes and re-runs setup -- stale assets get rebuilt automatically. * Fix cmake not found on Windows after winget install Two issues fixed: 1. After winget installs cmake, Refresh-Environment may not pick up the new PATH entry (MSI PATH changes sometimes need a new shell). Added a fallback that probes cmake's default install locations (Program Files, LocalAppData) and adds the directory to PATH explicitly if found. 2. If cmake is still unavailable when the llama.cpp build starts (e.g. winget failed silently or PATH was not updated), the build now skips gracefully with a [SKIP] warning instead of crashing with "cmake : The term 'cmake' is not recognized". * Fix frontend rebuild detection and decouple oxc-validator install Address review feedback: - Check entire frontend/ directory for changes, not just src/. The build also depends on package.json, vite.config.ts, tailwind.config.ts, public/, and other config files. A change to any of these now triggers a rebuild. - Move oxc-validator npm install outside the frontend build gate in setup.sh so it always runs on setup, matching setup.ps1 which already had it outside the gate. * Show cmake errors on failure and retry CUDA VS integration with elevation Two fixes for issue unslothai#4405 (Windows setup fails at cmake configure): 1. cmake configure: capture output and display it on failure instead of piping to Out-Null. When the error mentions "No CUDA toolset found", print a hint about the CUDA VS integration files. 2. CUDA VS integration copy: when the direct Copy-Item fails (needs admin access to write to Program Files), retry with Start-Process -Verb RunAs to prompt for elevation. This is the root cause of the "No CUDA toolset found" cmake failure -- the .targets files that let MSBuild compile .cu files are missing from the VS BuildCustomizations directory. * Address reviewer feedback: cmake PATH persistence, stale cache, torch error check 1. Persist cmake PATH to user registry so Refresh-Environment cannot drop it later in the same setup run. Previously the process-only PATH addition at phase 1 could vanish when Refresh-Environment rebuilt PATH from registry during phase 2/3 installs. 2. Clean stale CMake cache before configure. If a previous run built with CUDA and the user reruns without a GPU (or vice versa), the cached GGML_CUDA value would persist. Now the build dir is removed before configure. 3. Explicitly set -DGGML_CUDA=OFF for CPU-only builds instead of just omitting CUDA flags. This prevents cmake from auto-detecting a partial CUDA installation. 4. Fix CUDA cmake flag indentation -- was misaligned from the original PR, now consistently indented inside the if/else block. 5. Fail hard if pip install torch returns a non-zero exit code instead of silently continuing with a broken environment. * Remove extra CUDA cmake flags to align Windows with Linux build Drop GGML_CUDA_FA_ALL_QUANTS, GGML_CUDA_F16, GGML_CUDA_GRAPHS, GGML_CUDA_FORCE_CUBLAS, and GGML_CUDA_PEER_MAX_BATCH_SIZE flags. The Linux build in setup.sh only sets GGML_CUDA=ON and lets llama.cpp use its defaults for everything else. Keep Windows consistent. * Address reviewer round 2: GPU probe fallback, Triton check, stale binary rebuild 1. GPU detection: fallback to default nvidia-smi install locations (Program Files\NVIDIA Corporation\NVSMI, System32) when nvidia-smi is not on PATH. Prevents silent CPU-only provisioning on machines that have a GPU but a broken PATH. 2. Triton: check $LASTEXITCODE after pip install and print [WARN] on failure instead of unconditional [OK]. 3. Stale llama-server: check CMakeCache.txt for GGML_CUDA setting and rebuild if the existing binary does not match the current GPU mode (e.g. CUDA binary on a now-CPU-only rerun, or vice versa). * Fix frontend rebuild detection and npm dependency issues Addresses reviewer feedback on the frontend caching logic: 1. setup.sh: Fix broken find command that caused exit under pipefail. The piped `find | xargs find -newer` had paths after the expression which GNU find rejects. Replaced with a simpler `find -maxdepth 1 -type f -newer dist/` that checks ALL top-level files (catches index.html, bun.lock, etc. that the extension allowlist missed). 2. setup.sh: Guard oxc-validator npm install behind `command -v npm` check. When the frontend build is skipped (dist/ is cached), Node bootstrap is also skipped, so npm may not be available. 3. setup.ps1: Replace Get-ChildItem -Include with explicit path probing for src/ and public/. PowerShell's -Include without a trailing wildcard silently returns nothing, so src/public changes were never detected. Also check ALL top-level files instead of just .json/.ts/.js/.mjs extensions. * Fix studio setup: venv isolation, centralized .venv_t5, uv targeting - All platforms (including Colab) now create ~/.unsloth/studio/.venv with --without-pip fallback for broken ensurepip environments - Add --python sys.executable to uv pip install in install_python_stack.py so uv targets the correct venv instead of system Python - Centralize .venv_t5 bootstrap in transformers_version.py with proper validation (checks required packages exist, not just non-empty dir) - Replace ~150 lines of duplicated install code across 3 worker files with calls to the shared _ensure_venv_t5_exists() helper - Use uv-if-present with pip fallback; do not install uv at runtime - Add site.addsitedir() shim in colab.py so notebook cells can import studio packages from the venv without system-Python double-install - Update .venv_t5 packages: huggingface_hub 1.3.0->1.7.1, add hf_xet - Bump transformers pin 4.57.1->4.57.6 in requirements + constraints - Add Fast-Install helper to setup.ps1 with uv+pip fallback - Keep Colab-specific completion banner in setup.sh * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix nvidia-smi PATH persistence and cmake requirement for CPU-only 1. Store nvidia-smi as an absolute path ($NvidiaSmiExe) on first detection. All later calls (Get-CudaComputeCapability, Get-PytorchCudaTag, CUDA toolkit detection) use this absolute path instead of relying on PATH. This survives Refresh-Environment which rebuilds PATH from the registry and drops process-only additions. 2. Make cmake fatal for CPU-only installs. CPU-only machines depend entirely on llama-server for GGUF chat mode, so reporting "Setup Complete!" without it is misleading. GPU machines can still skip the llama-server build since they have other inference paths. * Fix broken frontend freshness detection in setup scripts - setup.sh: Replace broken `find | xargs find -newer` pipeline with single `find ... -newer` call. The old pipeline produced "paths must precede expression" errors (silently suppressed by 2>/dev/null), causing top-level config changes to never trigger a rebuild. - setup.sh: Add `command -v npm` guard to oxc-validator block so it does not fail when Node was not installed (build-skip path). - setup.ps1: Replace `Get-ChildItem -Include` (unreliable without -Recurse on PS 5.1) with explicit directory paths for src/ and public/ scanning. - Both: Add *.html to tracked file patterns so index.html (Vite entry point) changes trigger a rebuild. - Both: Use -print -quit instead of piping to head -1 for efficiency. * Fix bugs found during review of PRs unslothai#4404, unslothai#4400, unslothai#4399 - setup.sh: Add || true guard to find command that checks frontend/src and frontend/public dirs, preventing script abort under set -euo pipefail when either directory is missing - colab.py: Use sys.path.insert(0, ...) instead of site.addsitedir() so Studio venv packages take priority over system copies. Add warning when venv is missing instead of silently failing. - transformers_version.py: _venv_t5_is_valid() now checks installed package versions via .dist-info metadata, not just directory presence. Prevents false positives from stale or wrong-version packages. - transformers_version.py: _install_to_venv_t5() now passes --upgrade so pip replaces existing stale packages in the target directory. - setup.ps1: CPU-only PyTorch install uses --index-url for cpu wheel and all install commands use Fast-Install (uv with pip fallback). * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix _venv_t5_is_valid dist-info loop exiting after first directory Remove premature break that caused the loop over .dist-info directories to exit after the first match even if it had no METADATA file. Now continues iterating until a valid METADATA is found or all dirs are exhausted. * Capture error output on failure instead of discarding with Out-Null setup.ps1: 6 locations changed from `| Out-Null` to `| Out-String` with output shown on failure -- PyTorch GPU/CPU install, Triton install, venv_t5 package loop, cmake llama-server and llama-quantize builds. transformers_version.py: clean stale .venv_t5 directory before reinstall when validation detects missing or version-mismatched packages. * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix ModuleNotFoundError when CLI imports studio.backend.core The backend uses bare "from utils.*" imports everywhere, relying on backend/ being on sys.path. Workers and routes add it at startup, but the CLI imports studio.backend.core as a package -- backend/ was never added. Add sys.path setup at the top of core/__init__.py so lazy imports resolve correctly regardless of entry point. Fixes: unsloth inference unsloth/Qwen3-8B "who are you" crashing with "No module named 'utils'" * Fix frontend freshness check to detect all top-level file changes The extension allowlist (*.json, *.ts, *.js, *.mjs, *.html) missed files like bun.lock, so lockfile-only dependency changes could skip the frontend rebuild. Check all top-level files instead. * Add tiktoken to .venv_t5 for Qwen-family tokenizers Qwen models use tiktoken-based tokenizers which fail when routed through the transformers 5.x overlay without tiktoken installed. Add it to the setup scripts (with deps for Windows) and runtime fallback list. Integrates PR unslothai#4418. * Fix tiktoken crash in _venv_t5_is_valid and stray brace in setup.ps1 _venv_t5_is_valid() crashed with ValueError on unpinned packages like "tiktoken" (no ==version). Handle by splitting safely and skipping version check for unpinned packages (existence check only). Also remove stray closing brace in setup.ps1 tiktoken install block. --------- Co-authored-by: Daniel Han <danielhanchen@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
…_BUILD=true (unslothai#4427) * revert: remove frontend build caching from setup scripts The mtime-based caching introduced in unslothai#4404/unslothai#4413 can incorrectly skip frontend builds -- e.g. after git pull when filesystem timestamps are not preserved, or after our Tailwind v4 discovery that the site-packages .gitignore must be hidden before vite build (which the cached path doesn't handle). Always rebuild the frontend on setup. The build takes ~15s and is safer than risking a stale dist/. * revert: disable frontend build caching, keep code commented out Caching disabled by always setting _NEED_FRONTEND_BUILD=true. The mtime-based logic is preserved in comments for future re-enabling. Reasons for disabling: - Git does not preserve file timestamps, so cached dist/ can appear newer than freshly checked-out source after a pull - Tailwind v4 requires hiding site-packages/.gitignore before vite build; the cache path bypasses this, producing broken CSS * revert: always rebuild frontend, remove mtime caching * revert: always rebuild frontend, override caching with _NEED_FRONTEND_BUILD=true
Summary
Combines three studio setup PRs into one cohesive change, with additional bug fixes found during review:
Bug fixes applied on top
|| trueguard onfindcommand checkingfrontend/srcandfrontend/publicdirs -- prevents script abort underset -euo pipefailwhen either directory is missingsite.addsitedir()tosys.path.insert(0, ...)so Studio venv packages take priority over system copies; added warning when venv is missing_venv_t5_is_valid()now checks installed package versions via.dist-infometadata, not just directory presence -- prevents false positives from stale packages_install_to_venv_t5()now passes--upgradeso pip replaces existing stale packages--index-url https://download.pytorch.org/whl/cputo get the correct wheel; all install commands useFast-Install(uv with pip fallback)Files changed
setup.ps1,setup.shinstall_python_stack.pybackend/colab.pybackend/utils/transformers_version.pybackend/core/{export,inference,training}/worker.pyextras-no-deps.txt,single-env/constraints.txtTest plan
unsloth studio setupon Linux creates~/.unsloth/studio/.venvwith all depsfrontend/dist/triggers a fresh buildpip install unsloth(non-editable) still skips correctlyfrom colab import startworks, packages importable.venv_t5with stale package versions is detected and rebuilt