Merged
19 changes: 19 additions & 0 deletions studio/setup.sh
@@ -582,11 +582,30 @@ _LLAMA_CPP_DEGRADED=false
_LLAMA_FORCE_COMPILE="${UNSLOTH_LLAMA_FORCE_COMPILE:-0}"
_REQUESTED_LLAMA_TAG="${UNSLOTH_LLAMA_TAG:-${_DEFAULT_LLAMA_TAG}}"
_HOST_SYSTEM="$(uname -s 2>/dev/null || true)"
_HOST_MACHINE="$(uname -m 2>/dev/null || true)"

# Pick the release repo install_llama_prebuilt.py plans against.
# unslothai/llama.cpp ships only Linux CUDA bundles, so CPU-only Linux
# x86_64 routes to ggml-org for bin-ubuntu-x64.tar.gz. Anything with a
# GPU tool installed stays on unslothai (CUDA bundle / ROCm source build).
_LINUX_HAS_GPU=false
for _GPU_TOOL in nvidia-smi rocminfo amd-smi hipconfig hipinfo; do

P2: Include compiler probes in GPU routing

On Linux x86_64 hosts where the CUDA/ROCm compiler is installed but these runtime utilities are not on PATH (for example, nvcc under /usr/local/cuda/bin or hipcc under /opt/rocm/bin), this loop leaves _LINUX_HAS_GPU=false, so the new branch routes to ggml-org and installs the upstream CPU tarball successfully. That success suppresses the existing source-build path later in this same script, which explicitly checks those compiler locations and enables -DGGML_CUDA=ON / -DGGML_HIP=ON. As a result, those environments silently lose GPU-enabled llama.cpp instead of building it as before.
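
One way to address this could look like the sketch below. It is not the script's actual probe: the helper name _linux_has_gpu_toolchain is hypothetical, and the two default compiler paths are taken only from the examples in this comment, so the real source-build path may check additional locations.

```shell
# Hypothetical helper (not part of setup.sh): returns 0 if a GPU runtime
# utility or compiler is on PATH, or if a compiler exists at a known path.
_linux_has_gpu_toolchain() {
    # Runtime utilities from the original loop, plus the compilers.
    for _tool in nvidia-smi rocminfo amd-smi hipconfig hipinfo nvcc hipcc; do
        if command -v "$_tool" >/dev/null 2>&1; then
            return 0
        fi
    done
    # Compiler locations to check even when they are not on PATH.
    # Assumption: defaults mirror the examples given in the review comment.
    if [ "$#" -eq 0 ]; then
        set -- /usr/local/cuda/bin/nvcc /opt/rocm/bin/hipcc
    fi
    for _path in "$@"; do
        if [ -x "$_path" ]; then
            return 0
        fi
    done
    return 1
}

# Possible use in place of the loop:
#   _LINUX_HAS_GPU=false
#   if _linux_has_gpu_toolchain; then _LINUX_HAS_GPU=true; fi
```

Passing explicit paths as arguments overrides the defaults, which keeps the helper easy to exercise without a real CUDA or ROCm install.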


    if command -v "$_GPU_TOOL" >/dev/null 2>&1; then
        _LINUX_HAS_GPU=true
Comment on lines +593 to +594

P2: Probe usable GPUs instead of tool presence

On Linux x86_64 environments that are effectively CPU-only but still have GPU utilities on PATH (for example, CUDA-based Docker images run without --gpus, or hosts where CUDA_VISIBLE_DEVICES hides all devices), this command -v nvidia-smi check routes setup back to unslothai/llama.cpp. The Python installer already identifies this case as has_usable_nvidia=false, but with the unslothai repo it then scans CUDA-only Linux assets and falls back to a source build, so the new CPU prebuilt fast path is skipped for exactly these CPU-only installs. Please make this gate use the same active GPU probing semantics as detect_host(), or defer the routing until after that detection.
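
A rough sketch of what an active probe could look like on the shell side follows. The helper name _has_usable_nvidia is hypothetical, and the logic is an assumption about what "usable" means here; it is not a copy of detect_host(), whose exact checks are not shown in this diff. It relies on nvidia-smi -L printing one "GPU <n>: ..." line per device, and treats an empty or -1 CUDA_VISIBLE_DEVICES as hiding every device.

```shell
# Hypothetical helper (not part of setup.sh): succeeds only when
# nvidia-smi can enumerate at least one device and CUDA_VISIBLE_DEVICES
# does not hide all of them.
_has_usable_nvidia() {
    # Assumption: an explicitly empty or "-1" value hides every device.
    if [ "${CUDA_VISIBLE_DEVICES+set}" = set ]; then
        case "$CUDA_VISIBLE_DEVICES" in
            ""|-1) return 1 ;;
        esac
    fi
    # Tool presence is necessary but not sufficient.
    command -v nvidia-smi >/dev/null 2>&1 || return 1
    # The pipeline's status is grep's: 0 only if a device line was printed.
    nvidia-smi -L 2>/dev/null | grep -q '^GPU '
}

# Possible use in the gate:
#   if _has_usable_nvidia; then _LINUX_HAS_GPU=true; fi
```

In a CUDA image started without --gpus, nvidia-smi typically fails to reach the driver and prints no device line, so the helper reports no usable GPU. An equivalent probe for ROCm tools would still be needed and is left out of this sketch.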


        break
    fi
done

if [ "$_HOST_SYSTEM" = "Darwin" ]; then
    _HELPER_RELEASE_REPO="ggml-org/llama.cpp"
elif [ "$_HOST_SYSTEM" = "Linux" ] \
    && [ "$_HOST_MACHINE" = "x86_64" ] \
    && [ "$_LINUX_HAS_GPU" = false ]; then
    _HELPER_RELEASE_REPO="ggml-org/llama.cpp"
else
    _HELPER_RELEASE_REPO="unslothai/llama.cpp"
fi
Comment on lines 599 to 607
Contributor


medium

To avoid repeating the assignment to _HELPER_RELEASE_REPO, you can combine the Darwin and CPU-only Linux conditions into a single if block. The [[ ... ]] construct is Bash-specific, but it reads more clearly than chained [ ... ] tests for compound conditions.

Suggested change
- if [ "$_HOST_SYSTEM" = "Darwin" ]; then
-     _HELPER_RELEASE_REPO="ggml-org/llama.cpp"
- elif [ "$_HOST_SYSTEM" = "Linux" ] \
-     && [ "$_HOST_MACHINE" = "x86_64" ] \
-     && [ "$_LINUX_HAS_GPU" = false ]; then
-     _HELPER_RELEASE_REPO="ggml-org/llama.cpp"
- else
-     _HELPER_RELEASE_REPO="unslothai/llama.cpp"
- fi
+ if [[ "$_HOST_SYSTEM" == "Darwin" || ( "$_HOST_SYSTEM" == "Linux" && "$_HOST_MACHINE" == "x86_64" && "$_LINUX_HAS_GPU" == false ) ]]; then
+     _HELPER_RELEASE_REPO="ggml-org/llama.cpp"
+ else
+     _HELPER_RELEASE_REPO="unslothai/llama.cpp"
+ fi
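
If the script needs to stay runnable under a plain POSIX sh (an assumption; the shebang is not visible in this diff), the same single-assignment shape can be had without [[ ... ]]. The sketch below wraps the routing in a hypothetical function, _pick_release_repo, which is not part of setup.sh; it takes the system, machine, and GPU flag as arguments so the decision can be checked in isolation.

```shell
# Hypothetical helper (not part of setup.sh): prints the release repo for
# a given host. $1 = uname -s, $2 = uname -m, $3 = "true" or "false".
_pick_release_repo() {
    # macOS, or CPU-only Linux x86_64, uses the upstream prebuilt bundles.
    if [ "$1" = "Darwin" ] \
        || { [ "$1" = "Linux" ] && [ "$2" = "x86_64" ] && [ "$3" = false ]; }; then
        printf '%s\n' "ggml-org/llama.cpp"
    else
        printf '%s\n' "unslothai/llama.cpp"
    fi
}

# Possible use:
#   _HELPER_RELEASE_REPO="$(_pick_release_repo "$_HOST_SYSTEM" "$_HOST_MACHINE" "$_LINUX_HAS_GPU")"
```

The braces group the three Linux tests so that || and && combine the same way as the parenthesized expression in the suggestion above.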

unset _GPU_TOOL
_LLAMA_PR="${UNSLOTH_LLAMA_PR:-}"
_SKIP_PREBUILT_INSTALL=false
_LLAMA_PR_FORCE="${UNSLOTH_LLAMA_PR_FORCE:-${_DEFAULT_LLAMA_PR_FORCE}}"