
docker/unified: add cuda based unified container#597

Merged
mostlygeek merged 18 commits into main from claude/unified-docker-container-btUU9 on Mar 22, 2026

Conversation

@mostlygeek
Owner

No description provided.

mostlygeek and others added 17 commits March 17, 2026 10:54
Build llama-swap binary from local source code instead of downloading
from GitHub releases. This ensures the container uses the exact code
in the repository.

- Add golang:1.25-alpine builder stage to compile llama-swap
- Generate version from git hash with +dirty suffix for unstaged changes
- Update build-image.sh to use repository root as build context
- Remove LLAMA_SWAP_VERSION environment variable and related code
- Add test-binaries.sh to Dockerfile.vulkan for consistency

Both CUDA and Vulkan Dockerfiles now build llama-swap from source.
Replace Dockerfile.cuda and Dockerfile.vulkan with a single
Dockerfile that uses BuildKit conditional FROM stages to select
the GPU backend via --build-arg BACKEND=cuda|vulkan.

- Eliminate ~40% code duplication between backend Dockerfiles
- Scope BuildKit cache IDs by backend to prevent cross-contamination
- Add binary validation that fails the build if binaries are missing
- Add version tracking and convenience symlinks from Vulkan file
- Update test-binaries.sh to auto-detect both CUDA and Vulkan
- Update build-image.sh to pass BACKEND arg to unified Dockerfile
- Document how to add new server projects in Dockerfile header

https://claude.ai/code/session_01Fin2hBgPifbgF8H9kWyEvX
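
The conditional-FROM selection described above can be sketched roughly like this; the stage names and base image tags here are illustrative assumptions, not the exact ones in the PR:

```dockerfile
# BACKEND chooses which base stage the builder inherits from.
ARG BACKEND=cuda

FROM nvidia/cuda:12.4.0-devel-ubuntu22.04 AS base-cuda
FROM ubuntu:22.04 AS base-vulkan

# BuildKit resolves base-${BACKEND} at build time, so only the selected
# base stage is actually pulled and built.
FROM base-${BACKEND} AS builder
```

Invoked with, e.g., `docker buildx build --build-arg BACKEND=vulkan .`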
Move all cmake flag selection and build logic from inline Dockerfile
RUN commands into a single install.sh script. Each build stage is
now just: COPY source, COPY install.sh, RUN install.sh $BACKEND $PROJECT.

Adding a new server project means adding a case block to install.sh
with the project-specific cmake flags and targets.

https://claude.ai/code/session_01Fin2hBgPifbgF8H9kWyEvX
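
The per-project dispatch can be sketched as below; the flag sets and project names are illustrative examples, not the exact contents of the PR's install.sh:

```shell
#!/bin/sh
# Sketch of install.sh's case-block dispatch: one case per server
# project, selecting cmake flags by backend. Flag values are examples.
select_cmake_flags() {
  backend="$1"   # cuda | vulkan
  project="$2"   # llama | whisper | sd
  case "$project" in
    llama)
      if [ "$backend" = cuda ]; then echo "-DGGML_CUDA=ON"; else echo "-DGGML_VULKAN=ON"; fi
      ;;
    sd)
      if [ "$backend" = cuda ]; then echo "-DSD_CUDA=ON -DSD_BUILD_EXAMPLES=ON"; else echo "-DSD_VULKAN=ON -DSD_BUILD_EXAMPLES=ON"; fi
      ;;
    *)
      echo "unknown project: $project" >&2
      return 1
      ;;
  esac
}
```

Adding a new server project then means adding one more case arm with that project's flags and build targets.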
Strip section divider banners and inline commentary from Dockerfile
and install.sh. Keep only the header usage docs and short stage labels.

https://claude.ai/code/session_01Fin2hBgPifbgF8H9kWyEvX
Use nvidia/cuda:12.4.0-devel as the base and layer the Vulkan SDK on
top. One builder image has everything needed for both backends,
eliminating the conditional FROM and duplicate package installs.

https://claude.ai/code/session_01Fin2hBgPifbgF8H9kWyEvX
Download prebuilt vulkan binaries from GitHub releases instead of
building from source. This significantly speeds up vulkan image builds.

- install.sh downloads release archives for vulkan llama/sd builds
- build-image.sh resolves latest release tags for vulkan builds
- whisper.cpp still builds from source (no prebuilt vulkan releases)

https://claude.ai/code/session_01Fin2hBgPifbgF8H9kWyEvX
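
Resolving the latest release tag could look like the sketch below; the helper names and the jq-free parsing are assumptions, not code from the PR:

```shell
#!/bin/sh
# Extract "tag_name" from a GitHub releases API JSON response without jq.
parse_tag_name() {
  grep -o '"tag_name": *"[^"]*"' | head -n 1 | sed 's/.*"\(.*\)"/\1/'
}

# Query the /releases/latest endpoint for a repo, e.g. ggml-org/llama.cpp.
latest_release_tag() {
  repo="$1"
  curl -fsSL "https://api.github.com/repos/${repo}/releases/latest" | parse_tag_name
}
```

The resolved tag is then used to build the release-archive download URL for the prebuilt vulkan binaries.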
Add a new docker-unified/ directory with per-project install scripts
for building a unified CUDA container with llama.cpp, whisper.cpp,
stable-diffusion.cpp, and llama-swap.

- install-whisper.sh, install-sd.sh, install-llama.sh: self-contained
  scripts that clone, build with CUDA, and install each project
- install-llama-swap.sh: downloads latest release binary from GitHub
- Dockerfile: multi-stage build with independent cached stages per
  project and a runtime-only final image
- build-image.sh: orchestrates builds with auto-detected commit hashes

https://claude.ai/code/session_01WUhLk2q6gSKxptSz8vnL8k
Let buildx run stages in parallel by default instead of
forcing sequential execution with max-parallelism=1.

https://claude.ai/code/session_01WUhLk2q6gSKxptSz8vnL8k
Rename LLAMA_COMMIT_HASH/WHISPER_COMMIT_HASH/SD_COMMIT_HASH env vars
to LLAMA_REF/WHISPER_REF/SD_REF. These now accept commit hashes, tags,
or branch names. A new resolve_ref() function queries the remote to
resolve tags and branches to commit hashes before passing them to the
Dockerfile.

https://claude.ai/code/session_01WUhLk2q6gSKxptSz8vnL8k
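
A minimal sketch of the resolve_ref() idea, assuming this shape (the PR's version may handle more cases): a full 40-character hash passes through unchanged, while tags and branches are resolved against the remote with git ls-remote.

```shell
#!/bin/sh
# Resolve a ref (tag, branch, or full hash) to a commit hash.
resolve_ref() {
  repo_url="$1"; ref="$2"
  # Full commit hashes need no remote lookup.
  if printf '%s' "$ref" | grep -Eq '^[0-9a-f]{40}$'; then
    echo "$ref"
    return 0
  fi
  # Note: annotated tags resolve to the tag object here; the real
  # script may peel them to the underlying commit.
  git ls-remote "$repo_url" "refs/tags/${ref}" "refs/heads/${ref}" \
    | head -n 1 | cut -f1
}
```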
Fix several build failures discovered during first end-to-end run:

- Fix git clone failure when Docker cache mount pre-creates the source
  directory; switch to init-based clone (git init + fetch + checkout)
- Fix cmake flag word-splitting by converting CMAKE_FLAGS strings to
  bash arrays, preserving quoting for flags with spaces
- Add short commit hash resolution to resolve_ref() via git ls-remote
  prefix matching and GitHub API fallback
- Remove sd-cli and sd-server from the build; stable-diffusion.cpp now
  builds the shared library only (SD_BUILD_EXAMPLES=OFF)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
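
The first two fixes can be sketched as follows; variable and function names are illustrative, not taken from the PR:

```shell
#!/bin/bash
# Init-based clone: `git clone` fails when the Docker cache mount has
# already created the target directory, but init + fetch + checkout
# works in a pre-existing (even non-empty) directory.
clone_at_ref() {
  src_dir="$1"; repo_url="$2"; ref="$3"
  git -C "$src_dir" init -q
  git -C "$src_dir" fetch -q --depth 1 "$repo_url" "$ref"
  git -C "$src_dir" checkout -q FETCH_HEAD
}

# Word-splitting fix: hold cmake flags in a bash array so a flag that
# itself contains spaces stays a single argument when expanded.
cmake_flags=(-DSD_BUILD_EXAMPLES=OFF "-DCMAKE_CUDA_FLAGS=-O2 -lineinfo")
# cmake -B build "${cmake_flags[@]}"
```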
…able refs

- Require 7+ hex chars before trying the GitHub API (GitHub returns 422
  for shorter hashes, causing silent fallthrough to the unresolved ref)
- Fix grep to not require surrounding quotes when extracting the SHA
- Fail with a clear error instead of silently passing the unresolved ref
  to Docker (which then fails with a cryptic git error inside the build)
- Add || exit 1 to all resolve_ref call sites so the script exits cleanly

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ver)

SD_BUILD_EXAMPLES=ON is required to build sd-cli and sd-server binaries.
Restore the binary copy, validation, and symlink that were removed in
the previous commit.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add workflow_dispatch workflow to build the unified CUDA Docker image
with pinnable versions for all components.

- Add .github/workflows/unified-docker.yml with inputs for llama.cpp,
  whisper.cpp, stable-diffusion.cpp, and llama-swap versions
- Default versions: llama.cpp=b8468, whisper=v1.8.4, sd=545fac4, ls=v198
- GHCR login and push disabled (if: false) until ready to publish
- Include disk cleanup step for GitHub Actions runners
- Skip setup-buildx-action under act, reuse existing llama-swap-builder
- Fix install-llama-swap.sh to strip leading 'v' prefix so both
  '198' and 'v198' inputs work correctly

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
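
The 'v'-prefix fix boils down to one POSIX parameter expansion; this helper name is hypothetical:

```shell
#!/bin/sh
# ${1#v} strips a single leading 'v' if present, so '198' and 'v198'
# normalize to the same value before the release URL is built.
normalize_version() {
  printf '%s\n' "${1#v}"
}
```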
@coderabbitai

coderabbitai bot commented Mar 22, 2026

Caution

Review failed

Pull request was closed or merged during review

Walkthrough

Adds a unified Docker image build infrastructure for llama-swap with CUDA support. Introduces a GitHub Actions workflow, new unified Dockerfile, and build/installation scripts for orchestrating compilation and packaging of llama.cpp, whisper.cpp, stable-diffusion.cpp, and llama-swap. Comments out ARM64 build target in Makefile.

Changes

GitHub Actions Workflow — .github/workflows/unified-docker.yml
New workflow "Build Unified Docker Image", triggered manually with inputs for llama_cpp_ref, whisper_ref, sd_ref, and llama_swap_version. Executes docker/unified/build-image.sh after disk cleanup and conditional Docker Buildx setup; includes disabled publish/login steps.

Docker Unified Build System — docker/unified/Dockerfile, docker/unified/README.md
Multi-stage CUDA-accelerated Dockerfile building whisper.cpp, stable-diffusion.cpp, llama.cpp, and llama-swap, with dependency installation, binary validation, and /versions.txt generation. The README introduces the unified container concept.

Docker Build & Installation Scripts — docker/unified/build-image.sh, docker/unified/install-llama.sh, docker/unified/install-whisper.sh, docker/unified/install-sd.sh, docker/unified/install-llama-swap.sh
New build orchestration script that resolves git refs to commit hashes and runs docker buildx build --load with validation. Component installers clone repos, configure CMake with CUDA flags, build binaries, validate outputs, and copy artifacts to /install/.

Legacy Docker Build Script — docker/build-image.sh
Standalone Docker build script with --cuda/--vulkan backend selection, component version pinning via git operations or the GitHub releases API, BuildKit configuration, and post-build binary verification with container runtime checks.

Build Configuration — Makefile
Comments out the Linux ARM64 build step (GOOS=linux GOARCH=arm64) in the linux target, retaining only the AMD64 binary build.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~28 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

- Description check — ❓ Inconclusive. No pull request description was provided by the author, making it impossible to evaluate relevance to the changeset. Resolution: add a pull request description explaining the purpose, changes, and benefits of the unified CUDA container implementation.

✅ Passed checks (2 passed)

- Title check — ✅ Passed. The title accurately describes the main change: adding a CUDA-based unified Docker container under the docker/unified directory with accompanying build scripts and Dockerfile.
- Docstring Coverage — ✅ Passed. Docstring coverage is 100.00%, which is sufficient; the required threshold is 80.00%.


@mostlygeek
Owner Author

Merging this into main to test out the new workflow. Improvements will be made in future PRs.

@mostlygeek mostlygeek merged commit 916d13f into main Mar 22, 2026
2 of 3 checks passed
@mostlygeek mostlygeek deleted the claude/unified-docker-container-btUU9 branch March 22, 2026 12:11
rohitpaul pushed a commit to rohitpaul/llama-swap that referenced this pull request Mar 29, 2026
…ostlygeek#597)

Add Docker build scripts for a unified CUDA Docker container with llama-server, stable-diffusion.cpp, and whisper.cpp.