
docker/unified: add cuda based unified container#597

Merged
mostlygeek merged 18 commits into main from claude/unified-docker-container-btUU9 on Mar 22, 2026

Conversation

@mostlygeek
Owner

No description provided.

mostlygeek and others added 17 commits March 17, 2026 10:54
Build llama-swap binary from local source code instead of downloading
from GitHub releases. This ensures the container uses the exact code
in the repository.

- Add golang:1.25-alpine builder stage to compile llama-swap
- Generate version from git hash with +dirty suffix for unstaged changes
- Update build-image.sh to use repository root as build context
- Remove LLAMA_SWAP_VERSION environment variable and related code
- Add test-binaries.sh to Dockerfile.vulkan for consistency

Both CUDA and Vulkan Dockerfiles now build llama-swap from source.
Replace Dockerfile.cuda and Dockerfile.vulkan with a single
Dockerfile that uses BuildKit conditional FROM stages to select
the GPU backend via --build-arg BACKEND=cuda|vulkan.

- Eliminate ~40% code duplication between backend Dockerfiles
- Scope BuildKit cache IDs by backend to prevent cross-contamination
- Add binary validation that fails the build if binaries are missing
- Add version tracking and convenience symlinks from Vulkan file
- Update test-binaries.sh to auto-detect both CUDA and Vulkan
- Update build-image.sh to pass BACKEND arg to unified Dockerfile
- Document how to add new server projects in Dockerfile header

https://claude.ai/code/session_01Fin2hBgPifbgF8H9kWyEvX
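
The conditional-FROM selection described above can be sketched roughly like this; the stage names and base image tags here are illustrative assumptions, not the exact ones in the PR:

```dockerfile
# BACKEND chooses which base stage the builder inherits from.
ARG BACKEND=cuda

FROM nvidia/cuda:12.4.0-devel-ubuntu22.04 AS base-cuda
FROM ubuntu:22.04 AS base-vulkan

# BuildKit resolves base-${BACKEND} at build time, so only the selected
# base stage is actually pulled and built.
FROM base-${BACKEND} AS builder
```

Invoked with, e.g., `docker buildx build --build-arg BACKEND=vulkan .`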
Move all cmake flag selection and build logic from inline Dockerfile
RUN commands into a single install.sh script. Each build stage is
now just: COPY source, COPY install.sh, RUN install.sh $BACKEND $PROJECT.

Adding a new server project means adding a case block to install.sh
with the project-specific cmake flags and targets.

https://claude.ai/code/session_01Fin2hBgPifbgF8H9kWyEvX
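
The per-project dispatch can be sketched as below; the flag sets and project names are illustrative examples, not the exact contents of the PR's install.sh:

```shell
#!/bin/sh
# Sketch of install.sh's case-block dispatch: one case per server
# project, selecting cmake flags by backend. Flag values are examples.
select_cmake_flags() {
  backend="$1"   # cuda | vulkan
  project="$2"   # llama | whisper | sd
  case "$project" in
    llama)
      if [ "$backend" = cuda ]; then echo "-DGGML_CUDA=ON"; else echo "-DGGML_VULKAN=ON"; fi
      ;;
    sd)
      if [ "$backend" = cuda ]; then echo "-DSD_CUDA=ON -DSD_BUILD_EXAMPLES=ON"; else echo "-DSD_VULKAN=ON -DSD_BUILD_EXAMPLES=ON"; fi
      ;;
    *)
      echo "unknown project: $project" >&2
      return 1
      ;;
  esac
}
```

Adding a new server project then means adding one more case arm with that project's flags and build targets.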
Strip section divider banners and inline commentary from Dockerfile
and install.sh. Keep only the header usage docs and short stage labels.

https://claude.ai/code/session_01Fin2hBgPifbgF8H9kWyEvX
Use nvidia/cuda:12.4.0-devel as the base and layer the Vulkan SDK on
top. One builder image has everything needed for both backends,
eliminating the conditional FROM and duplicate package installs.

https://claude.ai/code/session_01Fin2hBgPifbgF8H9kWyEvX
Download prebuilt vulkan binaries from GitHub releases instead of
building from source. This significantly speeds up vulkan image builds.

- install.sh downloads release archives for vulkan llama/sd builds
- build-image.sh resolves latest release tags for vulkan builds
- whisper.cpp still builds from source (no prebuilt vulkan releases)

https://claude.ai/code/session_01Fin2hBgPifbgF8H9kWyEvX
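
Resolving the latest release tag could look like the sketch below; the helper names and the jq-free parsing are assumptions, not code from the PR:

```shell
#!/bin/sh
# Extract "tag_name" from a GitHub releases API JSON response without jq.
parse_tag_name() {
  grep -o '"tag_name": *"[^"]*"' | head -n 1 | sed 's/.*"\(.*\)"/\1/'
}

# Query the /releases/latest endpoint for a repo, e.g. ggml-org/llama.cpp.
latest_release_tag() {
  repo="$1"
  curl -fsSL "https://api.github.com/repos/${repo}/releases/latest" | parse_tag_name
}
```

The resolved tag is then used to build the release-archive download URL for the prebuilt vulkan binaries.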
Add a new docker-unified/ directory with per-project install scripts
for building a unified CUDA container with llama.cpp, whisper.cpp,
stable-diffusion.cpp, and llama-swap.

- install-whisper.sh, install-sd.sh, install-llama.sh: self-contained
  scripts that clone, build with CUDA, and install each project
- install-llama-swap.sh: downloads latest release binary from GitHub
- Dockerfile: multi-stage build with independent cached stages per
  project and a runtime-only final image
- build-image.sh: orchestrates builds with auto-detected commit hashes

https://claude.ai/code/session_01WUhLk2q6gSKxptSz8vnL8k
Let buildx run stages in parallel by default instead of
forcing sequential execution with max-parallelism=1.

https://claude.ai/code/session_01WUhLk2q6gSKxptSz8vnL8k
Rename LLAMA_COMMIT_HASH/WHISPER_COMMIT_HASH/SD_COMMIT_HASH env vars
to LLAMA_REF/WHISPER_REF/SD_REF. These now accept commit hashes, tags,
or branch names. A new resolve_ref() function queries the remote to
resolve tags and branches to commit hashes before passing them to the
Dockerfile.

https://claude.ai/code/session_01WUhLk2q6gSKxptSz8vnL8k
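
A minimal sketch of the resolve_ref() idea, assuming this shape (the PR's version may handle more cases): a full 40-character hash passes through unchanged, while tags and branches are resolved against the remote with git ls-remote.

```shell
#!/bin/sh
# Resolve a ref (tag, branch, or full hash) to a commit hash.
resolve_ref() {
  repo_url="$1"; ref="$2"
  # Full commit hashes need no remote lookup.
  if printf '%s' "$ref" | grep -Eq '^[0-9a-f]{40}$'; then
    echo "$ref"
    return 0
  fi
  # Note: annotated tags resolve to the tag object here; the real
  # script may peel them to the underlying commit.
  git ls-remote "$repo_url" "refs/tags/${ref}" "refs/heads/${ref}" \
    | head -n 1 | cut -f1
}
```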
Fix several build failures discovered during first end-to-end run:

- Fix git clone failure when Docker cache mount pre-creates the source
  directory; switch to init-based clone (git init + fetch + checkout)
- Fix cmake flag word-splitting by converting CMAKE_FLAGS strings to
  bash arrays, preserving quoting for flags with spaces
- Add short commit hash resolution to resolve_ref() via git ls-remote
  prefix matching and GitHub API fallback
- Remove sd-cli and sd-server from the build; stable-diffusion.cpp now
  builds the shared library only (SD_BUILD_EXAMPLES=OFF)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
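
The first two fixes can be sketched as follows; variable and function names are illustrative, not taken from the PR:

```shell
#!/bin/bash
# Init-based clone: `git clone` fails when the Docker cache mount has
# already created the target directory, but init + fetch + checkout
# works in a pre-existing (even non-empty) directory.
clone_at_ref() {
  src_dir="$1"; repo_url="$2"; ref="$3"
  git -C "$src_dir" init -q
  git -C "$src_dir" fetch -q --depth 1 "$repo_url" "$ref"
  git -C "$src_dir" checkout -q FETCH_HEAD
}

# Word-splitting fix: hold cmake flags in a bash array so a flag that
# itself contains spaces stays a single argument when expanded.
cmake_flags=(-DSD_BUILD_EXAMPLES=OFF "-DCMAKE_CUDA_FLAGS=-O2 -lineinfo")
# cmake -B build "${cmake_flags[@]}"
```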
…able refs

- Require 7+ hex chars before trying the GitHub API (GitHub returns 422
  for shorter hashes, causing silent fallthrough to the unresolved ref)
- Fix grep to not require surrounding quotes when extracting the SHA
- Fail with a clear error instead of silently passing the unresolved ref
  to Docker (which then fails with a cryptic git error inside the build)
- Add || exit 1 to all resolve_ref call sites so the script exits cleanly

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ver)

SD_BUILD_EXAMPLES=ON is required to build sd-cli and sd-server binaries.
Restore the binary copy, validation, and symlink that were removed in
the previous commit.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add workflow_dispatch workflow to build the unified CUDA Docker image
with pinnable versions for all components.

- Add .github/workflows/unified-docker.yml with inputs for llama.cpp,
  whisper.cpp, stable-diffusion.cpp, and llama-swap versions
- Default versions: llama.cpp=b8468, whisper=v1.8.4, sd=545fac4, ls=v198
- GHCR login and push disabled (if: false) until ready to publish
- Include disk cleanup step for GitHub Actions runners
- Skip setup-buildx-action under act, reuse existing llama-swap-builder
- Fix install-llama-swap.sh to strip leading 'v' prefix so both
  '198' and 'v198' inputs work correctly

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
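
The 'v'-prefix fix boils down to one POSIX parameter expansion; this helper name is hypothetical:

```shell
#!/bin/sh
# ${1#v} strips a single leading 'v' if present, so '198' and 'v198'
# normalize to the same value before the release URL is built.
normalize_version() {
  printf '%s\n' "${1#v}"
}
```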
@coderabbitai

coderabbitai bot commented Mar 22, 2026

Caution

Review failed

Pull request was closed or merged during review

Walkthrough

Adds a unified Docker image build infrastructure for llama-swap with CUDA support. Introduces a GitHub Actions workflow, new unified Dockerfile, and build/installation scripts for orchestrating compilation and packaging of llama.cpp, whisper.cpp, stable-diffusion.cpp, and llama-swap. Comments out ARM64 build target in Makefile.

Changes

GitHub Actions Workflow — .github/workflows/unified-docker.yml
New workflow "Build Unified Docker Image", triggered manually with inputs for llama_cpp_ref, whisper_ref, sd_ref, and llama_swap_version. Executes docker/unified/build-image.sh after disk cleanup and conditional Docker Buildx setup; includes disabled publish/login steps.

Docker Unified Build System — docker/unified/Dockerfile, docker/unified/README.md
Multi-stage CUDA-accelerated Dockerfile building whisper.cpp, stable-diffusion.cpp, llama.cpp, and llama-swap, with dependency installation, binary validation, and /versions.txt generation. The README introduces the unified container concept.

Docker Build & Installation Scripts — docker/unified/build-image.sh, docker/unified/install-llama.sh, docker/unified/install-whisper.sh, docker/unified/install-sd.sh, docker/unified/install-llama-swap.sh
New build orchestration script that resolves git refs to commit hashes and runs docker buildx build --load with validation. Component installers clone repos, configure CMake with CUDA flags, build binaries, validate outputs, and copy artifacts to /install/.

Legacy Docker Build Script — docker/build-image.sh
Standalone Docker build script with --cuda/--vulkan backend selection, component version pinning via git operations or the GitHub releases API, BuildKit configuration, and post-build binary verification with container runtime checks.

Build Configuration — Makefile
Comments out the Linux ARM64 build step (GOOS=linux GOARCH=arm64) in the linux target, retaining only the AMD64 binary build.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~28 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

- Description check — ❓ Inconclusive. No pull request description was provided by the author, making it impossible to evaluate relevance to the changeset. Resolution: add a pull request description explaining the purpose, changes, and benefits of the unified CUDA container implementation.

✅ Passed checks (2 passed)

- Title check — ✅ Passed. The title accurately describes the main change: adding a CUDA-based unified Docker container under the docker/unified directory with accompanying build scripts and Dockerfile.
- Docstring Coverage — ✅ Passed. Docstring coverage is 100.00%, which is sufficient; the required threshold is 80.00%.


@mostlygeek
Owner Author

Merging this into main to test out the new workflow. Improvements will be made in future PRs.

@mostlygeek mostlygeek merged commit 916d13f into main Mar 22, 2026
2 of 3 checks passed
@mostlygeek mostlygeek deleted the claude/unified-docker-container-btUU9 branch March 22, 2026 12:11
rohitpaul pushed a commit to rohitpaul/llama-swap that referenced this pull request Mar 29, 2026
…ostlygeek#597)

Add Docker build scripts for a unified CUDA Docker container with llama-server, stable-diffusion.cpp, and whisper.cpp.