docker/unified: add cuda based unified container #597
Conversation
Build the llama-swap binary from local source code instead of downloading it from GitHub releases. This ensures the container uses the exact code in the repository.

- Add a golang:1.25-alpine builder stage to compile llama-swap
- Generate the version from the git hash, with a +dirty suffix for unstaged changes
- Update build-image.sh to use the repository root as the build context
- Remove the LLAMA_SWAP_VERSION environment variable and related code
- Add test-binaries.sh to Dockerfile.vulkan for consistency

Both CUDA and Vulkan Dockerfiles now build llama-swap from source.
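The version derivation described above can be sketched roughly as follows (function and variable names are assumptions, not the repo's exact script):

```shell
#!/bin/bash
# Sketch: derive a version string from the git hash, with a "+dirty"
# suffix when tracked files have uncommitted changes.
version_from_git() {
  local v
  v="$(git rev-parse --short HEAD 2>/dev/null)" || { echo unknown; return; }
  # Append +dirty if either the working tree or the index differs from HEAD.
  git diff --quiet && git diff --cached --quiet || v="${v}+dirty"
  echo "$v"
}
version_from_git
```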
Replace Dockerfile.cuda and Dockerfile.vulkan with a single Dockerfile that uses BuildKit conditional FROM stages to select the GPU backend via --build-arg BACKEND=cuda|vulkan.

- Eliminate ~40% code duplication between backend Dockerfiles
- Scope BuildKit cache IDs by backend to prevent cross-contamination
- Add binary validation that fails the build if binaries are missing
- Add version tracking and convenience symlinks from the Vulkan file
- Update test-binaries.sh to auto-detect both CUDA and Vulkan
- Update build-image.sh to pass the BACKEND arg to the unified Dockerfile
- Document how to add new server projects in the Dockerfile header

https://claude.ai/code/session_01Fin2hBgPifbgF8H9kWyEvX
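The conditional-FROM mechanism works because BuildKit expands build args inside FROM lines. A minimal sketch (stage names and image tags are assumptions, not the repo's exact Dockerfile):

```dockerfile
# syntax=docker/dockerfile:1
ARG BACKEND=cuda

FROM nvidia/cuda:12.4.0-devel-ubuntu22.04 AS base-cuda
FROM ubuntu:22.04 AS base-vulkan

# BuildKit resolves "base-${BACKEND}" to one of the stages above and
# only builds the selected branch.
FROM base-${BACKEND} AS builder
# ARG declared before the first FROM must be redeclared to be visible here.
ARG BACKEND
COPY install.sh /tmp/install.sh
RUN /tmp/install.sh "${BACKEND}" llama
```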
Move all cmake flag selection and build logic from inline Dockerfile RUN commands into a single install.sh script. Each build stage is now just: COPY source, COPY install.sh, RUN install.sh $BACKEND $PROJECT. Adding a new server project means adding a case block to install.sh with the project-specific cmake flags and targets. https://claude.ai/code/session_01Fin2hBgPifbgF8H9kWyEvX
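The dispatch in install.sh can be sketched as a case block like the one below; the cmake flags and target names shown are illustrative assumptions, not the repo's exact flag sets:

```shell
#!/bin/bash
# Sketch of install.sh's per-project dispatch: install.sh <backend> <project>
set -euo pipefail
BACKEND="${1:-cuda}"   # cuda | vulkan
PROJECT="${2:-llama}"  # llama | whisper | sd

case "$PROJECT" in
  llama)
    FLAGS=(-DLLAMA_BUILD_SERVER=ON)
    TARGETS=(llama-server)
    ;;
  whisper)
    FLAGS=()
    TARGETS=(whisper-server)
    ;;
  sd)
    FLAGS=(-DSD_BUILD_EXAMPLES=ON)
    TARGETS=(sd-cli sd-server)
    ;;
  *) echo "unknown project: $PROJECT" >&2; exit 1 ;;
esac

# Backend-specific flags are appended on top of the project flags.
if [ "$BACKEND" = cuda ]; then
  FLAGS+=(-DGGML_CUDA=ON)
fi
echo "cmake flags: ${FLAGS[*]} -> targets: ${TARGETS[*]}"
```

Adding a new server project then means adding one more case arm with its flags and targets.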
Strip section divider banners and inline commentary from Dockerfile and install.sh. Keep only the header usage docs and short stage labels. https://claude.ai/code/session_01Fin2hBgPifbgF8H9kWyEvX
Use nvidia/cuda:12.4.0-devel as the base and layer the Vulkan SDK on top. One builder image has everything needed for both backends, eliminating the conditional FROM and duplicate package installs. https://claude.ai/code/session_01Fin2hBgPifbgF8H9kWyEvX
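The single-base approach might look like the fragment below; the exact Vulkan package names are an assumption (distributions split the SDK across several packages):

```dockerfile
# Sketch: one builder base serving both backends.
FROM nvidia/cuda:12.4.0-devel-ubuntu22.04 AS builder
RUN apt-get update && apt-get install -y --no-install-recommends \
        libvulkan-dev glslc cmake build-essential git ca-certificates \
    && rm -rf /var/lib/apt/lists/*
```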
Download prebuilt Vulkan binaries from GitHub releases instead of building from source. This significantly speeds up Vulkan image builds.

- install.sh downloads release archives for Vulkan llama/sd builds
- build-image.sh resolves the latest release tags for Vulkan builds
- whisper.cpp still builds from source (no prebuilt Vulkan releases)

https://claude.ai/code/session_01Fin2hBgPifbgF8H9kWyEvX
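Resolving the latest release tag can be done with the GitHub releases API; a sketch (the asset URL pattern in the comment is an assumption about llama.cpp's release naming, not verified here):

```shell
#!/bin/sh
# Sketch: resolve the latest release tag of a GitHub repo ("owner/name").
latest_tag() {
  curl -fsSL "https://api.github.com/repos/$1/releases/latest" \
    | grep -m1 '"tag_name"' \
    | sed -E 's/.*"tag_name": *"([^"]+)".*/\1/'
}

# Example usage (network required; asset name is illustrative):
#   TAG="$(latest_tag ggml-org/llama.cpp)"
#   curl -fsSLO "https://github.com/ggml-org/llama.cpp/releases/download/${TAG}/llama-${TAG}-bin-ubuntu-vulkan-x64.zip"
```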
Add a new docker-unified/ directory with per-project install scripts for building a unified CUDA container with llama.cpp, whisper.cpp, stable-diffusion.cpp, and llama-swap.

- install-whisper.sh, install-sd.sh, install-llama.sh: self-contained scripts that clone, build with CUDA, and install each project
- install-llama-swap.sh: downloads the latest release binary from GitHub
- Dockerfile: multi-stage build with independent cached stages per project and a runtime-only final image
- build-image.sh: orchestrates builds with auto-detected commit hashes

https://claude.ai/code/session_01WUhLk2q6gSKxptSz8vnL8k
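The stage layout described above could be sketched like this (stage names and install paths are assumptions): each project builds in its own independently cacheable stage, and the final image copies only the installed artifacts onto a runtime base.

```dockerfile
# Sketch of the per-project multi-stage layout.
FROM nvidia/cuda:12.4.0-devel-ubuntu22.04 AS build-llama
COPY install-llama.sh /tmp/
RUN /tmp/install-llama.sh

FROM nvidia/cuda:12.4.0-devel-ubuntu22.04 AS build-whisper
COPY install-whisper.sh /tmp/
RUN /tmp/install-whisper.sh

# Runtime-only final image: no compilers or headers.
FROM nvidia/cuda:12.4.0-runtime-ubuntu22.04 AS runtime
COPY --from=build-llama   /opt/llama/   /opt/llama/
COPY --from=build-whisper /opt/whisper/ /opt/whisper/
```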
Let buildx run stages in parallel by default instead of forcing sequential execution with max-parallelism=1. https://claude.ai/code/session_01WUhLk2q6gSKxptSz8vnL8k
Rename LLAMA_COMMIT_HASH/WHISPER_COMMIT_HASH/SD_COMMIT_HASH env vars to LLAMA_REF/WHISPER_REF/SD_REF. These now accept commit hashes, tags, or branch names. A new resolve_ref() function queries the remote to resolve tags and branches to commit hashes before passing them to the Dockerfile. https://claude.ai/code/session_01WUhLk2q6gSKxptSz8vnL8k
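A minimal sketch of resolve_ref(), assuming the simplified behavior described above (full hashes pass through; tags and branches are resolved with git ls-remote; the short-hash and API fallbacks are omitted here):

```shell
#!/bin/sh
# Sketch: resolve a tag/branch/hash to a full commit hash.
# Usage: resolve_ref <repo-url> <ref>
resolve_ref() {
  repo_url="$1"; ref="$2"
  # A full 40-char hex hash passes through untouched.
  if printf '%s' "$ref" | grep -Eq '^[0-9a-f]{40}$'; then
    echo "$ref"
    return 0
  fi
  # Try annotated tag (^{}), lightweight tag, then branch, in that order.
  sha="$(git ls-remote "$repo_url" \
           "refs/tags/${ref}^{}" "refs/tags/${ref}" "refs/heads/${ref}" \
         | head -n1 | cut -f1)"
  [ -n "$sha" ] && echo "$sha"
}
```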
Fix several build failures discovered during the first end-to-end run:

- Fix a git clone failure when the Docker cache mount pre-creates the source directory; switch to an init-based clone (git init + fetch + checkout)
- Fix cmake flag word-splitting by converting the CMAKE_FLAGS strings to bash arrays, preserving quoting for flags with spaces
- Add short commit hash resolution to resolve_ref() via git ls-remote prefix matching and a GitHub API fallback
- Remove sd-cli and sd-server from the build; stable-diffusion.cpp now builds the shared library only (SD_BUILD_EXAMPLES=OFF)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
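The first two fixes can be sketched as follows (paths, flags, and argument names are assumptions):

```shell
#!/bin/bash
set -euo pipefail

# Init-based clone: unlike "git clone", this tolerates a target directory
# that a BuildKit cache mount has already created.
clone_into() {
  local dir="$1" repo="$2" ref="$3"
  mkdir -p "$dir" && cd "$dir"
  git init -q .
  git remote add origin "$repo" 2>/dev/null || git remote set-url origin "$repo"
  git fetch -q --depth 1 origin "$ref"
  git checkout -q FETCH_HEAD
}

# Flags kept in a bash array so values containing spaces or semicolons
# survive quoting when expanded as "${CMAKE_FLAGS[@]}" (a plain string
# would be word-split by the shell).
CMAKE_FLAGS=(-DSD_BUILD_EXAMPLES=OFF "-DCMAKE_CUDA_ARCHITECTURES=86;89")
printf '%s\n' "${CMAKE_FLAGS[@]}"
```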
…able refs

- Require 7+ hex chars before trying the GitHub API (GitHub returns 422 for shorter hashes, causing a silent fallthrough to the unresolved ref)
- Fix grep to not require surrounding quotes when extracting the SHA
- Fail with a clear error instead of silently passing the unresolved ref to Docker (which then fails with a cryptic git error inside the build)
- Add || exit 1 to all resolve_ref call sites so the script exits cleanly

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
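The fail-fast call-site pattern looks roughly like this (resolve_ref is stubbed here for illustration; the real one queries the remote as described above):

```shell
#!/bin/sh
# Stub standing in for the real resolve_ref; returns non-zero on failure.
resolve_ref() {
  [ -n "$2" ] || { echo "error: cannot resolve ref '$2'" >&2; return 1; }
  echo "resolved:$2"
}

# "|| exit 1" makes a resolution failure abort the script immediately,
# instead of passing an unresolved ref into the Docker build.
LLAMA_SHA="$(resolve_ref https://example.invalid/llama.git b8468)" || exit 1
echo "$LLAMA_SHA"
```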
…ver)

SD_BUILD_EXAMPLES=ON is required to build the sd-cli and sd-server binaries. Restore the binary copy, validation, and symlink that were removed in the previous commit.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add a workflow_dispatch workflow to build the unified CUDA Docker image with pinnable versions for all components.

- Add .github/workflows/unified-docker.yml with inputs for llama.cpp, whisper.cpp, stable-diffusion.cpp, and llama-swap versions
- Default versions: llama.cpp=b8468, whisper=v1.8.4, sd=545fac4, ls=v198
- GHCR login and push disabled (if: false) until ready to publish
- Include a disk cleanup step for GitHub Actions runners
- Skip setup-buildx-action under act; reuse the existing llama-swap-builder
- Fix install-llama-swap.sh to strip the leading 'v' prefix so both '198' and 'v198' inputs work correctly

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
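The 'v'-prefix normalization amounts to a one-line parameter expansion; a sketch (the canonical "v"-prefixed tag format is an assumption based on the v198 default above):

```shell
#!/bin/sh
# Accept both "198" and "v198" and emit the canonical tag form.
normalize_version() {
  v="${1#v}"     # strip a leading "v" if present
  echo "v${v}"   # re-add it in canonical form
}

normalize_version 198    # -> v198
normalize_version v198   # -> v198
```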
Caution: review failed. The pull request was closed or merged during review.
Merging this into main to test out the new workflow. Improvements will be made in future PRs.
…ostlygeek#597) Add Docker build scripts for a unified CUDA Docker container with llama-server, stable-diffusion.cpp, and whisper.cpp.