Conversation
Build llama-swap binary from local source code instead of downloading from GitHub releases. This ensures the container uses the exact code in the repository.

- Add golang:1.25-alpine builder stage to compile llama-swap
- Generate version from git hash with +dirty suffix for unstaged changes
- Update build-image.sh to use repository root as build context
- Remove LLAMA_SWAP_VERSION environment variable and related code
- Add test-binaries.sh to Dockerfile.vulkan for consistency

Both CUDA and Vulkan Dockerfiles now build llama-swap from source.
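The "+dirty" versioning step above can be sketched roughly like this (`git_version` is an illustrative name, not the actual build script's code):

```shell
#!/usr/bin/env bash
# Sketch: derive a version string from the git state.
git_version() {
  local ver
  ver="$(git rev-parse --short HEAD)"
  # Append +dirty when the working tree or index has uncommitted changes
  if ! git diff --quiet || ! git diff --cached --quiet; then
    ver="${ver}+dirty"
  fi
  printf '%s\n' "$ver"
}
```

The function only inspects git state, so it works the same whether invoked from the Makefile or from a Docker build context that includes `.git`.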
Replace Dockerfile.cuda and Dockerfile.vulkan with a single Dockerfile that uses BuildKit conditional FROM stages to select the GPU backend via --build-arg BACKEND=cuda|vulkan.

- Eliminate ~40% code duplication between backend Dockerfiles
- Scope BuildKit cache IDs by backend to prevent cross-contamination
- Add binary validation that fails the build if binaries are missing
- Add version tracking and convenience symlinks from Vulkan file
- Update test-binaries.sh to auto-detect both CUDA and Vulkan
- Update build-image.sh to pass BACKEND arg to unified Dockerfile
- Document how to add new server projects in Dockerfile header

https://claude.ai/code/session_01Fin2hBgPifbgF8H9kWyEvX
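The conditional-FROM mechanism this commit relies on can be sketched as follows (stage and tag names are illustrative, not the repository's actual Dockerfile):

```dockerfile
# Select the backend at build time:
#   docker build --build-arg BACKEND=cuda .
#   docker build --build-arg BACKEND=vulkan .
ARG BACKEND=cuda

FROM nvidia/cuda:12.4.0-devel-ubuntu22.04 AS base-cuda
FROM ubuntu:22.04 AS base-vulkan

# BuildKit expands ${BACKEND} in the stage name, so only the
# selected dependency chain is actually built.
FROM base-${BACKEND} AS builder
```

BuildKit only builds stages that the final target depends on, which is what makes this pattern cheap: the unselected base stage is skipped entirely.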
Move all cmake flag selection and build logic from inline Dockerfile RUN commands into a single install.sh script. Each build stage is now just: COPY source, COPY install.sh, RUN install.sh $BACKEND $PROJECT. Adding a new server project means adding a case block to install.sh with the project-specific cmake flags and targets. https://claude.ai/code/session_01Fin2hBgPifbgF8H9kWyEvX
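The per-project dispatch described above could look roughly like this (a sketch only; the function name and flag sets are illustrative, not the repo's actual install.sh):

```shell
#!/usr/bin/env bash
# Sketch: map (backend, project) to cmake flags via case blocks.
select_cmake_flags() {
  local backend="$1" project="$2"
  local flags=""
  case "$backend" in
    cuda)   flags="-DGGML_CUDA=ON" ;;
    vulkan) flags="-DGGML_VULKAN=ON" ;;
    *) echo "unknown backend: $backend" >&2; return 1 ;;
  esac
  case "$project" in
    llama.cpp)   flags="$flags -DLLAMA_BUILD_SERVER=ON" ;;
    whisper.cpp) flags="$flags -DWHISPER_BUILD_SERVER=ON" ;;
    *) echo "unknown project: $project" >&2; return 1 ;;
  esac
  printf '%s\n' "$flags"
}
```

Adding a new server project then means adding one case arm, matching the commit's description.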
Walkthrough
Adds a unified multi-stage CUDA/Vulkan Docker build, helper scripts and docs for building/testing GPU-enabled binaries, and related updates.
Changes
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Possibly related PRs
🚥 Pre-merge checks | ❌ Failed checks: 3 (1 warning, 2 inconclusive)
📝 Coding Plan
Strip section divider banners and inline commentary from Dockerfile and install.sh. Keep only the header usage docs and short stage labels. https://claude.ai/code/session_01Fin2hBgPifbgF8H9kWyEvX
Use nvidia/cuda:12.4.0-devel as the base and layer the Vulkan SDK on top. One builder image has everything needed for both backends, eliminating the conditional FROM and duplicate package installs. https://claude.ai/code/session_01Fin2hBgPifbgF8H9kWyEvX
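A minimal sketch of that layering (the exact base tag and package names are assumptions; verify against the actual Dockerfile and distro package lists):

```dockerfile
# Sketch: one builder image that covers both backends.
# CUDA toolchain comes from the base; Vulkan dev tools are layered on top.
FROM nvidia/cuda:12.4.0-devel-ubuntu22.04 AS builder
RUN apt-get update && apt-get install -y --no-install-recommends \
        libvulkan-dev glslang-tools vulkan-tools \
    && rm -rf /var/lib/apt/lists/*
```

With everything in one builder, the `FROM base-${BACKEND}` conditional and the duplicated package installs become unnecessary.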
Actionable comments posted: 3
🧹 Nitpick comments (3)
docker/Dockerfile (2)
215-216: Consider adding a non-root user for improved container security.
Static analysis flagged DS-0002: the container runs as root by default. While GPU access sometimes requires elevated privileges, consider adding a non-root user for the runtime stage where possible, or document the rationale for running as root.
Optional: Add non-root user
 WORKDIR /models
+
+# Create non-root user (uncomment if GPU device permissions allow)
+# RUN useradd -m -u 1000 llama-swap
+# USER llama-swap
+
 CMD ["bash"]
77-77: Architecture is hardcoded to amd64.
GOARCH=amd64 limits the Go binary to x86_64. This is consistent with the Makefile change removing ARM64, but worth noting if ARM64 GPU support is ever desired.
docker/test-binaries.sh (1)
57-70: Vulkan detection is only reached if CUDA stubs don't exist.
The current logic prioritizes CUDA stub fallback over Vulkan detection. If running a Vulkan image on a host without NVIDIA drivers but with /usr/local/cuda/lib64/stubs present (from a prior CUDA install), it will configure for CUDA stubs instead of Vulkan.
Consider reordering or adding explicit backend detection
One option is to check for a backend indicator (e.g., presence of specific binaries or an environment variable) before falling back to stubs:
 if detect_cuda_drivers; then
     print_info "Using real NVIDIA drivers"
     export LD_LIBRARY_PATH="/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
+elif detect_vulkan; then
+    print_info "Vulkan backend detected"
 elif [ -d "/usr/local/cuda/lib64/stubs" ]; then
     print_warn "No real NVIDIA drivers detected"
     ...
-elif detect_vulkan; then
-    print_info "Vulkan backend detected"
 else
     print_warn "No GPU drivers detected (CPU-only mode)"
 fi
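The proposed ordering can be exercised with stubbed detectors (a sketch; the detect_* implementations here are fakes for illustration, while the real script would probe nvidia-smi / vulkaninfo):

```shell
#!/usr/bin/env bash
# Fake detectors controlled by env vars, for illustration only.
detect_cuda_drivers() { [ "${FAKE_CUDA:-0}" = "1" ]; }
detect_vulkan()       { [ "${FAKE_VULKAN:-0}" = "1" ]; }

pick_backend() {
  if detect_cuda_drivers; then
    echo "cuda"
  elif detect_vulkan; then          # checked BEFORE the stubs fallback
    echo "vulkan"
  elif [ -d "${CUDA_STUBS_DIR:-/usr/local/cuda/lib64/stubs}" ]; then
    echo "cuda-stubs"
  else
    echo "cpu"
  fi
}
```

With this ordering, a Vulkan image on a driverless host picks "vulkan" even when leftover CUDA stub directories exist.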
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 2c41dfab-9a5b-4ca5-8898-c60b01920346
📒 Files selected for processing (7)
Makefile, docker/.gitignore, docker/AGENTS.md, docker/Dockerfile, docker/build-image.sh, docker/install.sh, docker/test-binaries.sh
get_default_branch() {
    local repo_url="$1"

    # Check for master first
    if git ls-remote --heads "${repo_url}" master &>/dev/null; then
        echo "master"
    elif git ls-remote --heads "${repo_url}" main &>/dev/null; then
        echo "main"
    else
        echo "master" # fallback
    fi
}
get_default_branch doesn't reliably detect if a branch exists.
git ls-remote --heads returns exit code 0 regardless of whether the branch exists; it just returns empty output. With the current logic the first check therefore always succeeds, so the function unconditionally echoes "master" even when neither branch exists.
Proposed fix to check output instead of exit code
get_default_branch() {
local repo_url="$1"
# Check for master first
- if git ls-remote --heads "${repo_url}" master &>/dev/null; then
+ if git ls-remote --heads "${repo_url}" master 2>/dev/null | grep -q .; then
echo "master"
- elif git ls-remote --heads "${repo_url}" main &>/dev/null; then
+ elif git ls-remote --heads "${repo_url}" main 2>/dev/null | grep -q .; then
echo "main"
else
echo "master" # fallback
fi
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| get_default_branch() { | |
| local repo_url="$1" | |
| # Check for master first | |
| if git ls-remote --heads "${repo_url}" master &>/dev/null; then | |
| echo "master" | |
| elif git ls-remote --heads "${repo_url}" main &>/dev/null; then | |
| echo "main" | |
| else | |
| echo "master" # fallback | |
| fi | |
| } | |
| get_default_branch() { | |
| local repo_url="$1" | |
| # Check for master first | |
| if git ls-remote --heads "${repo_url}" master 2>/dev/null | grep -q .; then | |
| echo "master" | |
| elif git ls-remote --heads "${repo_url}" main 2>/dev/null | grep -q .; then | |
| echo "main" | |
| else | |
| echo "master" # fallback | |
| fi | |
| } |
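The exit-code pitfall can be demonstrated without network access by faking ls-remote (a sketch; fake_ls_remote stands in for `git ls-remote --heads <url> <branch>` and is not real project code):

```shell
#!/usr/bin/env bash
# Stand-in for `git ls-remote --heads <url> <branch>`: prints a ref line when
# the branch exists, prints nothing otherwise, but exits 0 either way (like git).
fake_ls_remote() {
  case "$2" in
    main) printf 'deadbeef\trefs/heads/main\n' ;;
  esac
  return 0
}

# Output-based check, mirroring the proposed fix: test for non-empty output,
# not for a zero exit code.
get_default_branch_fixed() {
  local repo_url="$1"
  if [ -n "$(fake_ls_remote "$repo_url" master)" ]; then
    echo "master"
  elif [ -n "$(fake_ls_remote "$repo_url" main)" ]; then
    echo "main"
  else
    echo "master"   # fallback
  fi
}
```

An exit-code-based check against the same fake would have echoed "master" unconditionally, which is exactly the reported bug.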
# Create buildkitd.toml config file
cat > buildkitd.toml << 'BUILDKIT_EOF'
[worker.oci]
max-parallelism = 1
BUILDKIT_EOF

# Create the builder with the config
docker buildx create --name "$BUILDER_NAME" \
    --driver docker-container \
    --buildkitd-config buildkitd.toml \
    --use
buildkitd.toml is written to CWD, not docker/ directory.
The script writes buildkitd.toml to the current working directory, but docker/.gitignore expects it in the docker/ directory. If the script is invoked as ./docker/build-image.sh from the repo root, the file will be created in the wrong location.
Proposed fix to use SCRIPT_DIR for consistent path
echo "Creating custom buildx builder with max-parallelism=1..."
# Create buildkitd.toml config file
- cat > buildkitd.toml << 'BUILDKIT_EOF'
+ cat > "${SCRIPT_DIR}/buildkitd.toml" << 'BUILDKIT_EOF'
[worker.oci]
max-parallelism = 1
BUILDKIT_EOF
# Create the builder with the config
docker buildx create --name "$BUILDER_NAME" \
--driver docker-container \
- --buildkitd-config buildkitd.toml \
+ --buildkitd-config "${SCRIPT_DIR}/buildkitd.toml" \
   --use
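A common bash idiom for anchoring generated files to the script's own directory rather than the caller's CWD (the variable name matches the one used in the review comment):

```shell
#!/usr/bin/env bash
# Resolve the directory this script lives in, independent of where it is run from.
SCRIPT_DIR="$(cd -- "$(dirname -- "${BASH_SOURCE[0]:-$0}")" && pwd)"

# Generated files are then written and referenced via the same absolute path,
# e.g.: cat > "${SCRIPT_DIR}/buildkitd.toml" << 'EOF' ...
echo "script dir: ${SCRIPT_DIR}"
```

Because both the heredoc target and the --buildkitd-config flag would use the same absolute path, the file location no longer depends on the invocation directory.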
-DCMAKE_EXE_LINKER_FLAGS=-Wl,-rpath-link,/usr/local/cuda/lib64/stubs -lcuda
-DCMAKE_SHARED_LINKER_FLAGS=-Wl,-rpath-link,/usr/local/cuda/lib64/stubs -lcuda
Linker flags are split incorrectly due to missing quotes in the CMake variable.
The -lcuda is interpreted as a separate shell word rather than part of the CMake variable value. This will cause -lcuda to be passed as a separate argument to CMake, which is likely not the intent.
Proposed fix
- -DCMAKE_EXE_LINKER_FLAGS=-Wl,-rpath-link,/usr/local/cuda/lib64/stubs -lcuda
- -DCMAKE_SHARED_LINKER_FLAGS=-Wl,-rpath-link,/usr/local/cuda/lib64/stubs -lcuda
+ -DCMAKE_EXE_LINKER_FLAGS='-Wl,-rpath-link,/usr/local/cuda/lib64/stubs -lcuda'
+ -DCMAKE_SHARED_LINKER_FLAGS='-Wl,-rpath-link,/usr/local/cuda/lib64/stubs -lcuda'
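The quoting issue generalizes: an unquoted expansion containing a space becomes two shell words. A quick demonstration (count_args is a throwaway helper, not project code):

```shell
#!/usr/bin/env bash
count_args() { echo "$#"; }

flags='-Wl,-rpath-link,/usr/local/cuda/lib64/stubs -lcuda'

# Unquoted: the space splits the value, so cmake would see two arguments
unquoted_count="$(count_args -DCMAKE_EXE_LINKER_FLAGS=$flags)"

# Quoted: the whole value stays one -D argument
quoted_count="$(count_args "-DCMAKE_EXE_LINKER_FLAGS=$flags")"

echo "unquoted=${unquoted_count} quoted=${quoted_count}"
```

In the unquoted case, the stray `-lcuda` word reaches CMake as a separate argument instead of being part of the linker-flags value, which is the failure mode the review describes.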
🧹 Nitpick comments (2)
docker/Dockerfile (2)
156-156: Consider pinning Python dependency versions.
Installing numpy and sentencepiece without version pins can lead to non-reproducible builds if upstream releases introduce breaking changes.
♻️ Suggested fix
-RUN pip3 install --no-cache-dir numpy sentencepiece
+RUN pip3 install --no-cache-dir numpy==1.26.4 sentencepiece==0.2.0
Or use a requirements file for better maintainability.
140-192: Container runs as root user (security consideration).
The static analysis (DS-0002) correctly identifies that no USER directive is set, meaning the container runs as root. For production deployments, running as a non-root user is a security best practice. However, for GPU workloads, non-root users require proper group membership (e.g., video, render) and device permissions, which may complicate setup. If this image is primarily for development/local use, running as root may be acceptable.
🛡️ Optional: Add non-root user for hardened deployments
 RUN echo "build_timestamp: $(date -u +%Y-%m-%dT%H:%M:%SZ)" >> /versions.txt
+
+# Create non-root user for security (optional: uncomment for hardened deployments)
+# RUN useradd -m -s /bin/bash llama && \
+#     chown -R llama:llama /models /usr/local/bin
+# USER llama
 WORKDIR /models
 CMD ["bash"]
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: b98cb8e6-f819-4629-bc6b-192bd5a02e99
📒 Files selected for processing (1)
docker/Dockerfile
Download prebuilt vulkan binaries from GitHub releases instead of building from source. This significantly speeds up vulkan image builds.

- install.sh downloads release archives for vulkan llama/sd builds
- build-image.sh resolves latest release tags for vulkan builds
- whisper.cpp still builds from source (no prebuilt vulkan releases)

https://claude.ai/code/session_01Fin2hBgPifbgF8H9kWyEvX
Actionable comments posted: 3
🧹 Nitpick comments (1)
docker/build-image.sh (1)
261-279: Presence checks won't catch broken startup.
which only proves the binary is on PATH; it still passes when the image has bad shared-library wiring or an ABI mismatch. This PR already copies docker/test-binaries.sh into the image in docker/Dockerfile (lines 155-156), so calling that here would give you a real post-build smoke test instead of a path check.
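One hedged sketch of the suggested smoke test (the image tag variable and in-image script path are assumptions; adapt to the actual tag and COPY destination):

```shell
#!/usr/bin/env bash
# Run a command and fail the build on a non-zero exit. In real use the command
# would be something like:
#   run_smoke_test docker run --rm "$DOCKER_IMAGE_TAG" /usr/local/bin/test-binaries.sh
run_smoke_test() {
  if "$@"; then
    echo "smoke test passed"
  else
    echo "smoke test FAILED" >&2
    return 1
  fi
}
```

Executing the in-image test script actually loads the shared libraries and starts each binary, so broken library wiring fails the build instead of slipping past a PATH check.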
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docker/Dockerfile`:
- Around line 186-194: The final image leaves processes running as root (no USER
set), so change the final Dockerfile stage to create a non-root user (e.g.,
"llama" or "llama-swap") and switch to it before CMD/WORKDIR so the runtime
(including llama-swap) does not run as root; ensure files under /models and any
needed sockets/devices are chowned to that user and document that consumers
should pass the matching supplemental group (for /dev/dri) at container runtime
rather than running as root. Locate the final stage around the WORKDIR/CMD
commands and update it to adduser/addgroup, set ownership on /models and other
runtime paths, and add a USER line to drop root privileges while keeping
instructions to supply the supplemental group for /dev/dri access when needed.
- Around line 162-172: The Dockerfile currently copies all project shared libs
into a single /usr/local/lib/ (COPY --from=llama-build /install/lib/, COPY
--from=whisper-build /install/lib/, COPY --from=sd-build /install/lib/) causing
later builds to overwrite earlier .so files; update the Dockerfile and install
process to keep per-project library directories (e.g., /usr/local/lib/llama/,
/usr/local/lib/whisper/, /usr/local/lib/sd/) instead of merging into
/usr/local/lib/, and ensure each project's binaries have correct lookup paths by
setting RPATH on the built binaries or exporting LD_LIBRARY_PATH per service (or
alternatively build static libs) so each binary loads its matching libggml and
backend variants; also adjust docker/install.sh (the step that copies *.so* into
/install/lib) to place files into project-specific subdirs to match the new COPY
destinations.
In `@docker/install.sh`:
- Around line 49-52: The ASSET variable currently hardcodes the Ubuntu 24.04
artifact (ASSET) which mismatches the runtime-vulkan stage (ubuntu:22.04) and
will cause libc/ABI failures; update the install.sh logic to select the correct
release artifact for the runtime (either change ASSET to use Ubuntu-22.04
instead of Ubuntu-24.04, or make ASSET derived from an environment variable or
detected base image) and ensure URL uses that corrected ASSET; reference the
RELEASE_TAG, ASSET and URL variables and validate the docker/Dockerfile
runtime-vulkan base (ubuntu:22.04) so the downloaded sd binary matches the
runtime libc.
---
Nitpick comments:
In `@docker/build-image.sh`:
- Around line 261-279: Replace the superficial PATH-based checks (the
MISSING_BINARIES array and the loop that runs which for each binary) with a real
smoke-test invocation of the test script already copied into the image; run
docker run --rm "${DOCKER_IMAGE_TAG}" and execute the docker/test-binaries.sh
inside the container (use the same path used in the Dockerfile), capture its
exit status and fail the build if it returns non-zero so
shared-library/ABI/startup failures are detected; keep DOCKER_IMAGE_TAG as the
image identifier and remove the which-based logic (MISSING_BINARIES and the
loop) in build-image.sh.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: cc6d7a27-2bc5-4490-8e9c-5b3793dd5685
📒 Files selected for processing (3)
docker/Dockerfile, docker/build-image.sh, docker/install.sh
COPY --from=llama-build /install/bin/llama-server /usr/local/bin/
COPY --from=llama-build /install/bin/llama-cli /usr/local/bin/
COPY --from=llama-build /install/lib/ /usr/local/lib/

COPY --from=whisper-build /install/bin/whisper-server /usr/local/bin/
COPY --from=whisper-build /install/bin/whisper-cli /usr/local/bin/
COPY --from=whisper-build /install/lib/ /usr/local/lib/

COPY --from=sd-build /install/bin/sd-server /usr/local/bin/
COPY --from=sd-build /install/bin/sd-cli /usr/local/bin/
COPY --from=sd-build /install/lib/ /usr/local/lib/
🧩 Analysis chain
Verification scripts were run against the repository (mostlygeek/llama-swap): they located docker/Dockerfile and docker/install.sh, inspected both files, and confirmed that install.sh copies *.so* files into /install/lib.
Keep each project's shared libraries isolated.
docker/install.sh line 153 copies every *.so* file into /install/lib, and Dockerfile lines 164, 168, and 172 overlay all three project directories into the same /usr/local/lib/. Since llama.cpp, whisper.cpp, and stable-diffusion.cpp all use GGML, they share libggml.so and backend-specific libraries. Later COPYs overwrite earlier ones—sd-build's libraries override whisper-build's, which override llama-build's. A binary may load the wrong version or backend variant of a library at runtime. Keep them under per-project library paths and set RPATH or LD_LIBRARY_PATH per binary, or build static.
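A sketch of the per-project layout suggested above (the directory names and launch wrappers are illustrative, not the repository's actual Dockerfile):

```dockerfile
# Keep each project's GGML libraries isolated so later COPYs cannot
# overwrite earlier ones.
COPY --from=llama-build   /install/lib/ /usr/local/lib/llama/
COPY --from=whisper-build /install/lib/ /usr/local/lib/whisper/
COPY --from=sd-build      /install/lib/ /usr/local/lib/sd/

# Each server is then launched with its own library path, e.g.:
#   LD_LIBRARY_PATH=/usr/local/lib/llama   llama-server ...
#   LD_LIBRARY_PATH=/usr/local/lib/whisper whisper-server ...
# (or bake the path in at build time with -DCMAKE_INSTALL_RPATH)
```

With per-project directories, each binary resolves the libggml variant it was built against, regardless of the COPY order in the Dockerfile.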
RUN echo "llama.cpp: ${LLAMA_COMMIT_HASH}" > /versions.txt && \
    echo "whisper.cpp: ${WHISPER_COMMIT_HASH}" >> /versions.txt && \
    echo "stable-diffusion.cpp: ${SD_COMMIT_HASH}" >> /versions.txt && \
    echo "llama-swap: $(cat /tmp/llama-swap-version)" >> /versions.txt && \
    echo "backend: ${BACKEND}" >> /versions.txt && \
    echo "build_timestamp: $(date -u +%Y-%m-%dT%H:%M:%SZ)" >> /versions.txt

WORKDIR /models
CMD ["bash"]
Drop root before the final image is published.
No USER is set in the final runtime stage, so every upstream server and llama-swap process runs as root. If /dev/dri access is needed, add the matching supplemental group at runtime instead of keeping full root privileges.
One way to harden the final stage
RUN echo "llama.cpp: ${LLAMA_COMMIT_HASH}" > /versions.txt && \
echo "whisper.cpp: ${WHISPER_COMMIT_HASH}" >> /versions.txt && \
echo "stable-diffusion.cpp: ${SD_COMMIT_HASH}" >> /versions.txt && \
echo "llama-swap: $(cat /tmp/llama-swap-version)" >> /versions.txt && \
echo "backend: ${BACKEND}" >> /versions.txt && \
echo "build_timestamp: $(date -u +%Y-%m-%dT%H:%M:%SZ)" >> /versions.txt
+
+RUN useradd --create-home --shell /bin/bash --uid 10001 --user-group appuser && \
+ mkdir -p /models && \
+ chown -R appuser:appuser /models
+
+USER appuser
WORKDIR /models
 CMD ["bash"]

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
RUN echo "llama.cpp: ${LLAMA_COMMIT_HASH}" > /versions.txt && \
    echo "whisper.cpp: ${WHISPER_COMMIT_HASH}" >> /versions.txt && \
    echo "stable-diffusion.cpp: ${SD_COMMIT_HASH}" >> /versions.txt && \
    echo "llama-swap: $(cat /tmp/llama-swap-version)" >> /versions.txt && \
    echo "backend: ${BACKEND}" >> /versions.txt && \
    echo "build_timestamp: $(date -u +%Y-%m-%dT%H:%M:%SZ)" >> /versions.txt

RUN useradd --create-home --shell /bin/bash --uid 10001 --user-group appuser && \
    mkdir -p /models && \
    chown -R appuser:appuser /models

USER appuser
WORKDIR /models
CMD ["bash"]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docker/Dockerfile` around lines 186 - 194: the final image leaves processes
running as root because no USER is set. Change the final stage to create a
non-root user (e.g., "llama" or "llama-swap") and switch to it before
WORKDIR/CMD so the runtime, including llama-swap, does not run as root. Chown
/models and any needed sockets or devices to that user, and document that
consumers should pass the matching supplemental group for /dev/dri at container
runtime rather than running the container as root. Concretely: locate the final
stage around the WORKDIR/CMD commands, add adduser/addgroup, set ownership on
/models and other runtime paths, and add a USER line to drop root privileges
while keeping instructions to supply the supplemental group for /dev/dri access
when needed.
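The runtime counterpart to that hardening is to grant the non-root container user device access via a supplemental group at `docker run` time. A minimal sketch; the `dri_group_flags` helper name and the `renderD128` default are assumptions for illustration, not part of this PR.

```shell
# Emit docker run flags that expose a DRI render node and add its owning
# group as a supplemental group, so a non-root container user can open it.
dri_group_flags() {
  local dev="${1:-/dev/dri/renderD128}"
  [ -e "$dev" ] || return 0   # nothing to grant if the node is absent
  printf -- '--device=%s --group-add=%s\n' "$dev" "$(stat -c %g "$dev")"
}
```

Usage would look like `docker run $(dri_group_flags) ...`, which keeps the image itself free of root while still allowing GPU access on hosts that expose /dev/dri.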
SD_BRANCH=$(echo "$RELEASE_TAG" | cut -d'-' -f1)
SD_HASH=$(echo "$RELEASE_TAG" | rev | cut -d'-' -f1 | rev)
ASSET="sd-${SD_BRANCH}-${SD_HASH}-bin-Linux-Ubuntu-24.04-x86_64-vulkan.zip"
URL="https://github.com/leejet/stable-diffusion.cpp/releases/download/${RELEASE_TAG}/${ASSET}"
🧩 Analysis chain
🏁 Script executed:
# Check docker/install.sh lines 49-52 and surrounding context
cat -n docker/install.sh | sed -n '45,60p'

Repository: mostlygeek/llama-swap
Length of output: 1167
🏁 Script executed:
# Check docker/Dockerfile to verify the runtime-vulkan stage base image
cat -n docker/Dockerfile | sed -n '130,145p'

Repository: mostlygeek/llama-swap
Length of output: 706
🏁 Script executed:
# Look for other hardcoded Ubuntu versions in install.sh to understand the pattern
grep -n "Ubuntu" docker/install.sh

Repository: mostlygeek/llama-swap
Length of output: 155
🏁 Script executed:
# Check if there are other Ubuntu versions or variants mentioned in the script
grep -i "ubuntu\|artifact\|asset" docker/install.sh | head -20

Repository: mostlygeek/llama-swap
Length of output: 656
🏁 Script executed:
# Check if there's any conditional logic or other sd backend handling
grep -B5 -A10 "\"sd\")" docker/install.sh | head -40

Repository: mostlygeek/llama-swap
Length of output: 47
Don't ship the Ubuntu 24.04 sd artifact in a 22.04 image.
ASSET on line 51 is hardcoded to the ...Ubuntu-24.04... Vulkan release, but the final runtime-vulkan stage in docker/Dockerfile line 132 is ubuntu:22.04. This mismatch causes GLIBC/ABI incompatibility: binaries compiled against Ubuntu 24.04's libc will fail at startup on 22.04 with missing-symbol errors. Either align the runtime to Ubuntu 24.04 or build Vulkan sd from source on 22.04 instead.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docker/install.sh` around lines 49 - 52: ASSET hardcodes the Ubuntu 24.04
artifact, which mismatches the runtime-vulkan stage (ubuntu:22.04) and will
cause libc/ABI failures. Update the logic to select the release artifact that
matches the runtime: either change ASSET to the Ubuntu-22.04 variant, or derive
the Ubuntu version in ASSET from an environment variable or the detected base
image, and ensure URL uses the corrected ASSET. Validate against the
runtime-vulkan base in docker/Dockerfile (ubuntu:22.04) so the downloaded sd
binary matches the runtime libc.
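One way to keep the artifact and the runtime base in sync is to thread the Ubuntu version through as a variable rather than hardcoding it in the asset name. In this sketch, `UBUNTU_VERSION` is an assumed new build arg (not something install.sh currently defines), and the RELEASE_TAG value is an example:

```shell
# Derive the sd release asset name; UBUNTU_VERSION must match the
# runtime-vulkan base image (ubuntu:22.04 today).
RELEASE_TAG="master-abc1234"                # example tag: <branch>-<hash>
UBUNTU_VERSION="${UBUNTU_VERSION:-22.04}"
SD_BRANCH=$(echo "$RELEASE_TAG" | cut -d'-' -f1)
SD_HASH=$(echo "$RELEASE_TAG" | rev | cut -d'-' -f1 | rev)
ASSET="sd-${SD_BRANCH}-${SD_HASH}-bin-Linux-Ubuntu-${UBUNTU_VERSION}-x86_64-vulkan.zip"
echo "$ASSET"  # → sd-master-abc1234-bin-Linux-Ubuntu-22.04-x86_64-vulkan.zip
```

The Dockerfile could then pass the same value to both the runtime `FROM ubuntu:${UBUNTU_VERSION}` line and install.sh, so the two cannot drift apart silently.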