
checkpoint#594

Closed
mostlygeek wants to merge 9 commits into main from claude/unified-ai-docker-container-G9FdP

Conversation

@mostlygeek
Owner

No description provided.

mostlygeek and others added 6 commits March 17, 2026 10:54
Build llama-swap binary from local source code instead of downloading
from GitHub releases. This ensures the container uses the exact code
in the repository.

- Add golang:1.25-alpine builder stage to compile llama-swap
- Generate version from git hash with +dirty suffix for unstaged changes
- Update build-image.sh to use repository root as build context
- Remove LLAMA_SWAP_VERSION environment variable and related code
- Add test-binaries.sh to Dockerfile.vulkan for consistency

Both CUDA and Vulkan Dockerfiles now build llama-swap from source.
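
The version-from-git-hash step described above can be sketched roughly like this (variable names are assumed for illustration, not necessarily the repository's actual build script):

```shell
# Hypothetical sketch: derive a version string from the current commit,
# appending +dirty when the working tree or index has uncommitted changes.
if VERSION="$(git rev-parse --short HEAD 2>/dev/null)"; then
    # Mark unstaged or staged-but-uncommitted changes.
    git diff --quiet 2>/dev/null && git diff --cached --quiet 2>/dev/null \
        || VERSION="${VERSION}+dirty"
else
    # Not inside a git checkout; fall back to a placeholder.
    VERSION="unknown"
fi

echo "llama-swap version: ${VERSION}"
```

The Go build would then embed this via `-ldflags "-X main.version=${VERSION}"` or similar; the exact flag is an assumption here.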
Replace Dockerfile.cuda and Dockerfile.vulkan with a single
Dockerfile that uses BuildKit conditional FROM stages to select
the GPU backend via --build-arg BACKEND=cuda|vulkan.

- Eliminate ~40% code duplication between backend Dockerfiles
- Scope BuildKit cache IDs by backend to prevent cross-contamination
- Add binary validation that fails the build if binaries are missing
- Add version tracking and convenience symlinks from Vulkan file
- Update test-binaries.sh to auto-detect both CUDA and Vulkan
- Update build-image.sh to pass BACKEND arg to unified Dockerfile
- Document how to add new server projects in Dockerfile header

https://claude.ai/code/session_01Fin2hBgPifbgF8H9kWyEvX
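
The conditional-FROM pattern this commit describes looks roughly like the following (stage and image names are illustrative, not necessarily the repository's exact Dockerfile):

```dockerfile
# BACKEND selects the base stage at build time:
#   docker build --build-arg BACKEND=vulkan ...
ARG BACKEND=cuda

FROM nvidia/cuda:12.4.0-devel-ubuntu22.04 AS base-cuda
FROM ubuntu:22.04 AS base-vulkan

# BuildKit substitutes the ARG into the FROM reference, so exactly one
# backend-specific stage is selected for the rest of the build.
FROM base-${BACKEND} AS builder
```

Because BuildKit only builds stages reachable from the final target, the unselected backend's stage adds no cost to the build.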
Move all cmake flag selection and build logic from inline Dockerfile
RUN commands into a single install.sh script. Each build stage is
now just: COPY source, COPY install.sh, RUN install.sh $BACKEND $PROJECT.

Adding a new server project means adding a case block to install.sh
with the project-specific cmake flags and targets.

https://claude.ai/code/session_01Fin2hBgPifbgF8H9kWyEvX
@coderabbitai

coderabbitai bot commented Mar 20, 2026

Walkthrough

Adds a unified multi-stage CUDA/Vulkan Docker build, helper scripts and docs for building/testing GPU-enabled binaries, updates docker/.gitignore, and comments out the Linux ARM64 build step in the Makefile so the linux-arm64 artifact is no longer produced.

Changes

- Makefile (Makefile): comments out the GOOS=linux GOARCH=arm64 go build ... -o $(BUILD_DIR)/$(APP_NAME)-linux-arm64 line, so make linux / make all no longer produce the linux-arm64 artifact.
- Docker build & runtime (docker/Dockerfile, docker/build-image.sh, docker/install.sh): adds a unified multi-stage Dockerfile driven by a BACKEND (`cuda`|`vulkan`) build arg, plus helper scripts to build the image and compile the server binaries.
- Tests & CI helpers (docker/test-binaries.sh): adds a GPU-aware smoke-test script that detects NVIDIA/Vulkan drivers, adjusts LD_LIBRARY_PATH, verifies expected server binaries can start (--help/-h), and emits diagnostics.
- Docs & ignore (docker/AGENTS.md, docker/.gitignore): adds operational documentation for the VM sandbox and Docker build workflow; adds buildkitd.toml to docker/.gitignore.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ❌ 3

❌ Failed checks (1 warning, 2 inconclusive)

- Docstring Coverage — ⚠️ Warning: docstring coverage is 75.00%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them.
- Title check — ❓ Inconclusive: the title 'checkpoint' is vague and generic, providing no meaningful information about the substantial changeset involving Docker build infrastructure and multiple new scripts. Resolution: use a more descriptive title, such as 'Add unified Docker build infrastructure for GPU-enabled AI servers' or 'Introduce Docker multi-stage build with CUDA/Vulkan support'.
- Description check — ❓ Inconclusive: no pull request description was provided, making it impossible to assess whether the author's intent relates to the changeset. Resolution: add a description explaining the purpose of the Docker infrastructure changes, the supported backends, and how to use the new build scripts.


claude added 2 commits March 20, 2026 08:51
Strip section divider banners and inline commentary from Dockerfile
and install.sh. Keep only the header usage docs and short stage labels.

https://claude.ai/code/session_01Fin2hBgPifbgF8H9kWyEvX
Use nvidia/cuda:12.4.0-devel as the base and layer the Vulkan SDK on
top. One builder image has everything needed for both backends,
eliminating the conditional FROM and duplicate package installs.

https://claude.ai/code/session_01Fin2hBgPifbgF8H9kWyEvX

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (3)
docker/Dockerfile (2)

215-216: Consider adding a non-root user for improved container security.

Static analysis flagged DS-0002: the container runs as root by default. While GPU access sometimes requires elevated privileges, consider adding a non-root user for the runtime stage where possible, or document the rationale for running as root.

Optional: Add non-root user
 WORKDIR /models
+
+# Create non-root user (uncomment if GPU device permissions allow)
+# RUN useradd -m -u 1000 llama-swap
+# USER llama-swap
+
 CMD ["bash"]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docker/Dockerfile` around lines 215 - 216, The Dockerfile currently sets
WORKDIR /models and CMD ["bash"] and runs as root by default; add a non-root
user (e.g., create user 'app' or 'nonroot') in the runtime stage, chown /models
and any other runtime directories to that user, and switch to it via USER before
CMD to avoid running container processes as root; if GPU/device access prevents
dropping privileges, add a clear comment near WORKDIR /models and CMD explaining
why root is required and what security mitigations are in place.

77-77: Architecture is hardcoded to amd64.

GOARCH=amd64 limits the Go binary to x86_64. This is consistent with the Makefile change removing ARM64, but worth noting if ARM64 GPU support is ever desired.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docker/Dockerfile` at line 77, The Dockerfile currently hardcodes GOARCH in
the ENV line (ENV CGO_ENABLED=0 GOOS=linux GOARCH=amd64), which prevents
building for non-x86_64 targets; change this to use a build-time argument or
Docker meta-arg so the architecture is configurable (e.g. introduce ARG
TARGETARCH and set GOARCH from that, or accept a build ARG like BUILD_GOARCH and
set the ENV GOARCH accordingly), ensuring CGO_ENABLED and GOOS remain set while
GOARCH is driven by the build arg so ARM64 or other architectures can be
targeted when needed.
docker/test-binaries.sh (1)

57-70: Vulkan detection is only reached if CUDA stubs don't exist.

The current logic prioritizes CUDA stub fallback over Vulkan detection. If running a Vulkan image on a host without NVIDIA drivers but with /usr/local/cuda/lib64/stubs present (from a prior CUDA install), it will configure for CUDA stubs instead of Vulkan.

Consider reordering or adding explicit backend detection

One option is to check for a backend indicator (e.g., presence of specific binaries or an environment variable) before falling back to stubs:

 if detect_cuda_drivers; then
     print_info "Using real NVIDIA drivers"
     export LD_LIBRARY_PATH="/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
+elif detect_vulkan; then
+    print_info "Vulkan backend detected"
 elif [ -d "/usr/local/cuda/lib64/stubs" ]; then
     print_warn "No real NVIDIA drivers detected"
     ...
-elif detect_vulkan; then
-    print_info "Vulkan backend detected"
 else
     print_warn "No GPU drivers detected (CPU-only mode)"
 fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docker/test-binaries.sh` around lines 57 - 70, The current conditional
prioritizes CUDA stubs over Vulkan so detect_vulkan() is only evaluated after
the stubs branch; update the conditional logic in docker/test-binaries.sh so
Vulkan backend is detected before falling back to CUDA stubs (or add an explicit
backend override like an env var BACKEND=vulkan/force_vulkan checked first).
Concretely, call detect_vulkan (or check the override env var) prior to the
"/usr/local/cuda/lib64/stubs" branch, or restructure the branches to: if
detect_cuda_drivers; elif detect_vulkan || BACKEND=vulkan; elif [ -d
"/usr/local/cuda/lib64/stubs" ]; else ... ensuring LD_LIBRARY_PATH is only
modified in the stubs/real-cuda branches.
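
A hedged sketch of the override-first ordering suggested above (`BACKEND`, the detection commands, and the stub path are assumed names, not necessarily those in test-binaries.sh):

```shell
# Illustrative only: check an explicit override first, then real
# drivers, then Vulkan, and only then fall back to CUDA stubs.
pick_backend() {
    if [ "${BACKEND:-}" = "vulkan" ]; then
        echo "vulkan"        # explicit override always wins
    elif command -v nvidia-smi >/dev/null 2>&1; then
        echo "cuda"          # real NVIDIA drivers present
    elif command -v vulkaninfo >/dev/null 2>&1; then
        echo "vulkan"        # Vulkan loader present
    elif [ -d /usr/local/cuda/lib64/stubs ]; then
        echo "cuda-stubs"    # last resort: link-time stubs
    else
        echo "cpu"
    fi
}

BACKEND=vulkan pick_backend
```

With the override checked first, a Vulkan image on a host with leftover CUDA stubs still configures for Vulkan.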

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2c41dfab-9a5b-4ca5-8898-c60b01920346

📥 Commits

Reviewing files that changed from the base of the PR and between a3725e7 and f6fe7c5.

📒 Files selected for processing (7)
  • Makefile
  • docker/.gitignore
  • docker/AGENTS.md
  • docker/Dockerfile
  • docker/build-image.sh
  • docker/install.sh
  • docker/test-binaries.sh

Comment on lines +116 to +127
get_default_branch() {
local repo_url="$1"

# Check for master first
if git ls-remote --heads "${repo_url}" master &>/dev/null; then
echo "master"
elif git ls-remote --heads "${repo_url}" main &>/dev/null; then
echo "main"
else
echo "master" # fallback
fi
}

⚠️ Potential issue | 🟡 Minor

get_default_branch doesn't reliably detect if a branch exists.

git ls-remote --heads returns exit code 0 regardless of whether the branch exists; it just returns empty output. The current logic will always fall through to "master" fallback even if neither branch exists.

Proposed fix to check output instead of exit code
 get_default_branch() {
     local repo_url="$1"
 
     # Check for master first
-    if git ls-remote --heads "${repo_url}" master &>/dev/null; then
+    if git ls-remote --heads "${repo_url}" master 2>/dev/null | grep -q .; then
         echo "master"
-    elif git ls-remote --heads "${repo_url}" main &>/dev/null; then
+    elif git ls-remote --heads "${repo_url}" main 2>/dev/null | grep -q .; then
         echo "main"
     else
         echo "master"  # fallback
     fi
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docker/build-image.sh` around lines 116 - 127, The get_default_branch
function currently checks git ls-remote --heads "${repo_url}" master by exit
code which is always 0; update get_default_branch to capture the output of git
ls-remote --heads "${repo_url}" <branch> (or grep for refs/heads/<branch>) into
a variable and test that the output is non-empty before echoing the branch; if
neither master nor main produce non-empty output, keep the fallback to "master".
Use the existing function name get_default_branch and its repo_url parameter to
locate the change.
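
The fix above hinges on one property: `grep -q .` exits non-zero on empty input and zero when at least one character arrives, whereas `git ls-remote` itself exits 0 either way. A quick self-contained demonstration, with no network or git needed:

```shell
# Stand-in for `git ls-remote --heads <url> <branch>` output:
# empty when the branch is absent, a ref line when it exists.
branch_exists() {
    printf '%s' "$1" | grep -q .
}

if branch_exists ""; then
    echo "absent branch wrongly detected"
else
    echo "empty output -> branch treated as absent"
fi

if branch_exists "abc123 refs/heads/master"; then
    echo "ref line -> branch detected"
fi
```

Piping to `grep -q .` (or capturing into a variable and testing `[ -n "$out" ]`) converts the output-emptiness question into the exit code the `if` actually needs.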

Comment on lines +206 to +217

# Create buildkitd.toml config file
cat > buildkitd.toml << 'BUILDKIT_EOF'
[worker.oci]
max-parallelism = 1
BUILDKIT_EOF

# Create the builder with the config
docker buildx create --name "$BUILDER_NAME" \
--driver docker-container \
--buildkitd-config buildkitd.toml \
--use

⚠️ Potential issue | 🟡 Minor

buildkitd.toml is written to CWD, not docker/ directory.

The script writes buildkitd.toml to the current working directory, but docker/.gitignore expects it in the docker/ directory. If the script is invoked as ./docker/build-image.sh from the repo root, the file will be created in the wrong location.

Proposed fix to use SCRIPT_DIR for consistent path
     echo "Creating custom buildx builder with max-parallelism=1..."
     
     # Create buildkitd.toml config file
-    cat > buildkitd.toml << 'BUILDKIT_EOF'
+    cat > "${SCRIPT_DIR}/buildkitd.toml" << 'BUILDKIT_EOF'
 [worker.oci]
   max-parallelism = 1
 BUILDKIT_EOF
     
     # Create the builder with the config
     docker buildx create --name "$BUILDER_NAME" \
         --driver docker-container \
-        --buildkitd-config buildkitd.toml \
+        --buildkitd-config "${SCRIPT_DIR}/buildkitd.toml" \
         --use
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docker/build-image.sh` around lines 206 - 217, The script writes
buildkitd.toml to the current working directory instead of the docker/
directory; update the heredoc target and the buildx call to use the script
directory (e.g., SCRIPT_DIR) so the file is created and referenced consistently.
Specifically, change the creation of buildkitd.toml to write to
"$SCRIPT_DIR/buildkitd.toml" (or ensure you cd to SCRIPT_DIR before writing) and
pass that same path to docker buildx create's --buildkitd-config flag; ensure
references to BUILDER_NAME remain unchanged.

Comment on lines +22 to +23
-DCMAKE_EXE_LINKER_FLAGS=-Wl,-rpath-link,/usr/local/cuda/lib64/stubs -lcuda
-DCMAKE_SHARED_LINKER_FLAGS=-Wl,-rpath-link,/usr/local/cuda/lib64/stubs -lcuda

⚠️ Potential issue | 🟠 Major

Linker flags are split incorrectly due to missing quotes in the CMake variable.

The -lcuda is interpreted as a separate shell word rather than part of the CMake variable value. This will cause -lcuda to be passed as a separate argument to CMake, which is likely not the intent.

Proposed fix
-            -DCMAKE_EXE_LINKER_FLAGS=-Wl,-rpath-link,/usr/local/cuda/lib64/stubs -lcuda
-            -DCMAKE_SHARED_LINKER_FLAGS=-Wl,-rpath-link,/usr/local/cuda/lib64/stubs -lcuda
+            -DCMAKE_EXE_LINKER_FLAGS='-Wl,-rpath-link,/usr/local/cuda/lib64/stubs -lcuda'
+            -DCMAKE_SHARED_LINKER_FLAGS='-Wl,-rpath-link,/usr/local/cuda/lib64/stubs -lcuda'
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docker/install.sh` around lines 22 - 23, The linker flag values for
CMAKE_EXE_LINKER_FLAGS and CMAKE_SHARED_LINKER_FLAGS are being split by the
shell because the -D arguments lack quotes; wrap the entire value for each CMake
variable in quotes so the space and the trailing -lcuda remain part of the same
-D argument (i.e. quote the values passed to CMAKE_EXE_LINKER_FLAGS and
CMAKE_SHARED_LINKER_FLAGS in docker/install.sh).
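
The splitting can be seen directly in the shell, independent of CMake (the flag value here is illustrative):

```shell
# Without quotes, the space before -lcuda starts a new word, so cmake
# would receive two separate arguments instead of one -D definition.
set -- -DCMAKE_EXE_LINKER_FLAGS=-Wl,-rpath-link,/opt/stubs -lcuda
echo "unquoted: $# arguments"

# Quoted, the whole value travels as a single -D argument.
set -- "-DCMAKE_EXE_LINKER_FLAGS=-Wl,-rpath-link,/opt/stubs -lcuda"
echo "quoted: $# arguments"
```

The first echo reports 2 arguments, the second reports 1, which is exactly the difference the proposed quoting fixes.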


@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (2)
docker/Dockerfile (2)

156-156: Consider pinning Python dependency versions.

Installing numpy and sentencepiece without version pins can lead to non-reproducible builds if upstream releases introduce breaking changes.

♻️ Suggested fix
-RUN pip3 install --no-cache-dir numpy sentencepiece
+RUN pip3 install --no-cache-dir numpy==1.26.4 sentencepiece==0.2.0

Or use a requirements file for better maintainability.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docker/Dockerfile` at line 156, The Dockerfile currently runs "pip3 install
--no-cache-dir numpy sentencepiece" without pins; update the RUN step to install
fixed versions (e.g., pin numpy and sentencepiece to specific versions) or
switch to installing from a committed requirements.txt and use "pip3 install
--no-cache-dir -r requirements.txt" so builds are reproducible and maintainable;
ensure the chosen versions are tested and documented.

140-192: Container runs as root user (security consideration).

The static analysis (DS-0002) correctly identifies that no USER directive is set, meaning the container runs as root. For production deployments, running as a non-root user is a security best practice.

However, for GPU workloads, non-root users require proper group membership (e.g., video, render) and device permissions, which may complicate setup. If this image is primarily for development/local use, running as root may be acceptable.

🛡️ Optional: Add non-root user for hardened deployments
 RUN echo "build_timestamp: $(date -u +%Y-%m-%dT%H:%M:%SZ)" >> /versions.txt

+# Create non-root user for security (optional: uncomment for hardened deployments)
+# RUN useradd -m -s /bin/bash llama && \
+#     chown -R llama:llama /models /usr/local/bin
+# USER llama

 WORKDIR /models
 CMD ["bash"]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docker/Dockerfile` around lines 140 - 192, The image currently never drops
root (no USER directive) so containers run as root; create a non-root user and
switch to it by adding user creation and a final USER instruction and ensure
file ownership/permissions are adjusted for that user: create a dedicated user
(e.g., "appuser") and group, chown /usr/local/bin, /usr/local/lib, /models and
any runtime-writable paths (also ensure docker/test-binaries.sh remains
executable), and set USER appuser before CMD; for GPU-runner compatibility
optionally add the user to groups like "video" or "render" (or expose build args
to control group membership) so binaries like llama-server, whisper-server,
sd-server and llama-swap still validate when run as non-root.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b98cb8e6-f819-4629-bc6b-192bd5a02e99

📥 Commits

Reviewing files that changed from the base of the PR and between f6fe7c5 and 5f6540e.

📒 Files selected for processing (1)
  • docker/Dockerfile

Download prebuilt vulkan binaries from GitHub releases instead of
building from source. This significantly speeds up vulkan image builds.

- install.sh downloads release archives for vulkan llama/sd builds
- build-image.sh resolves latest release tags for vulkan builds
- whisper.cpp still builds from source (no prebuilt vulkan releases)

https://claude.ai/code/session_01Fin2hBgPifbgF8H9kWyEvX

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (1)
docker/build-image.sh (1)

261-279: Presence checks won't catch broken startup.

which only proves the binary is on PATH; it still passes when the image has bad shared-library wiring or an ABI mismatch. This PR already copies docker/test-binaries.sh into the image in docker/Dockerfile Lines 155-156, so calling that here would give you a real post-build smoke test instead of a path check.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docker/build-image.sh` around lines 261 - 279, Replace the superficial
PATH-based checks (the MISSING_BINARIES array and the loop that runs which for
each binary) with a real smoke-test invocation of the test script already copied
into the image; run docker run --rm "${DOCKER_IMAGE_TAG}" and execute the
docker/test-binaries.sh inside the container (use the same path used in the
Dockerfile), capture its exit status and fail the build if it returns non-zero
so shared-library/ABI/startup failures are detected; keep DOCKER_IMAGE_TAG as
the image identifier and remove the which-based logic (MISSING_BINARIES and the
loop) in build-image.sh.
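
The suggested replacement can be as small as a function like this (image tag and in-image script path are assumptions; adjust to the repository's actual layout):

```shell
# Run the in-image smoke test instead of a host-side `which` loop.
# A non-zero exit from the container fails the build, so broken
# shared-library wiring or ABI mismatches are caught at build time.
smoke_test_image() {
    image="$1"
    if docker run --rm "$image" /usr/local/bin/test-binaries.sh; then
        echo "smoke test passed: $image"
    else
        echo "smoke test FAILED: $image" >&2
        return 1
    fi
}
```

Called as `smoke_test_image "$DOCKER_IMAGE_TAG"` after the build, this exercises the same loader paths a user would hit at runtime, which a PATH check cannot.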
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docker/Dockerfile`:
- Around line 186-194: The final image leaves processes running as root (no USER
set), so change the final Dockerfile stage to create a non-root user (e.g.,
"llama" or "llama-swap") and switch to it before CMD/WORKDIR so the runtime
(including llama-swap) does not run as root; ensure files under /models and any
needed sockets/devices are chowned to that user and document that consumers
should pass the matching supplemental group (for /dev/dri) at container runtime
rather than running as root. Locate the final stage around the WORKDIR/CMD
commands and update it to adduser/addgroup, set ownership on /models and other
runtime paths, and add a USER line to drop root privileges while keeping
instructions to supply the supplemental group for /dev/dri access when needed.
- Around line 162-172: The Dockerfile currently copies all project shared libs
into a single /usr/local/lib/ (COPY --from=llama-build /install/lib/, COPY
--from=whisper-build /install/lib/, COPY --from=sd-build /install/lib/) causing
later builds to overwrite earlier .so files; update the Dockerfile and install
process to keep per-project library directories (e.g., /usr/local/lib/llama/,
/usr/local/lib/whisper/, /usr/local/lib/sd/) instead of merging into
/usr/local/lib/, and ensure each project's binaries have correct lookup paths by
setting RPATH on the built binaries or exporting LD_LIBRARY_PATH per service (or
alternatively build static libs) so each binary loads its matching libggml and
backend variants; also adjust docker/install.sh (the step that copies *.so* into
/install/lib) to place files into project-specific subdirs to match the new COPY
destinations.

In `@docker/install.sh`:
- Around line 49-52: The ASSET variable currently hardcodes the Ubuntu 24.04
artifact (ASSET) which mismatches the runtime-vulkan stage (ubuntu:22.04) and
will cause libc/ABI failures; update the install.sh logic to select the correct
release artifact for the runtime (either change ASSET to use Ubuntu-22.04
instead of Ubuntu-24.04, or make ASSET derived from an environment variable or
detected base image) and ensure URL uses that corrected ASSET; reference the
RELEASE_TAG, ASSET and URL variables and validate the docker/Dockerfile
runtime-vulkan base (ubuntu:22.04) so the downloaded sd binary matches the
runtime libc.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: cc6d7a27-2bc5-4490-8e9c-5b3793dd5685

📥 Commits

Reviewing files that changed from the base of the PR and between 5f6540e and b7e0d26.

📒 Files selected for processing (3)
  • docker/Dockerfile
  • docker/build-image.sh
  • docker/install.sh

Comment on lines +162 to +172
COPY --from=llama-build /install/bin/llama-server /usr/local/bin/
COPY --from=llama-build /install/bin/llama-cli /usr/local/bin/
COPY --from=llama-build /install/lib/ /usr/local/lib/

COPY --from=whisper-build /install/bin/whisper-server /usr/local/bin/
COPY --from=whisper-build /install/bin/whisper-cli /usr/local/bin/
COPY --from=whisper-build /install/lib/ /usr/local/lib/

COPY --from=sd-build /install/bin/sd-server /usr/local/bin/
COPY --from=sd-build /install/bin/sd-cli /usr/local/bin/
COPY --from=sd-build /install/lib/ /usr/local/lib/

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Scripts executed (repository: mostlygeek/llama-swap):

# Locate the docker/install.sh and Dockerfile files
find . -name "Dockerfile" -o -name "install.sh" | head -20
fd Dockerfile

# Check sizes and read the files under review
wc -l docker/Dockerfile
cat -n docker/Dockerfile | head -200
wc -l docker/install.sh
cat -n docker/install.sh

# Inspect how shared libraries are produced and installed
rg -A 5 "\.so" docker/install.sh
rg -i "library\|libggml\|\.so" docker/ --max-count=20


Keep each project's shared libraries isolated.

docker/install.sh line 153 copies every *.so* file into /install/lib, and Dockerfile lines 164, 168, and 172 overlay all three project directories into the same /usr/local/lib/. Since llama.cpp, whisper.cpp, and stable-diffusion.cpp all use GGML, they share libggml.so and backend-specific libraries. Later COPYs overwrite earlier ones—sd-build's libraries override whisper-build's, which override llama-build's. A binary may load the wrong version or backend variant of a library at runtime. Keep them under per-project library paths and set RPATH or LD_LIBRARY_PATH per binary, or build static.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docker/Dockerfile` around lines 162 - 172, The Dockerfile currently copies
all project shared libs into a single /usr/local/lib/ (COPY --from=llama-build
/install/lib/, COPY --from=whisper-build /install/lib/, COPY --from=sd-build
/install/lib/) causing later builds to overwrite earlier .so files; update the
Dockerfile and install process to keep per-project library directories (e.g.,
/usr/local/lib/llama/, /usr/local/lib/whisper/, /usr/local/lib/sd/) instead of
merging into /usr/local/lib/, and ensure each project's binaries have correct
lookup paths by setting RPATH on the built binaries or exporting LD_LIBRARY_PATH
per service (or alternatively build static libs) so each binary loads its
matching libggml and backend variants; also adjust docker/install.sh (the step
that copies *.so* into /install/lib) to place files into project-specific
subdirs to match the new COPY destinations.
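
One way the per-project layout could look in the final stage; this is a sketch under the assumption that the stage names and /install layout stay as in this PR, with an LD_LIBRARY_PATH wrapper standing in for RPATH patching or static linking:

```dockerfile
# Hypothetical sketch: isolate each project's shared libraries so no build's
# libggml/backend .so files overwrite another's.
COPY --from=llama-build   /install/lib/ /usr/local/lib/llama/
COPY --from=whisper-build /install/lib/ /usr/local/lib/whisper/
COPY --from=sd-build      /install/lib/ /usr/local/lib/sd/

# One wrapper per server scopes LD_LIBRARY_PATH to that project's directory,
# so llama-server always resolves the libggml variant it was built against.
RUN printf '#!/bin/sh\nexport LD_LIBRARY_PATH=/usr/local/lib/llama\nexec /usr/local/bin/llama-server "$@"\n' \
        > /usr/local/bin/llama-server-wrapped && \
    chmod +x /usr/local/bin/llama-server-wrapped
```

The same wrapper pattern would repeat for whisper-server and sd-server; alternatively, `patchelf --set-rpath` on each binary at install time avoids the wrappers entirely.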

Comment on lines +186 to +194
RUN echo "llama.cpp: ${LLAMA_COMMIT_HASH}" > /versions.txt && \
    echo "whisper.cpp: ${WHISPER_COMMIT_HASH}" >> /versions.txt && \
    echo "stable-diffusion.cpp: ${SD_COMMIT_HASH}" >> /versions.txt && \
    echo "llama-swap: $(cat /tmp/llama-swap-version)" >> /versions.txt && \
    echo "backend: ${BACKEND}" >> /versions.txt && \
    echo "build_timestamp: $(date -u +%Y-%m-%dT%H:%M:%SZ)" >> /versions.txt

WORKDIR /models
CMD ["bash"]

⚠️ Potential issue | 🟠 Major

Drop root before the final image is published.

No USER is set in the final runtime stage, so every upstream server and llama-swap process runs as root. If /dev/dri access is needed, add the matching supplemental group at runtime instead of keeping full root privileges.

One way to harden the final stage
 RUN echo "llama.cpp: ${LLAMA_COMMIT_HASH}" > /versions.txt && \
     echo "whisper.cpp: ${WHISPER_COMMIT_HASH}" >> /versions.txt && \
     echo "stable-diffusion.cpp: ${SD_COMMIT_HASH}" >> /versions.txt && \
     echo "llama-swap: $(cat /tmp/llama-swap-version)" >> /versions.txt && \
     echo "backend: ${BACKEND}" >> /versions.txt && \
     echo "build_timestamp: $(date -u +%Y-%m-%dT%H:%M:%SZ)" >> /versions.txt
+
+RUN useradd --create-home --shell /bin/bash --uid 10001 --user-group appuser && \
+    mkdir -p /models && \
+    chown -R appuser:appuser /models
+
+USER appuser
 
 WORKDIR /models
 CMD ["bash"]
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
RUN echo "llama.cpp: ${LLAMA_COMMIT_HASH}" > /versions.txt && \
    echo "whisper.cpp: ${WHISPER_COMMIT_HASH}" >> /versions.txt && \
    echo "stable-diffusion.cpp: ${SD_COMMIT_HASH}" >> /versions.txt && \
    echo "llama-swap: $(cat /tmp/llama-swap-version)" >> /versions.txt && \
    echo "backend: ${BACKEND}" >> /versions.txt && \
    echo "build_timestamp: $(date -u +%Y-%m-%dT%H:%M:%SZ)" >> /versions.txt

WORKDIR /models
CMD ["bash"]

RUN echo "llama.cpp: ${LLAMA_COMMIT_HASH}" > /versions.txt && \
    echo "whisper.cpp: ${WHISPER_COMMIT_HASH}" >> /versions.txt && \
    echo "stable-diffusion.cpp: ${SD_COMMIT_HASH}" >> /versions.txt && \
    echo "llama-swap: $(cat /tmp/llama-swap-version)" >> /versions.txt && \
    echo "backend: ${BACKEND}" >> /versions.txt && \
    echo "build_timestamp: $(date -u +%Y-%m-%dT%H:%M:%SZ)" >> /versions.txt

RUN useradd --create-home --shell /bin/bash --uid 10001 --user-group appuser && \
    mkdir -p /models && \
    chown -R appuser:appuser /models

USER appuser

WORKDIR /models
CMD ["bash"]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docker/Dockerfile` around lines 186 - 194, The final image leaves processes
running as root (no USER set), so change the final Dockerfile stage to create a
non-root user (e.g., "llama" or "llama-swap") and switch to it before
CMD/WORKDIR so the runtime (including llama-swap) does not run as root; ensure
files under /models and any needed sockets/devices are chowned to that user and
document that consumers should pass the matching supplemental group (for
/dev/dri) at container runtime rather than running as root. Locate the final
stage around the WORKDIR/CMD commands and update it to adduser/addgroup, set
ownership on /models and other runtime paths, and add a USER line to drop root
privileges while keeping instructions to supply the supplemental group for
/dev/dri access when needed.

Comment on lines +49 to +52
SD_BRANCH=$(echo "$RELEASE_TAG" | cut -d'-' -f1)
SD_HASH=$(echo "$RELEASE_TAG" | rev | cut -d'-' -f1 | rev)
ASSET="sd-${SD_BRANCH}-${SD_HASH}-bin-Linux-Ubuntu-24.04-x86_64-vulkan.zip"
URL="https://github.com/leejet/stable-diffusion.cpp/releases/download/${RELEASE_TAG}/${ASSET}"

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# Check docker/install.sh lines 49-52 and surrounding context
cat -n docker/install.sh | sed -n '45,60p'

Repository: mostlygeek/llama-swap

Length of output: 1167


🏁 Script executed:

# Check docker/Dockerfile to verify the runtime-vulkan stage base image
cat -n docker/Dockerfile | sed -n '130,145p'

Repository: mostlygeek/llama-swap

Length of output: 706


🏁 Script executed:

# Look for other hardcoded Ubuntu versions in install.sh to understand the pattern
grep -n "Ubuntu" docker/install.sh

Repository: mostlygeek/llama-swap

Length of output: 155


🏁 Script executed:

# Check if there are other Ubuntu versions or variants mentioned in the script
grep -i "ubuntu\|artifact\|asset" docker/install.sh | head -20

Repository: mostlygeek/llama-swap

Length of output: 656


🏁 Script executed:

# Check if there's any conditional logic or other sd backend handling
grep -B5 -A10 "\"sd\")" docker/install.sh | head -40

Repository: mostlygeek/llama-swap

Length of output: 47


Don't ship the Ubuntu 24.04 sd artifact in a 22.04 image.

ASSET on line 51 is hardcoded to the ...Ubuntu-24.04... Vulkan release, but the final runtime-vulkan stage in docker/Dockerfile line 132 is ubuntu:22.04. This mismatch causes GLIBC/ABI incompatibility—binaries compiled against Ubuntu 24.04's libc will fail at startup with missing symbol errors on 22.04. Either align the runtime to Ubuntu 24.04 or build Vulkan sd from source on 22.04 instead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docker/install.sh` around lines 49 - 52, The ASSET variable currently
hardcodes the Ubuntu 24.04 artifact (ASSET) which mismatches the runtime-vulkan
stage (ubuntu:22.04) and will cause libc/ABI failures; update the install.sh
logic to select the correct release artifact for the runtime (either change
ASSET to use Ubuntu-22.04 instead of Ubuntu-24.04, or make ASSET derived from an
environment variable or detected base image) and ensure URL uses that corrected
ASSET; reference the RELEASE_TAG, ASSET and URL variables and validate the
docker/Dockerfile runtime-vulkan base (ubuntu:22.04) so the downloaded sd binary
matches the runtime libc.
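
A sketch of deriving the asset name from the runtime's Ubuntu version rather than hardcoding 24.04. Treating UBUNTU_VERSION as a variable passed in to match the base image, and the RELEASE_TAG value here, are assumptions for illustration:

```shell
#!/usr/bin/env bash
set -euo pipefail

# UBUNTU_VERSION would be supplied (e.g. as a build arg) to match the
# runtime-vulkan base image; default to 22.04, the current runtime base.
UBUNTU_VERSION="${UBUNTU_VERSION:-22.04}"
RELEASE_TAG="master-abc1234"   # hypothetical <branch>-<hash> tag for illustration

SD_BRANCH=$(echo "$RELEASE_TAG" | cut -d'-' -f1)            # text before the first '-'
SD_HASH=$(echo "$RELEASE_TAG" | rev | cut -d'-' -f1 | rev)  # text after the last '-'
ASSET="sd-${SD_BRANCH}-${SD_HASH}-bin-Linux-Ubuntu-${UBUNTU_VERSION}-x86_64-vulkan.zip"
URL="https://github.com/leejet/stable-diffusion.cpp/releases/download/${RELEASE_TAG}/${ASSET}"

echo "$ASSET"
```

With the version threaded through as a single variable, bumping the runtime base image only requires changing one build arg instead of editing the asset string by hand.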

@mostlygeek mostlygeek marked this pull request as draft March 21, 2026 11:51
@mostlygeek mostlygeek closed this Mar 22, 2026
@mostlygeek mostlygeek deleted the claude/unified-ai-docker-container-G9FdP branch March 22, 2026 23:55