build: add cuda13 architecture support #551
Conversation
Walkthrough

The pull request adds "cuda13" to the container build matrix in the GitHub Actions workflow and extends the allowed architectures list in the build script to permit the new platform during validation.
Pull request overview
Adds a new Docker build “architecture” option (cuda13) so this repo can build and publish llama-swap images on top of the upstream ghcr.io/ggml-org/llama.cpp CUDA 13 server images.
Changes:
- Allow `cuda13` as a valid `ARCH` in `docker/build-container.sh`.
- Add `cuda13` to the GitHub Actions container build matrix so CI builds it automatically.
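A minimal sketch of what such an `ARCH` validation could look like in `docker/build-container.sh` (the variable names and the full allowed list here are assumptions for illustration, not the script's verified contents):

```shell
#!/bin/sh
# Illustrative sketch only; the real build-container.sh may name things differently.
ARCH="${1:-cuda}"
ALLOWED_ARCHS="cpu cuda cuda13 vulkan intel"  # assumed list, now including cuda13

# Match the requested arch against the space-separated allow list.
case " $ALLOWED_ARCHS " in
  *" $ARCH "*)
    echo "building architecture: $ARCH"
    ;;
  *)
    echo "error: unknown architecture '$ARCH' (allowed: $ALLOWED_ARCHS)" >&2
    exit 1
    ;;
esac
```

The `case`-based whitelist keeps the script POSIX-portable (no bashisms), so adding a new arch like `cuda13` is a one-word change.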
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| `docker/build-container.sh` | Extends the allowed architecture list to include `cuda13` so the script will construct and build tags for it. |
| `.github/workflows/containers.yml` | Extends the CI matrix to run the build for `cuda13` on schedule/manual runs (and build-only on workflow-change pushes). |
🧹 Nitpick comments (1)
.github/workflows/containers.yml (1)
Line 32: Consider centralizing supported platform definitions to avoid drift.
`matrix.platform` and `ALLOWED_ARCHS` now require manual sync. A shared source (e.g., a checked-in arch list consumed by both workflow and script) would reduce future mismatch risk.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/containers.yml at line 32, Centralize the supported platform list by creating a single checked-in source (e.g., platforms.json or SUPPORTED_ARCHS.txt) and update both matrix.platform in the GitHub Actions workflow and the ALLOWED_ARCHS variable used by scripts to read from that file; modify the workflow to load the list (via fromJSON or matrix generation step) and change the script entrypoint to parse the same file (referencing ALLOWED_ARCHS and matrix.platform) so both consumers derive the values from the single source to prevent drift.
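One way the single-source suggestion could be realized is sketched below; the file name `allowed-archs.txt`, the `jq` matrix generation step, and the workflow wiring are assumptions for illustration, not existing repo files:

```shell
#!/bin/sh
# Hypothetical single source of truth for supported architectures; the file
# name and helper names are illustrative, not the repo's actual layout.
ARCH_FILE="$(mktemp)"
printf '%s\n' cpu cuda cuda13 vulkan > "$ARCH_FILE"   # stand-in for a checked-in list

ARCH="${1:-cuda13}"

# build-container.sh side: validate against the shared list (exact line match).
if grep -qx "$ARCH" "$ARCH_FILE"; then
  echo "ok: $ARCH"
else
  echo "unsupported architecture: $ARCH" >&2
  exit 1
fi

# Workflow side: a generator step could emit the same list as a JSON matrix,
# e.g. with jq:  jq -Rnc '[inputs]' < allowed-archs.txt
# and the job would consume it via: ${{ fromJSON(needs.gen.outputs.platforms) }}
MATRIX=$(sed 's/^/"/; s/$/"/' "$ARCH_FILE" | paste -sd, -)
echo "matrix: [$MATRIX]"
```

Both consumers then read one file, so adding a future arch (e.g., a CUDA 14 image) is a single-line change that cannot drift between script and workflow.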
Thanks!
Add `cuda13` as a supported build architecture, targeting the `ghcr.io/ggml-org/llama.cpp:server-cuda13` upstream base image. The `server-cuda13` image ships with CUDA 13 libraries, providing improved performance on recent NVIDIA hardware compared to the existing `server-cuda` (CUDA 12) image. Users with newer GPUs (e.g., RTX 50-series) benefit from reduced model load latency and higher token throughput.

- Add `cuda13` to the allowed architectures list in `docker/build-container.sh`
- Add `cuda13` to the CI matrix in `.github/workflows/containers.yml` so the container is built and pushed automatically
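For illustration, the mapping from the new arch to the upstream base image might look like the sketch below; the `BASE_IMAGE` build-arg, Dockerfile path, and tag naming are assumptions about the script's interface, not verified details:

```shell
#!/bin/sh
# Sketch only: derive the upstream llama.cpp server image from the arch and
# show the docker build invocation. The command is echoed rather than
# executed so the sketch is safe to run anywhere.
ARCH="cuda13"
BASE_IMAGE="ghcr.io/ggml-org/llama.cpp:server-${ARCH}"
TAG="llama-swap:${ARCH}"

echo docker build \
  --build-arg "BASE_IMAGE=${BASE_IMAGE}" \
  -t "$TAG" \
  -f docker/Dockerfile .
```

Because the base image tag is derived mechanically from `$ARCH`, supporting `cuda13` only requires the upstream `server-cuda13` tag to exist, which it does on `ghcr.io/ggml-org/llama.cpp`.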