
feat: add whisper.cpp server to cuda and musa containers#507

Open
rare-magma wants to merge 5 commits into mostlygeek:main from rare-magma:whisper-cpp

Conversation

Contributor

@rare-magma rare-magma commented Feb 3, 2026

Add whisper-server from the whisper.cpp Docker image to the cuda, musa and vulkan containers.

Summary by CodeRabbit

  • New Features

    • Added support for whisper.cpp / whisper-server as a local model option and deployable service, included in nightly container images for supported architectures (cuda, musa, vulkan), and exposed via a transcription endpoint.
  • Documentation

    • Updated README and install notes to list whisper.cpp among supported local servers and note inclusion of whisper-server in relevant images; added example service configuration showing endpoint and startup command.
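
The example service configuration mentioned above might look roughly like the sketch below. The binary path and checkEndpoint come from this PR; the model filename, host/port flags, and the `${PORT}` macro substitution are assumptions about how the entry is written, not the PR's exact values.

```yaml
# Hypothetical sketch of the whisper entry in docker/config.example.yaml.
# Model path and command-line flags are illustrative assumptions.
models:
  whisper:
    checkEndpoint: /v1/audio/transcriptions/
    cmd: |
      /app/whisper-server
      --host 127.0.0.1 --port ${PORT}
      --model /models/ggml-base.en.bin
```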

Signed-off-by: rare-magma <rare-magma@posteo.eu>

coderabbitai bot commented Feb 3, 2026


Walkthrough

Adds whisper.cpp/whisper-server support: README updates, new WH_IMAGE/WH_TAG build variables, architecture-gated whisper layering in the container build script, a new multi-stage Containerfile to inject whisper-server, and an example whisper model entry in docker/config.example.yaml.

Changes

  • Documentation (README.md): Added whisper.cpp to the list of local OpenAI-compatible servers and noted whisper-server inclusion for cuda, musa, and vulkan Docker installs.
  • Build script (docker/build-container.sh): Introduced WH_IMAGE and WH_TAG env vars; extended the per-architecture build loop with an architecture-gated whisper-server layering step for cuda, musa, and vulkan using docker/llama-swap-whisper.Containerfile, passing BASE, WH_IMAGE, WH_TAG, UID, and GID.
  • Service configuration (docker/config.example.yaml): Added a whisper entry under models with checkEndpoint: /v1/audio/transcriptions/ and a cmd block to start the whisper server (host/port, model path, request/inference routes).
  • Containerfile (docker/llama-swap-whisper.Containerfile): New multi-stage Containerfile with ARGs WH_IMAGE, WH_TAG, BASE, UID, GID; copies /app/build/bin/whisper-server and shared *.so* files from the WH image into the final BASE image and sets ownership to UID:GID.
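
From the file summaries above, the new Containerfile can be sketched roughly as follows. The `ws-source` stage name and the two COPY lines appear in the review below; the registry default and tag are assumptions, and the actual file is docker/llama-swap-whisper.Containerfile.

```dockerfile
# Hypothetical reconstruction of docker/llama-swap-whisper.Containerfile;
# the WH_IMAGE default and WH_TAG value are illustrative assumptions.
ARG WH_IMAGE=ghcr.io/ggml-org/whisper.cpp
ARG WH_TAG=main-cuda
ARG BASE

FROM ${WH_IMAGE}:${WH_TAG} AS ws-source

FROM ${BASE}
ARG UID=10001
ARG GID=10001
COPY --from=ws-source --chown=${UID}:${GID} /app/build/bin/whisper-server /app/whisper-server
COPY --from=ws-source --chown=${UID}:${GID} /app/build/src/*.so* /app/
```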

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes


🚥 Pre-merge checks | ✅ 3 passed

  • Description Check ✅ Passed: Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check ✅ Passed: The title 'feat: add whisper.cpp server to cuda and musa containers' accurately reflects the main change, which adds whisper.cpp server support to the container builds.
  • Docstring Coverage ✅ Passed: No functions found in the changed files; docstring coverage check skipped.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@docker/llama-swap-whisper.Containerfile`:
- Line 12: The COPY command using the recursive glob pattern `COPY --from=ws-source --chown=${UID}:${GID} /app/build/**/*.so* /app/whisper/` won't work without BuildKit. Either enable BuildKit when building (set `DOCKER_BUILDKIT=1`), or replace the recursive glob with explicit paths matching where the .so files actually land in the ws-source stage (e.g., list the exact subdirectories or filenames under /app/build/). Update the COPY invocation accordingly, or document/enforce BuildKit in the CI/build scripts so the pattern is supported.
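
As a sketch of the explicit-path option, the recursive glob could be replaced with COPY lines naming the library directories directly. The paths below are assumptions about where whisper.cpp's CMake build places its shared objects and would need to be verified against the actual ws-source image layout.

```dockerfile
# Hypothetical explicit-path alternative to the **/*.so* glob; the library
# locations are assumptions to check against the real ws-source stage.
COPY --from=ws-source --chown=${UID}:${GID} /app/build/src/libwhisper.so* /app/whisper/
COPY --from=ws-source --chown=${UID}:${GID} /app/build/ggml/src/libggml*.so* /app/whisper/
```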
🧹 Nitpick comments (1)
docker/build-container.sh (1)

48-48: Consider version pinning for whisper.cpp images.

The WH_TAG=main-${ARCH} pattern always pulls the latest main branch build, unlike llama.cpp which uses versioned build identifiers (e.g., b6981). This could lead to non-reproducible builds if the whisper.cpp upstream changes.

If whisper.cpp publishes versioned tags similar to llama.cpp, consider adding a fetch_whisper_tag function to retrieve a specific version. Otherwise, this is acceptable for now as it tracks the latest stable main branch.

Also applies to: 116-116
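
A minimal sketch of such a fetch_whisper_tag helper is below, assuming whisper.cpp were to publish versioned container tags of the form `<release>-<arch>`; that naming scheme, and the example release v1.7.4, are assumptions, and resolving the latest release would still need a GitHub API call (not shown).

```shell
#!/bin/sh
# Hypothetical fetch_whisper_tag helper for docker/build-container.sh:
# pin WH_TAG to a specific upstream release instead of the moving
# main-${ARCH} tag. The <release>-<arch> scheme is an assumption.
fetch_whisper_tag() {
  release="$1"   # e.g. v1.7.4, ideally resolved once via the GitHub releases API
  arch="$2"      # cuda | musa | vulkan
  printf '%s-%s\n' "${release}" "${arch}"
}

WH_TAG=$(fetch_whisper_tag "v1.7.4" "cuda")
echo "${WH_TAG}"
```

This keeps the build reproducible: rebuilding the container later pulls the same whisper.cpp image rather than whatever main happens to be.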

ARG GID=10001

COPY --from=ws-source --chown=${UID}:${GID} /app/build/bin/whisper-server /app/whisper-server
COPY --from=ws-source --chown=${UID}:${GID} /app/build/src/*.so* /app/
Contributor Author

@rare-magma rare-magma Feb 3, 2026


@mostlygeek if the ggml library version diverges between llama-server and whisper-server, then whisper-server might fail, since it was built against a different shared library. To mitigate this, the whisper-server binary would have to be compiled with statically linked libraries. sd-server doesn't have that problem because it is built with statically linked libs.

Owner


This creates some risk that whisper stops working in the llama-swap:cuda and llama-swap:musa containers. Ideally, the container build step would compile a statically linked whisper and copy it in. I attempted that before and found it was not trivial to do.
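
A rough sketch of what such a static builder stage might look like is below; the base image tag, the CMake flags, and the assumption that -DBUILD_SHARED_LIBS=OFF suffices for the CUDA toolchain are all unverified, consistent with the "not trivial" experience described above.

```dockerfile
# Hypothetical builder stage for a statically linked whisper-server.
# Base image tag and CMake flags are assumptions; fully static linking
# against the CUDA toolchain may need further work.
FROM nvidia/cuda:12.4.1-devel-ubuntu22.04 AS ws-build
RUN apt-get update && apt-get install -y --no-install-recommends \
        git cmake build-essential \
 && git clone --depth 1 https://github.com/ggml-org/whisper.cpp /src \
 && cmake -S /src -B /src/build -DGGML_CUDA=ON -DBUILD_SHARED_LIBS=OFF \
 && cmake --build /src/build --target whisper-server -j"$(nproc)"
# The resulting /src/build/bin/whisper-server could then be copied into the
# BASE image without shipping separate libggml/libwhisper shared objects.
```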

Clarified Docker installation instructions by specifying the inclusion of the whisper.cpp server for cuda and musa platforms.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@README.md`:
- Line 16: Hyphenate "OpenAI-compatible" so the phrase reads "Use any local OpenAI-compatible server (llama.cpp, vllm, tabbyAPI, whisper.cpp, etc.)"; locate the exact unhyphenated string in the README and replace it with the hyphenated version.

README.md Outdated
- ✅ Easy to deploy and configure: one binary, one configuration file. no external dependencies
- ✅ On-demand model switching
- ✅ Use any local OpenAI compatible server (llama.cpp, vllm, tabbyAPI, etc.) [before]
- ✅ Use any local OpenAI compatible server (llama.cpp, vllm, tabbyAPI, whisper.cpp, etc.) [after]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Hyphenate "OpenAI-compatible" for grammar consistency.

Minor wording: use "OpenAI-compatible server" instead of "OpenAI compatible server."



