
feat: add whisper.cpp server to cuda and musa containers#507

Open
rare-magma wants to merge 5 commits into mostlygeek:main from rare-magma:whisper-cpp

Conversation

Contributor

@rare-magma rare-magma commented Feb 3, 2026

Add whisper-server from the whisper.cpp Docker image to the cuda, musa and vulkan containers.

Summary by CodeRabbit

  • New Features

    • Added support for whisper.cpp / whisper-server as a local model option and deployable service, included in nightly container images for supported architectures (cuda, musa, vulkan), and exposed via a transcription endpoint.
  • Documentation

    • Updated README and install notes to list whisper.cpp among supported local servers and note inclusion of whisper-server in relevant images; added example service configuration showing endpoint and startup command.
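
The example service configuration mentioned above might look roughly like the sketch below. The binary path and checkEndpoint come from this PR; the model filename, host/port flags, and the `${PORT}` macro substitution are assumptions about how the entry is written, not the PR's exact values.

```yaml
# Hypothetical sketch of the whisper entry in docker/config.example.yaml.
# Model path and command-line flags are illustrative assumptions.
models:
  whisper:
    checkEndpoint: /v1/audio/transcriptions/
    cmd: |
      /app/whisper-server
      --host 127.0.0.1 --port ${PORT}
      --model /models/ggml-base.en.bin
```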

Signed-off-by: rare-magma <rare-magma@posteo.eu>

coderabbitai bot commented Feb 3, 2026


Walkthrough

Adds whisper.cpp/whisper-server support: README updates, new WH_IMAGE/WH_TAG build variables, architecture-gated whisper layering in the container build script, a new multi-stage Containerfile to inject whisper-server, and an example whisper model entry in docker/config.example.yaml.

Changes

  • Documentation (README.md): Added whisper.cpp to the list of local OpenAI-compatible servers and noted whisper-server inclusion for cuda, musa, and vulkan Docker installs.
  • Build script (docker/build-container.sh): Introduced WH_IMAGE and WH_TAG env vars; extended the per-architecture build loop with an architecture-gated whisper-server layering step for cuda, musa, and vulkan using docker/llama-swap-whisper.Containerfile, passing BASE, WH_IMAGE, WH_TAG, UID, and GID.
  • Service configuration (docker/config.example.yaml): Added a whisper entry under models with checkEndpoint: /v1/audio/transcriptions/ and a cmd block to start the whisper server (host/port, model path, request/inference routes).
  • Containerfile (docker/llama-swap-whisper.Containerfile): New multi-stage Containerfile with ARGs WH_IMAGE, WH_TAG, BASE, UID, GID; copies /app/build/bin/whisper-server and shared *.so* files from the WH image into the final BASE image and sets ownership to UID:GID.
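
From the file summaries above, the new Containerfile can be sketched roughly as follows. The `ws-source` stage name and the two COPY lines appear in the review below; the registry default and tag are assumptions, and the actual file is docker/llama-swap-whisper.Containerfile.

```dockerfile
# Hypothetical reconstruction of docker/llama-swap-whisper.Containerfile;
# the WH_IMAGE default and WH_TAG value are illustrative assumptions.
ARG WH_IMAGE=ghcr.io/ggml-org/whisper.cpp
ARG WH_TAG=main-cuda
ARG BASE

FROM ${WH_IMAGE}:${WH_TAG} AS ws-source

FROM ${BASE}
ARG UID=10001
ARG GID=10001
COPY --from=ws-source --chown=${UID}:${GID} /app/build/bin/whisper-server /app/whisper-server
COPY --from=ws-source --chown=${UID}:${GID} /app/build/src/*.so* /app/
```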

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes


🚥 Pre-merge checks | ✅ 3 passed

  • Description Check ✅ Passed: Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check ✅ Passed: The title 'feat: add whisper.cpp server to cuda and musa containers' accurately reflects the main change, which adds whisper.cpp server support to the container builds.
  • Docstring Coverage ✅ Passed: No functions found in the changed files; docstring coverage check skipped.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@docker/llama-swap-whisper.Containerfile`:
- Line 12: The COPY command using the recursive glob pattern `COPY --from=ws-source --chown=${UID}:${GID} /app/build/**/*.so* /app/whisper/` won't work without BuildKit. Either enable BuildKit when building (set `DOCKER_BUILDKIT=1`), or replace the recursive glob with explicit paths matching where the .so files actually land in the ws-source stage (e.g., list the exact subdirectories or filenames under /app/build/). Update the COPY invocation accordingly, or document/enforce BuildKit in the CI/build scripts so the pattern is supported.
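
As a sketch of the explicit-path option, the recursive glob could be replaced with COPY lines naming the library directories directly. The paths below are assumptions about where whisper.cpp's CMake build places its shared objects and would need to be verified against the actual ws-source image layout.

```dockerfile
# Hypothetical explicit-path alternative to the **/*.so* glob; the library
# locations are assumptions to check against the real ws-source stage.
COPY --from=ws-source --chown=${UID}:${GID} /app/build/src/libwhisper.so* /app/whisper/
COPY --from=ws-source --chown=${UID}:${GID} /app/build/ggml/src/libggml*.so* /app/whisper/
```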
🧹 Nitpick comments (1)
docker/build-container.sh (1)

48-48: Consider version pinning for whisper.cpp images.

The WH_TAG=main-${ARCH} pattern always pulls the latest main branch build, unlike llama.cpp which uses versioned build identifiers (e.g., b6981). This could lead to non-reproducible builds if the whisper.cpp upstream changes.

If whisper.cpp publishes versioned tags similar to llama.cpp, consider adding a fetch_whisper_tag function to retrieve a specific version. Otherwise, this is acceptable for now as it tracks the latest stable main branch.

Also applies to: 116-116
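
A minimal sketch of such a fetch_whisper_tag helper is below, assuming whisper.cpp were to publish versioned container tags of the form `<release>-<arch>`; that naming scheme, and the example release v1.7.4, are assumptions, and resolving the latest release would still need a GitHub API call (not shown).

```shell
#!/bin/sh
# Hypothetical fetch_whisper_tag helper for docker/build-container.sh:
# pin WH_TAG to a specific upstream release instead of the moving
# main-${ARCH} tag. The <release>-<arch> scheme is an assumption.
fetch_whisper_tag() {
  release="$1"   # e.g. v1.7.4, ideally resolved once via the GitHub releases API
  arch="$2"      # cuda | musa | vulkan
  printf '%s-%s\n' "${release}" "${arch}"
}

WH_TAG=$(fetch_whisper_tag "v1.7.4" "cuda")
echo "${WH_TAG}"
```

This keeps the build reproducible: rebuilding the container later pulls the same whisper.cpp image rather than whatever main happens to be.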

ARG GID=10001

COPY --from=ws-source --chown=${UID}:${GID} /app/build/bin/whisper-server /app/whisper-server
COPY --from=ws-source --chown=${UID}:${GID} /app/build/src/*.so* /app/
Contributor Author

@rare-magma rare-magma Feb 3, 2026


@mostlygeek if the ggml library version diverges between llama-server and whisper-server, then whisper-server might fail, since it was built against a different shared library. To mitigate this, the whisper-server binary would have to be compiled with statically linked libraries. sd-server doesn't have that problem because it is built with statically linked libs.

Owner


This creates some risk that whisper stops working in the llama-swap:cuda and llama-swap:musa containers. Ideally, the container build step would compile a statically linked whisper and copy it in. I attempted that before and found it was not trivial to do.
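
A rough sketch of what such a static builder stage might look like is below; the base image tag, the CMake flags, and the assumption that -DBUILD_SHARED_LIBS=OFF suffices for the CUDA toolchain are all unverified, consistent with the "not trivial" experience described above.

```dockerfile
# Hypothetical builder stage for a statically linked whisper-server.
# Base image tag and CMake flags are assumptions; fully static linking
# against the CUDA toolchain may need further work.
FROM nvidia/cuda:12.4.1-devel-ubuntu22.04 AS ws-build
RUN apt-get update && apt-get install -y --no-install-recommends \
        git cmake build-essential \
 && git clone --depth 1 https://github.com/ggml-org/whisper.cpp /src \
 && cmake -S /src -B /src/build -DGGML_CUDA=ON -DBUILD_SHARED_LIBS=OFF \
 && cmake --build /src/build --target whisper-server -j"$(nproc)"
# The resulting /src/build/bin/whisper-server could then be copied into the
# BASE image without shipping separate libggml/libwhisper shared objects.
```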

Clarified Docker installation instructions by specifying the inclusion of the whisper.cpp server for cuda and musa platforms.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@README.md`:
- Line 16: Hyphenate "OpenAI-compatible" so the phrase reads "Use any local OpenAI-compatible server (llama.cpp, vllm, tabbyAPI, whisper.cpp, etc.)"; locate the exact unhyphenated string in the README and replace it with the hyphenated version.

README.md Outdated
- ✅ Easy to deploy and configure: one binary, one configuration file. no external dependencies
- ✅ On-demand model switching
- ✅ Use any local OpenAI compatible server (llama.cpp, vllm, tabbyAPI, etc.) [before]
- ✅ Use any local OpenAI compatible server (llama.cpp, vllm, tabbyAPI, whisper.cpp, etc.) [after]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Hyphenate "OpenAI-compatible" for grammar consistency.

Minor wording: use "OpenAI-compatible server" instead of "OpenAI compatible server."



