feat: add whisper.cpp server to cuda and musa containers #507
rare-magma wants to merge 5 commits into mostlygeek:main
Conversation
Signed-off-by: rare-magma <rare-magma@posteo.eu>
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
Walkthrough: Adds whisper.cpp/whisper-server support and README updates.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ 3 passed
No actionable comments were generated in the recent review.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@docker/llama-swap-whisper.Containerfile`:
- Line 12: The Docker COPY command using the recursive glob pattern "COPY
--from=ws-source --chown=${UID}:${GID} /app/build/**/*.so* /app/whisper/" won't
work without BuildKit; either enable BuildKit when building (set
DOCKER_BUILDKIT=1) or replace the recursive glob with explicit paths that match
where the .so files actually land in the ws-source stage (e.g., list the exact
subdirectories or filenames under /app/build/). Update the COPY invocation
accordingly or document/enforce BuildKit in the CI/build scripts so the pattern
is supported.
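A minimal sketch of the explicit-paths option. Single-level globs work in the classic builder; only the recursive `**` pattern requires BuildKit. The source directories below are assumptions about whisper.cpp's CMake build layout and must be verified against the actual ws-source stage:

```dockerfile
# Sketch only: explicit single-level globs instead of the recursive "**"
# pattern, so the COPY also works without BuildKit. The build/src/ and
# build/ggml/src/ locations are assumptions and should be checked.
COPY --from=ws-source --chown=${UID}:${GID} /app/build/src/*.so* /app/whisper/
COPY --from=ws-source --chown=${UID}:${GID} /app/build/ggml/src/*.so* /app/whisper/
```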
🧹 Nitpick comments (1)
docker/build-container.sh (1)
48-48: Consider version pinning for whisper.cpp images. The `WH_TAG=main-${ARCH}` pattern always pulls the latest main branch build, unlike llama.cpp which uses versioned build identifiers (e.g., `b6981`). This could lead to non-reproducible builds if the whisper.cpp upstream changes. If whisper.cpp publishes versioned tags similar to llama.cpp, consider adding a `fetch_whisper_tag` function to retrieve a specific version. Otherwise, this is acceptable for now as it tracks the latest stable main branch.

Also applies to: 116-116
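A hedged sketch of what such a helper could look like in build-container.sh. The `b<NNNN>-<arch>` tag format mirrors llama.cpp's versioned build identifiers and is an assumption about what whisper.cpp would publish; the function name `fetch_whisper_tag` is the one suggested above:

```shell
# Hypothetical fetch_whisper_tag: given an arch and a list of published
# image tags, select the newest versioned tag (llama.cpp-style "b<NNNN>"),
# falling back to the rolling "main-<arch>" tag when none exists.
fetch_whisper_tag() {
  arch="$1"
  shift
  # printf emits one candidate tag per line; keep only "b<digits>-<arch>"
  # and take the highest version (GNU sort -V).
  best=$(printf '%s\n' "$@" | grep -E "^b[0-9]+-${arch}\$" | sort -V | tail -n 1)
  echo "${best:-main-${arch}}"
}
```

For example, `WH_TAG=$(fetch_whisper_tag cuda b6980-cuda b6981-cuda main-cuda)` would pin `b6981-cuda`, while an arch with no versioned tags falls back to `main-<arch>` as the script does today.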
Signed-off-by: rare-magma <rare-magma@posteo.eu>
ARG GID=10001

COPY --from=ws-source --chown=${UID}:${GID} /app/build/bin/whisper-server /app/whisper-server
COPY --from=ws-source --chown=${UID}:${GID} /app/build/src/*.so* /app/
@mostlygeek if the ggml library version diverges between llama-server and whisper-server, then whisper-server might fail because it was built against a different version of the shared library. To mitigate this, the whisper-server binary would have to be compiled with statically linked libraries. sd-server doesn't have that problem because it is built with statically linked libs.
This creates some risk where whisper stops working with the llama-swap:cuda and musa containers. Ideally, in the container build step a statically compiled whisper is built and copied in. I attempted that before and found that it was not trivial to do.
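One possible direction, sketched with stock CMake options: `BUILD_SHARED_LIBS` is a standard CMake variable, but the base image, repository URL, and whether the CUDA pieces actually end up statically linked are all assumptions here, and the last point is exactly the non-trivial part noted above:

```dockerfile
# Sketch only: build whisper-server against static ggml/whisper libs in
# the ws-source stage, so the binary no longer depends on the shared
# ggml .so files that llama-server ships. Base image, repo URL, and
# flags are assumptions, not a tested recipe.
FROM nvidia/cuda:12.4.0-devel-ubuntu22.04 AS ws-source
RUN apt-get update && apt-get install -y git cmake build-essential
RUN git clone --depth 1 https://github.com/ggml-org/whisper.cpp /app
WORKDIR /app
RUN cmake -B build -DGGML_CUDA=ON -DBUILD_SHARED_LIBS=OFF \
 && cmake --build build --config Release --target whisper-server -j"$(nproc)"
```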
Clarified Docker installation instructions by specifying the inclusion of the whisper.cpp server for cuda and musa platforms.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@README.md`:
- Line 16: Update the README phrase "Use any local OpenAI compatible server
(llama.cpp, vllm, tabbyAPI, whisper.cpp, etc.)" to hyphenate "OpenAI-compatible"
so it reads "Use any local OpenAI-compatible server (llama.cpp, vllm, tabbyAPI,
whisper.cpp, etc.)" to ensure grammatical consistency; locate the exact string
in the README and replace the unhyphenated version with the hyphenated one.
README.md (outdated)

  - ✅ Easy to deploy and configure: one binary, one configuration file. no external dependencies
  - ✅ On-demand model switching
- - ✅ Use any local OpenAI compatible server (llama.cpp, vllm, tabbyAPI, etc.)
+ - ✅ Use any local OpenAI compatible server (llama.cpp, vllm, tabbyAPI, whisper.cpp, etc.)
Hyphenate “OpenAI‑compatible” for grammar consistency.
Minor wording: use “OpenAI‑compatible server” instead of “OpenAI compatible server.”
🧰 Tools
🪛 LanguageTool
[grammar] ~16-~16: Use a hyphen to join words.
Context: ...model switching - ✅ Use any local OpenAI compatible server (llama.cpp, vllm, tabb...
(QB_NEW_EN_HYPHEN)
Signed-off-by: rare-magma <rare-magma@posteo.eu>
Signed-off-by: rare-magma <rare-magma@posteo.eu>
Add whisper-server from whisper.cpp docker image for cuda, musa and vulkan containers.