Add tts-cpp/ subtree (Chatterbox Turbo + Multilingual + Supertonic TTS) + integration fixes by GustavoA1604 · Pull Request #14 · tetherto/qvac-ext-lib-whisper.cpp

GustavoA1604 · 2026-05-06T16:34:23Z

Summary

Adds the tts-cpp/ in-tree subtree alongside the existing parakeet-cpp/ subtree, completing the QVAC speech stack inside this whisper.cpp fork. tts-cpp/ is a port of gianni-cor/chatterbox.cpp — Resemble AI's Chatterbox TTS (Turbo, English-only; Multilingual, 18 tier-1 languages) plus Supertonic TTS — running on ggml with CPU / Metal / Vulkan / CUDA backends, no Python/PyTorch dependency.

The PR also wires the subtree into the QVAC speech-stack consumption pattern (ggml-speech vcpkg port from qvac-ext-ggml/speech), folds in the round-3 review fixes that landed in the standalone repo after the initial port, and adds one parakeet-cpp streaming-API change required by downstream consumers.

16 commits
103 files changed, +57 184 lines (subtree drop is the bulk; the integration / review commits are small)
Branch: tts-cpp → master

What's in the PR

1. `tts-cpp/` subtree drop (commit `ef840d5`)

Initial squashed import from the standalone chatterbox.cpp source-of-truth. Layout follows parakeet-cpp/:

tts-cpp/include/tts-cpp/ — public C++ API (chatterbox/engine.h, supertonic/engine.h, backend.h, log.h, tts-cpp.h, export.h).
tts-cpp/src/ — implementation (Chatterbox Turbo + MTL T3, Supertonic flow + vocoder, S3Gen, S3Tokenizer, CAMPPlus voice encoder, mel2wav HiFT, MTL tokenizer, GPT-2 BPE, log).
tts-cpp/test/ — gtest suite mirroring each engine stage (T3, S3Gen, vocoder, CAMPPlus, MTL tokenizer, streaming, etc.).
tts-cpp/scripts/ — Python conversion scripts (HF safetensors → GGUF for T3-Turbo / T3-MTL / S3Gen / Supertonic2; reference-trace dumps for diffing against PyTorch; voice extraction).
tts-cpp/CMakeLists.txt + tts-cpp/cmake/tts-cppConfig.cmake.in — install/export, OpenMP detection, find_package(tts-cpp CONFIG) consumer surface.
tts-cpp/README.md + tts-cpp/PROGRESS.md + tts-cpp/PROGRESS_SUPERTONIC.md — developer docs.

Performance summary (full table in tts-cpp/README.md):

Turbo, Vulkan RTX 5090, Q4_0: 463 ms wall, 13.8× faster than ONNX Runtime CPU Q4.
Multilingual, Metal M3 Ultra, Q4_0 + --cfm-steps 7: 1.05 s wall, 48.4× faster than ONNX Runtime CPU Q4.

2. Top-level `README.md` pointer (commit `a2f2dd6`)

Adds a small "QVAC speech-stack ports" section between the upstream whisper.cpp intro media and the Quick start, pointing at both tts-cpp/ and parakeet-cpp/ as in-tree subtrees with their own READMEs / build flows / public C++ APIs. Upstream whisper.cpp build below is unaffected.

3. Integration deltas vs the standalone repo (commits `fa0d490`, `ae34c58`, `8ba10a6`, `e673182`)

The standalone chatterbox.cpp ships with scripts/setup-ggml.sh + patches/ for bundled-ggml dev builds. Per reviewer guidance, the in-tree subtree is converged on a single ggml source-of-truth — the QVAC speech-stack ggml-speech vcpkg port (which ships the patches pre-applied) — instead of carrying a parallel patches tree:

fa0d490 (review #5+#6): default TTS_CPP_USE_SYSTEM_GGML=ON in this subtree; bundled-ggml branch hard-errors with a pointer at the canonical consumption path; deletes tts-cpp/scripts/setup-ggml.sh (it referenced a patches/ directory that no longer exists, so running it would have errored out anyway).
ae34c58 (review #26): rewrite tts-cpp/README.md §1 from "clone + setup-ggml.sh + patches" to "build via the QVAC speech stack"; flip the TTS_CPP_USE_SYSTEM_GGML row default in the cmake-options table; rewrite Consumer integration from the standalone perspective to the wrapper-port perspective; relabel benchmark and code-tree references from chatterbox.cpp/ to tts-cpp/ where they refer to this directory (kept where they refer to the upstream project itself).
8ba10a6 (review N8): drop the now-unreachable TTS_CPP_GGML_LIB_PREFIX block (the find_package(ggml) path resolves libspeech-ggml-* natively from the vcpkg port, so the local-rename helper is dead code in this subtree). README row updated with a strikethrough + "n/a in this subtree" note pointing at the standalone repo for the OFF path.
e673182 (review N10): scrub two stale patches/ references from tts-cpp/README.md (§3.24 Metal-optimisation explanation and the repository-layout tree); rewrite to attribute the patch to the ggml-speech vcpkg port. Five other patches/ mentions stay on purpose — they're cross-references to the standalone repo or describe the rejection rationale.

4. Round-3 review fixes mirrored from standalone (commits `4b5d2d7`, `28ef67d`, `e8f6065`, `04b87ea`, `64abb81`, `1963f9f`, `942686d`)

Re-syncs the subtree with the standalone source-of-truth for fixes that landed there after the initial port. Each commit is a 1:1 git apply --directory=tts-cpp/ of the corresponding upstream diff:

4b5d2d7 (review N1–N7): Supertonic alive-registry guards thread_local cache teardown vs freed backend (N1); Engine BPE try/catch + dead-state cleanup (N3, N6, N7); log docstring softening (N2); s3gen cancel checkpoint between STFT and HiFT + Engine::cancel() doc (N4); s3gen_preload/unload refcount semantics on the public header (N5).
28ef67d: drop internal ggml-quants.h include from supertonic_gguf.cpp; use ggml_get_type_traits()->to_float instead. Required because ggml-speech doesn't install internal ggml headers. Bit-equivalent runtime.
e8f6065: tts-cppConfig.cmake.in re-imports OpenMP via find_dependency before including the targets file. Without this, consumers of the static archive failed at target-property time with OpenMP::OpenMP_CXX … target was not found.
04b87ea: scope the find_dependency(OpenMP) to COMPONENTS CXX. Bare find_dependency(OpenMP) was probing OpenMP_C in consumers, which fails on bare-make's clang-cl-style toolchain even when CXX-side OpenMP is fine.
64abb81: TTS_CPP_OPENMP option, defaults OFF on Windows non-MinGW (where vcpkg's MSVC toolchain links OpenMP transitively but consumers can't re-probe it). Override via -DTTS_CPP_OPENMP_USER_OVERRIDE=ON -DTTS_CPP_OPENMP=ON. Trade-off: 9 #pragma omp parallel for loops in campplus.cpp run serially in this build mode; CAMPPlus is a small fraction of total synth time.
1963f9f: Engine::backend_device() public API + BackendDevice enum on chatterbox::Engine and supertonic::Engine, mirroring parakeet::Engine::backend_device(). Routes through ggml_backend_get_device + ggml_backend_dev_type; works in both GGML_BACKEND_DL modes. Required by qvac3's tts-ggml addon to mirror ParakeetModel's load-time backend resolution.
942686d: synthesize_batch gates apply_trim_fade on the actual presence of a voice override. The previous unconditional fade was clipping the leading consonant of the first word ("Hello" → "lo", "El" → "l", "A" → nothing) for the chatterbox-mtl built-in-voice path. Reference-audio / voice_dir paths keep the fade.
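The gating rule in 942686d can be pictured with a minimal sketch. The `VoiceOptions` struct and the `should_apply_trim_fade` helper are hypothetical names introduced for illustration; only the rule itself (fade when a reference-audio path or voice_dir is supplied, no fade for the built-in voice) comes from the commit description.

```cpp
#include <string>

// Hypothetical subset of the per-call options; the real struct in tts-cpp may differ.
struct VoiceOptions {
    std::string reference_audio;  // path to a reference clip, empty if unused
    std::string voice_dir;        // directory with a pre-extracted voice, empty if unused
};

// True only when the caller supplied a voice override. With both fields empty
// the engine falls back to the built-in voice, and the fade is skipped so the
// first consonant of the utterance is not clipped.
inline bool should_apply_trim_fade(const VoiceOptions& opts) {
    return !opts.reference_audio.empty() || !opts.voice_dir.empty();
}
```

Under this sketch, a default-constructed `VoiceOptions` yields `false`, and setting either field yields `true`.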

5. MTL Chatterbox correctness fixes (commits `db87f42`, `0b44674`)

Two issues that surfaced once chatterbox-mtl ran end-to-end on real workloads:

db87f42: Engine::run_t3 for MTL wraps text tokens with start_text_token (255) / stop_text_token, matching chatterbox_cli.cpp's tokenisation path and the Python ChatterboxMultilingualTTS.generate reference (mtl_tts.py:288-291). Without it, the autoregressive decode dropped the first speech tokens, audible as a missing leading syllable. Turbo path is unaffected.
0b44674: port the CLI's existing 3-identical-token early-stop guard into Engine::run_t3 (gated on is_mtl). MTL T3 occasionally emits an end-of-speech silence cadence mid-utterance and then hallucinates ~40 s of trailing low-energy junk until n_predict=1000. The CLI was already guarded; the addon path (which doesn't go through the CLI) was hitting the regression on certain language/seed combinations (most reliably reproduced on German with seed=42).
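A rough sketch of what a 3-identical-token early-stop guard could look like follows. The function names, the generic `next_token` callback, and the choice to keep the repeated tokens in the output are assumptions made for illustration; the actual code in Engine::run_t3 and chatterbox_cli.cpp may differ in all three respects.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns true when the last `run_length` entries of `tokens` are the same id.
inline bool ends_with_repeated_token(const std::vector<int32_t>& tokens,
                                     std::size_t run_length = 3) {
    assert(run_length >= 1);
    if (tokens.size() < run_length) return false;
    const int32_t last = tokens.back();
    for (std::size_t i = 1; i < run_length; ++i) {
        if (tokens[tokens.size() - 1 - i] != last) return false;
    }
    return true;
}

// Hypothetical decode loop: `next_token` stands in for one autoregressive step.
// The guard is only active on the multilingual (MTL) path.
template <class NextToken>
std::vector<int32_t> decode(NextToken next_token, bool is_mtl, std::size_t n_predict) {
    std::vector<int32_t> out;
    while (out.size() < n_predict) {
        out.push_back(next_token());
        if (is_mtl && ends_with_repeated_token(out)) break;  // stop before the junk tail
    }
    return out;
}
```

With the guard active, the loop exits as soon as three consecutive identical tokens appear instead of running to `n_predict`.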

6. `parakeet-cpp/` SentencePiece word-start signal (commit `761eca0`)

Adds bool starts_word on parakeet::StreamingSegment, set true when the segment's first token's piece carries the SentencePiece ▁ (U+2581) word-boundary marker. Streaming consumers can use it to decide whether to insert a space between successive segments without re-parsing whitespace from seg.text (the inner detokenizer strips leading whitespace at the session level). Also exposes bool token_is_word_start(BpeVocab, int32_t) from sentencepiece_bpe.h so other engines that build their own segments (EOU per-utterance, attributed) can stamp the flag the same way. Defaults starts_word = true so existing callers are byte-equivalent.
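To illustrate how a streaming consumer might use the flag, here is a minimal sketch. The `Segment` struct is a hypothetical stand-in carrying only the two fields needed here (the real parakeet::StreamingSegment has more), and `piece_is_word_start` and `join_segments` are illustrative helpers, not the library's `token_is_word_start`.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the fields of parakeet::StreamingSegment used below.
struct Segment {
    std::string text;
    bool starts_word = true;  // same default as described in the PR
};

// True when a SentencePiece piece begins with U+2581 (UTF-8 bytes E2 96 81).
inline bool piece_is_word_start(const std::string& piece) {
    return piece.compare(0, 3, "\xE2\x96\x81") == 0;
}

// Concatenates segments, inserting a space only before segments that open a word.
inline std::string join_segments(const std::vector<Segment>& segs) {
    std::string out;
    for (const Segment& seg : segs) {
        if (!out.empty() && seg.starts_word) out += ' ';
        out += seg.text;
    }
    return out;
}
```

In this sketch, two word-start segments "see" and "if" join as "see if", while "pun" followed by a continuation segment "ctuation" rejoins as "punctuation".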

Bundled into this PR rather than its own PR because the parakeet-cpp/ consumer in qvac3/packages/parakeet-ggml and the tts-cpp/ consumer in qvac3/packages/tts-ggml ship via the same ggml-speech vcpkg port version bump; splitting them would force two coordinated registry flips for a single addon release.

Design notes (preempting review questions)

These call out the deliberate choices that look unusual at first glance — flagging here so re-review doesn't re-litigate them.

Why is `TTS_CPP_USE_SYSTEM_GGML=ON` the default in this subtree?

Reviewer guidance (#5, #6): converge the speech stack on a single ggml source-of-truth. The ggml-speech vcpkg port (qvac-ext-ggml/speech) ships the chatterbox-specific Metal / OpenCL patches pre-applied — carrying a parallel patches/ tree inside this subtree would mean two sources of truth for the same patches and a coordination tax on every ggml bump. The bundled-ggml dev flow is preserved in the standalone chatterbox.cpp repo, which keeps setup-ggml.sh + patches/ + TTS_CPP_USE_SYSTEM_GGML=OFF default; this subtree is the integrated artefact, not a parallel dev environment.

-DTTS_CPP_USE_SYSTEM_GGML=OFF here intentionally hard-errors at configure time with a pointer at both the canonical consumption path (ggml-speech vcpkg port) and the standalone repo for users who need bundled ggml.

Why mirror commits from `chatterbox.cpp` instead of squashing into the initial port?

Two reasons:

The initial port (ef840d5) was force-pushed five times during the integration cycle (review-iteration round 1 → round 3); each round-3 fix landed on the standalone repo after this PR's force-push window closed. Mirroring them as separate commits keeps the diff against the standalone source-of-truth reviewable: chatterbox.cpp commit hash → tts-cpp commit, one-to-one.
Each mirror commit captures the upstream commit hash in its message, so a future bisect or audit can trace back to the standalone PR / discussion.

If a squashed history is preferred at merge time, GitHub's "Squash and merge" handles it; the per-commit messages are written to survive that.

Why is the `parakeet-cpp/` change in this PR?

parakeet-cpp/ and tts-cpp/ ship through the same ggml-speech vcpkg port version bump (single port-version flip in qvac-registry-vcpkg). Consumers (qvac3/packages/parakeet-ggml, qvac3/packages/tts-ggml) bump together. Splitting parakeet-cpp/'s starts_word change into its own PR would force two coordinated registry flips for one release. The change is small (+15 lines of API surface, no behaviour change for existing callers) and gated by an additive bool field that defaults to true.

Why is OpenMP defaulted OFF on Windows non-MinGW?

vcpkg's MSVC toolchain port build links OpenMP::OpenMP_CXX into the static archive's transitive interface; consumers — including bare-make's clang-cl CMake — then re-probe OpenMP_CXX (or OpenMP_C) at find_package(tts-cpp) time, and that probe fails on toolchains where CXX-OpenMP isn't auto-detected. Defaulting OFF on the affected toolchain combination keeps the consumer surface portable; the perf cost is bounded to 9 #pragma omp parallel for loops in campplus.cpp (CAMPPlus runs once per voice-encode at session init, small fraction of total synth time). Override available for users on toolchains with working CXX OpenMP.

Why ~57 K added lines?

The bulk is the standalone chatterbox.cpp source dropped under tts-cpp/ (ef840d5). Major contributors:

tts-cpp/src/mtl_unicode_tables.inc — autogenerated NFKD lookup tables (one-time, regenerable via tts-cpp/scripts/gen-nfkd-table.py).
tts-cpp/src/dr_wav.h, tts-cpp/src/npy.h — vendored single-header libs (verbatim upstream copies; their licences are in tts-cpp/NOTICE).
tts-cpp/test/ — gtest suite (~22 test files, one per engine stage).
tts-cpp/PROGRESS.md, tts-cpp/PROGRESS_SUPERTONIC.md — developer notebooks.

Implementation source under tts-cpp/src/ (excluding the autogenerated table and vendored headers) is roughly 12 K lines.

Why doesn't this PR bump `qvac-registry-vcpkg/ports/tts-cpp`?

Per the standalone repo's pre-merge convention: while the PR is open, the port-version 0 entry is force-amended in place to point at the latest tip of this branch. The actual port bump (port-version 0 → port-version 1, or new commit hash for version 0) lands in qvac-registry-vcpkg after this PR merges to master, in a follow-up PR there.

Why are some `chatterbox.cpp` references kept verbatim in `tts-cpp/README.md`?

Five intentional ones survived the §1 rewrite (commit ae34c58):

The title-card upstream URL.
The §1 "use the standalone repo for bundled-ggml dev builds" pointer.
The cmake-options note about the OFF default (lives upstream).
The "How TTS_CPP_USE_SYSTEM_GGML=ON resolves ggml" prose (cross-references the standalone build flow).
The repo-layout caveat naming the source-of-truth.

Each one points at github.com/gianni-cor/chatterbox.cpp by URL or repo name, not at this directory. They stay because the standalone repo is the development source-of-truth for the engine code and we want a single grep to find it.

Why is `BackendDevice` shaped exactly like `parakeet::Engine::backend_device()`?

Intentional API parallelism so the qvac3 addons can share their backend-resolution code path between tts-ggml and parakeet-ggml. Both addons read backend_device() + backend_name() at session init, map through a shared backendIdFromName(), and expose the same RuntimeStats shape on the JS API. Diverging the C++ API would mean two parallel addon-side wrappers for the same data.
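The benefit of the matching shape can be sketched as follows. The enumerator values (CPU = 0, GPU = 1) come from the backend.h commit message; the `enum class` spelling, the `BackendInfo` struct, the `resolve_backend` helper, and the assumption that `backend_name()` returns something convertible to `std::string` are all hypothetical and only illustrate how one code path could serve several engine types.

```cpp
#include <string>

// Enumerator values as stated in the commit message; the spelling is an assumption.
enum class BackendDevice { CPU = 0, GPU = 1 };

// Hypothetical addon-side record of the resolved backend.
struct BackendInfo {
    BackendDevice device;
    std::string name;
};

// Works for any engine type that exposes backend_device() and backend_name()
// with the same shape, so a single helper can serve multiple engines.
template <class Engine>
BackendInfo resolve_backend(const Engine& engine) {
    return BackendInfo{engine.backend_device(), engine.backend_name()};
}
```

If the two engines diverged in method names or return types, a helper like this would need one specialisation per engine.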

Test plan

tts-cpp/ source-of-truth: github.com/gianni-cor/chatterbox.cpp.
parakeet-cpp/ source-of-truth: standalone parakeet.cpp repo.
ggml consumption: qvac-ext-ggml/speech branch via the ggml-speech vcpkg port.
Downstream consumers: qvac3/packages/tts-ggml, qvac3/packages/parakeet-ggml.
Pre-merge port-version policy: amend qvac-registry-vcpkg/ports/tts-cpp port-version 0 in place; the actual port bump lands in qvac-registry-vcpkg in a follow-up after this PR merges to master.

…l-org#6)

The standalone setup-ggml.sh + patches/ tooling was dropped from qvac-ext-lib-whisper.cpp/tts-cpp/ in the integration commit, but the CMakeLists.txt still:

* defaulted TTS_CPP_USE_SYSTEM_GGML=OFF, and
* unconditionally compile-defined GGML_BACKEND_DL_PROJECT_PREFIX="speech-" on the bundled ggml target.

That combination quietly broke standalone bundled-ggml builds: the filename-prefix patch was no longer applied, so libspeech-ggml-*.so files existed on disk but ggml's runtime loader still searched for libggml-*.so under GGML_BACKEND_DL=ON. Vulkan / OpenCL / CUDA backends silently failed to load on Android.

Fix per reviewer guidance: converge the speech stack on a single ggml source-of-truth. Standalone-bundled-ggml is no longer a supported build mode out of this in-tree subtree; the canonical path is `-DTTS_CPP_USE_SYSTEM_GGML=ON` against the QVAC speech-stack `ggml-speech` vcpkg port (qvac-ext-ggml/speech branch), which ships the patches pre-applied.

Edits:

- TTS_CPP_USE_SYSTEM_GGML default flipped from OFF to ON in this tree. Docstring spells out the rationale + points users at the standalone github.com/gianni-cor/chatterbox.cpp repo if they need a bundled-ggml dev build with patches/ present.
- The bundled-ggml branch of `if (NOT TARGET ggml)` now refuses to configure when patches/ is absent: a FATAL_ERROR points at the right consumption path (vcpkg ggml-speech) and the standalone fallback. Doesn't break in-tree-with-patches builds (parakeet-cpp in this same repo still ships patches/, so its bundled path is unaffected by this guard inside tts-cpp).
- Verified locally: `cmake -S tts-cpp -B build` (no flags) errors out at find_package(ggml CONFIG REQUIRED) with our new message pointing at the ggml-speech port; `cmake -S tts-cpp -B build -DTTS_CPP_USE_SYSTEM_GGML=OFF` errors out at the patches/ guard with the no-patches message.
- tts-cpp/scripts/setup-ggml.sh deleted: it referenced patches/ that no longer exist; running it would have errored out anyway. The standalone repo keeps its own setup-ggml.sh; only the in-tree subtree drops it.

The standalone chatterbox.cpp repo (the one tts-cpp/ was copied from) keeps TTS_CPP_USE_SYSTEM_GGML=OFF default + the patches/ folder + scripts/setup-ggml.sh. This commit is therefore an integration-time delta against that source, not a change to the standalone build flow.

Co-authored-by: Cursor <cursoragent@cursor.com>

)

The README was a verbatim copy of the standalone chatterbox.cpp repo, which makes it read as 'I cloned the wrong repo' to anyone landing on tts-cpp/ inside qvac-ext-lib-whisper.cpp. Per the reviewer's two-line ask: rewrite section 1 + global s/chatterbox.cpp/tts-cpp where it's a directory or repo-name reference (kept where it points at the upstream chatterbox.cpp project itself).

Edits:

- Title changes from `# chatterbox.cpp` to `# tts-cpp` plus a blockquote note up top: this is the in-tree subtree of github.com/gianni-cor/chatterbox.cpp; the integration drops setup-ggml.sh + patches/, ggml comes through the qvac-ext-ggml speech-branch vcpkg port, see section 1 for the build flow.
- Section 1 (was '## 1. Clone and build', the standalone clone + setup-ggml.sh + patches/ flow) replaced with '## 1. Build from the qvac speech stack':
  * one find_package(tts-cpp CONFIG REQUIRED) cmake snippet for downstream consumption;
  * one cmake -S tts-cpp -B build -DCMAKE_TOOLCHAIN_FILE=vcpkg.cmake flow for in-tree dev;
  * pointer at the standalone github.com/gianni-cor/chatterbox.cpp repo for anyone needing a bundled-ggml dev build.
  Drops the entire setup-ggml.sh paragraph + GPU-acceleration paragraph that referenced patches/.
- 'Useful CMake options' table: TTS_CPP_USE_SYSTEM_GGML row default flipped from OFF to 'ON (this in-tree subtree)', cell explains that flipping OFF is rejected here (no patches/) and points at the standalone repo for the OFF default.
- 'Alternative: consume ggml from vcpkg' subsection collapsed to 'How TTS_CPP_USE_SYSTEM_GGML=ON resolves ggml' since it's now the canonical path, not the alternative. Drops the now-stale 'preserves the standalone flow above untouched, opt-in escape hatch for package-manager-driven builds' paragraph.
- 'Consumer integration' subsection rewritten from the wrapper-port perspective ('this in-tree subtree IS the wrapper port') instead of the standalone perspective ('downstream projects consume through the wrapper port').
- Benchmark tables (Mac M3 Ultra + Linux RTX 5090): four '`chatterbox.cpp` Q4_0' implementation-name cells become '`tts-cpp` Q4_0'; the '`chatterbox.cpp` (Metal) is...' / '`chatterbox.cpp` (Vulkan) is...' captions follow.
- Repository layout tree: root dir name `chatterbox.cpp/` becomes `tts-cpp/` with a one-line caveat naming the standalone source-of-truth. Drops the `ggml/` entry (no bundled ggml in this subtree by default), drops the `setup-ggml.sh` line under scripts/ (the file no longer exists - removed in the previous commit), updates the chatterbox_cli.cpp comment from 'tts-cli + chatterbox binaries' to 'tts-cli binary' since the back-compat chatterbox alias is dropped in the standalone source too.
- One '# Build chatterbox.cpp, then:' bash comment in the reproduction snippet becomes '# Build tts-cpp, then:'.
- Lower 'tts-cli / chatterbox binaries' API-overview phrasing becomes 'tts-cli binary' to match the actual built artefact.

Five `chatterbox.cpp` references stay on purpose: the title-card URL, the section-1 'use the standalone repo' pointer, the useful-cmake-options note about the OFF default, the how-system-ggml-resolves prose, and the repo-layout caveat. Each one points at the upstream project github.com/gianni-cor/chatterbox.cpp by URL/name, not at this directory.

No code changes; README.md only.

Co-authored-by: Cursor <cursoragent@cursor.com>

…gml-org#27)

Adds a small 'QVAC speech-stack ports' section between the upstream whisper.cpp intro media and the 'Quick start' section, pointing at the two in-tree subtrees this fork carries:

- tts-cpp/ - Chatterbox (Turbo + Multilingual) + Supertonic TTS, in-tree subtree of github.com/gianni-cor/chatterbox.cpp.
- parakeet-cpp/ - NVIDIA Parakeet FastConformer ASR + Sortformer diarization, in-tree subtree of the parakeet.cpp standalone repo.

Both consume ggml through the `ggml-speech` vcpkg port (the qvac-ext-ggml/speech branch). Each subtree has its own README, build flow, and public C++ API; the upstream whisper.cpp build below the new section is unaffected.

Closes review ggml-org#27 ('one-line pointer to tts-cpp/ from the top-level qvac-ext-lib-whisper.cpp/README.md'). The reviewer specifically asked for tts-cpp; included parakeet-cpp at the same time so a future 'fix the un-fixed parakeet-cpp version of this bullet' commit doesn't need to revisit the same paragraph.

Co-authored-by: Cursor <cursoragent@cursor.com>

Re-syncs the in-tree subtree with the standalone chatterbox.cpp source-of-truth after seven round-3 review items landed there. The diff was generated from chatterbox.cpp commits 2d3632b..0a5ad2d and applied with `git apply --directory=tts-cpp/`; no path-level conflicts because the subtree was last copied from the same source.

Mirrored commits (chatterbox.cpp side):

- ef0eb36 supertonic: alive-registry guards thread_local cache teardown vs freed backend (N1)
- fcbff16 engine: Turbo BPE try/catch + drop dead cached_text_lc + clarify view-vs-copy log (N3 + N6 + N7)
- 055ce84 log: drop dead g_sink_* state, soften thread-safety docstring (N2)
- 75fbd22 s3gen: cancel checkpoint between STFT and HiFT + tighten Engine::cancel() doc (N4)
- 0a5ad2d s3gen: document s3gen_preload/unload refcount semantics on the public header (N5)

Files touched (11):

- include/tts-cpp/chatterbox/engine.h (N4 docstring)
- include/tts-cpp/chatterbox/s3gen_pipeline.h (N5 docstring)
- include/tts-cpp/log.h (N2 docstring)
- src/chatterbox_engine.cpp (N3 try/catch)
- src/chatterbox_tts.cpp (N4 stft cancel + N7 log)
- src/log.cpp (N2 dead-state drop)
- src/supertonic_gguf.cpp (N1 alive-registry)
- src/supertonic_internal.h (N1 helper API)
- src/supertonic_text_encoder.cpp (N1 free-cache gate)
- src/supertonic_vector_estimator.cpp (N1 + N6)
- src/supertonic_vocoder.cpp (N1 free-cache gate)

The two integration-only review items (N8 unreachable LIB_PREFIX block, N10 stale patches/ refs in README) land in separate commits on this branch since they don't correspond to chatterbox.cpp changes. N9 (per-call seed override) and N11 (richer backend_name) were dropped per user direction.

Build verification was done on chatterbox.cpp's standalone build (the source-of-truth); not re-built here because TTS_CPP_USE_SYSTEM_GGML defaults ON in this in-tree subtree and requires the ggml-speech vcpkg port installed to configure.

Co-authored-by: Cursor <cursoragent@cursor.com>

After commit fa0d490 (review ggml-org#5+ggml-org#6) made bundled-add_subdirectory(ggml) hard-error in this in-tree subtree when patches/ is absent, the TTS_CPP_GGML_LIB_PREFIX block became dead code:

    if (TTS_CPP_GGML_LIB_PREFIX AND NOT TTS_CPP_USE_SYSTEM_GGML)

NOT TTS_CPP_USE_SYSTEM_GGML can never reach this `if` here - configure has already FATAL_ERROR'd at the patches/-absent guard. The option, the helper function, the foreach loop, the GGML_BACKEND_DL_PROJECT_PREFIX define, and the STATUS message were all unreachable. The next maintainer flipping -DTTS_CPP_GGML_LIB_PREFIX=OFF to disable prefixing would have been silently confused when nothing changed.

Edits:

tts-cpp/CMakeLists.txt:

- The option() declaration at line 22 removed. Replaced with a one-paragraph cross-reference to the standalone chatterbox.cpp repo for the locally-rename flow + the rationale (ggml-speech vcpkg port emits the libspeech-ggml-* filenames itself).
- The 41-line block at lines 131-176 (tts_cpp_apply_ggml_prefix function + foreach + target_compile_definitions + STATUS message) replaced with a 9-line note telling future readers where the standalone counterpart lives.

tts-cpp/README.md:

- Useful CMake options table row for TTS_CPP_GGML_LIB_PREFIX rewritten with a strikethrough + "n/a in this subtree" cell: explains the standalone option exists at chatterbox.cpp upstream, why it's unnecessary here (ggml-speech vcpkg port handles the rename at its own build time), and that the file-prefix surface is whatever vcpkg installs.

Doc-only behavior visible to consumers: the integrated subtree no longer has a TTS_CPP_GGML_LIB_PREFIX option at all. Build behaviour unchanged - the vcpkg find_package path was already taking effect and emitting libspeech-ggml-* as designed.

Co-authored-by: Cursor <cursoragent@cursor.com>

Two spots in the README still pointed at a `patches/` directory that isn't in this in-tree subtree (deleted in the integration commit; the ggml-speech vcpkg port carries the equivalent pre-applied):

(a) §3.24-§3.30 Metal optimisation explanation: "Patch `patches/ggml-metal-chatterbox-ops.patch` (1088 lines) applies cleanly on a fresh ggml clone at pinned `58c38058`." Reads as if the file lives at this subtree's patches/ today.
(b) The "Repository layout" project-tree diagram listed `patches/ggml-metal-chatterbox-ops.patch` / `ggml-opencl-chatterbox-ops.patch` / `README.md` as if they were here.

Edits:

(a) Reworded to "the 1088-line ggml-metal patch backing these kernel changes is shipped pre-applied by the `ggml-speech` vcpkg port (qvac-ext-ggml/speech branch); the standalone chatterbox.cpp repo carries it under `patches/ggml-metal-chatterbox-ops.patch` against pinned ggml `58c38058`." Same technical claim, accurate provenance for this subtree.
(b) The patches/ block in the project-tree diagram replaced with a parenthetical note pointing at the standalone repo for the locally-applied flow.

The other five `patches/` mentions in the README (lines 5, 352, 362, 376, 434) are deliberate cross-references to the standalone chatterbox.cpp repo or describe the "flipping TTS_CPP_USE_SYSTEM_GGML=OFF rejected because patches/ is absent here" rationale. Those stay.

Doc-only; no code or build behaviour change.

Co-authored-by: Cursor <cursoragent@cursor.com>

….cpp

System-ggml build of this in-tree subtree was failing in the ggml-speech vcpkg port because the standalone source included the internal ggml/src/ggml-quants.h header which isn't installed by ggml-speech. The standalone chatterbox.cpp source was just bumped to use ggml_get_type_traits() + tr->to_float instead, mirroring the parakeet.cpp pattern.

Mirrored from chatterbox.cpp commit edf9e50 via `git apply --directory=tts-cpp` against the standalone diff.

src/supertonic_gguf.cpp:

- Drop `#include "ggml-quants.h"`.
- expand_supertonic_tensor_to_f32() now uses ggml_get_type_traits(src->type)->to_float instead of the direct ggml_fp16_to_fp32_row / dequantize_row_q8_0 calls.

No public API change; runtime behaviour is bit-equivalent because to_float dispatches into the same row dequantizers internally.

The qvac-registry-vcpkg/ports/tts-cpp portfile + version bump to pick up this commit lands in a follow-up.

Co-authored-by: Cursor <cursoragent@cursor.com>

…rbox.cpp

Mirrors chatterbox.cpp commit e481901 to the in-tree subtree.

tts-cpp builds as a STATIC archive by default and links OpenMP as PRIVATE; install(EXPORT) records that as an IMPORTED_LINK_DEPENDENT_LIBRARY in tts-cppTargets.cmake, so consumers doing find_package(tts-cpp CONFIG REQUIRED) failed at target-property time with

    The link interface of target "tts-cpp::tts-cpp" contains:
    OpenMP::OpenMP_CXX
    but the target was not found.

That hit qvac3/packages/tts-ggml after the integrated tts-cpp vcpkg port @ 2026-05-07#0 finally compiled and installed.

Fix: tts-cppConfig.cmake re-imports OpenMP via find_dependency before including tts-cppTargets.cmake; conditionally injected so the dep is only required of consumers when OpenMP was actually found and linked at build time.

tts-cpp/cmake/tts-cppConfig.cmake.in: Add @TTS_CPP_OPTIONAL_DEPS@ substitution slot directly after the existing find_dependency(ggml CONFIG).

tts-cpp/CMakeLists.txt (install block): Build TTS_CPP_OPTIONAL_DEPS by appending "find_dependency(OpenMP)\n" iff OpenMP_CXX_FOUND, otherwise leave empty; configure_package_config_file substitutes it in. Backwards-compatible with builds where OpenMP isn't available (find_package(OpenMP) is non-REQUIRED).

The qvac-registry-vcpkg/ports/tts-cpp port-version 0 entry will be amended in place to point at this commit (pre-merge convention: single squashed commit + force-push until upstream merge).

Co-authored-by: Cursor <cursoragent@cursor.com>

…rbox.cpp

Mirrors chatterbox.cpp commit c91f2d9. Follow-up to commit e8f6065.

The unscoped find_dependency(OpenMP) emitted into tts-cppConfig.cmake by the previous fix made consumers' CMake also probe OpenMP_C, which fails on bare-make's clang-cl-style toolchain even when CXX-side OpenMP is fine:

    Could NOT find OpenMP_C (missing: OpenMP_C_FLAGS OpenMP_C_LIB_NAMES)
    ...share/tts-cpp/tts-cppConfig.cmake:29 (find_dependency)

tts-cpp only links OpenMP::OpenMP_CXX, never the C variant.

Fix: in tts-cpp/CMakeLists.txt install block, change the line that appends to TTS_CPP_OPTIONAL_DEPS so it emits find_dependency(OpenMP COMPONENTS CXX) instead of bare find_dependency(OpenMP). CMake's FindOpenMP module respects COMPONENTS and scopes the probe to that language only; OpenMP::OpenMP_CXX is still imported, OpenMP_C is not required.

The qvac-registry-vcpkg/ports/tts-cpp port-version 0 entry will be amended in place to point at this commit (pre-merge convention).

Co-authored-by: Cursor <cursoragent@cursor.com>

Mirrors chatterbox.cpp commit e6031b2.

Replaces the bare find_package(OpenMP) call with parakeet-style gating so OpenMP auto-defaults OFF on Windows non-MinGW (the toolchain combination where vcpkg's MSVC port build ends up linking OpenMP::OpenMP_CXX into the static-archive transitive interface, only for consumers - including bare-make's clang-cl CMake - to fail re-probing OpenMP_CXX or OpenMP_C at find_package(tts-cpp) time).

Edit (tts-cpp/CMakeLists.txt, ~line 150):

    option(TTS_CPP_OPENMP "tts-cpp: enable OpenMP for the tts-cpp target" ON)
    if (WIN32 AND NOT MINGW AND TTS_CPP_OPENMP AND NOT DEFINED CACHE{TTS_CPP_OPENMP_USER_OVERRIDE})
        set(TTS_CPP_OPENMP OFF CACHE BOOL "" FORCE)
        message(STATUS "...")
    endif()
    if (TTS_CPP_OPENMP)
        find_package(OpenMP)
    endif()

Net effect inside the qvac-registry-vcpkg/ports/tts-cpp port build (x64-windows triplet, vcpkg's MSVC toolchain): OpenMP_CXX is never searched, the target_link_libraries(... PRIVATE OpenMP::OpenMP_CXX) lines are skipped, the install(EXPORT) emits no OpenMP transitive dep, and tts-cppConfig.cmake's @TTS_CPP_OPTIONAL_DEPS@ slot stays empty (no find_dependency(OpenMP) is generated). Consumer toolchains with broken or missing OpenMP detection are no longer blocked.

Trade-off: the 9 #pragma omp parallel for loops in src/campplus.cpp run serially in this build mode. CAMPPlus preprocessing is a small fraction of total synth time; the perf delta is bounded. Override available via -DTTS_CPP_OPENMP_USER_OVERRIDE=ON -DTTS_CPP_OPENMP=ON for toolchains that do have working CXX OpenMP.

The qvac-registry-vcpkg/ports/tts-cpp port-version 0 entry will be amended in place to point at this commit (pre-merge convention: single squashed commit + force-push until upstream merge).

Co-authored-by: Cursor <cursoragent@cursor.com>

Mirrors chatterbox.cpp commit 8c849cc.

Adds tts-cpp/include/tts-cpp/backend.h with the BackendDevice enum (CPU = 0, GPU = 1) and a backend_device() method on both chatterbox::Engine and supertonic::Engine. Implementation routes through the ggml backend registry (ggml_backend_get_device + ggml_backend_dev_type) so it works in both GGML_BACKEND_DL modes.

Same shape as parakeet.cpp's parakeet::Engine::backend_device(), matched intentionally so the qvac3 tts-ggml addon can mirror ParakeetModel's load-time backend resolution (read backend_device() + backend_name(), map to backendIdFromName(), expose both on JS via RuntimeStats).

See chatterbox.cpp commit 8c849cc message for the full technical rationale.

The qvac-registry-vcpkg/ports/tts-cpp port-version 0 entry will be amended in place to point at this commit (pre-merge convention).

Co-authored-by: Cursor <cursoragent@cursor.com>

… chatterbox.cpp

Mirrors chatterbox.cpp commit 78ae3c5.

Engine::synthesize_batch now gates apply_trim_fade on the actual presence of a voice override (reference_audio path or voice_dir). When both are empty, i.e. the chatterbox::Engine built-in-voice default that loads s3gen/builtin/{embedding,prompt_token,prompt_feat} from the GGUF, apply_trim_fade is false so the first 40 ms of synthesized speech is no longer zeroed + faded.

This unblocks the chatterbox-mtl variant in particular: its upstream conds.pt produces audio with zero leading silence, and the previous unconditional apply_trim_fade was clipping the leading consonant of the first word ("Hello" -> "lo", "El" -> "l", "A" -> nothing) under that configuration. See chatterbox.cpp commit 78ae3c5 for the full diagnosis + empirical confirmation.

Reference-audio / voice_dir paths keep apply_trim_fade=true and behave exactly as before; streaming path is unchanged.

The qvac-registry-vcpkg/ports/tts-cpp port-version 0 entry will be amended in place to point at this commit (pre-merge convention: single squashed commit + force-push until upstream merge).

Co-authored-by: Cursor <cursoragent@cursor.com>
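The gating rule in this commit reduces to a single predicate. The sketch below assumes the two override inputs are plain strings. `VoiceOverride` and `should_apply_trim_fade` are hypothetical names for illustration, not the engine's actual types.

```cpp
#include <string>

// Hypothetical container for the two override inputs named in the commit.
struct VoiceOverride {
    std::string reference_audio;  // path to a reference clip, empty if unset
    std::string voice_dir;        // directory with voice data, empty if unset
};

// Trim + fade is applied only when a voice override is actually present.
// With both fields empty the built-in voice is used and the leading audio
// is left untouched.
bool should_apply_trim_fade(const VoiceOverride& voice) {
    return !voice.reference_audio.empty() || !voice.voice_dir.empty();
}
```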

…ment

Adds `bool starts_word` to `parakeet::StreamingSegment`, set true when the segment's first token's piece carries the SentencePiece "▁" word-boundary marker (U+2581) and false when it is a wordpiece continuation.

Streaming consumers can use the flag to decide whether to insert a space between successive segments without re-parsing whitespace from `seg.text` (the inner detokenizer strips leading whitespace at the session level, which loses the signal for the chunk that opens a session). With the flag, "see" + "if" stays as "see if" while the chunk-boundary split "pun" + "ctuation" rejoins as "punctuation".

Also exposes `bool token_is_word_start(BpeVocab, int32_t)` from sentencepiece_bpe.h so other engines that build their own segments (EOU per-utterance, attributed) can stamp the flag the same way.

Defaults `starts_word = true` so existing callers that ignore the field see no behavioural change.

Co-authored-by: Cursor <cursoragent@cursor.com>
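A consumer-side sketch of how the flag might be used to rejoin streamed segments. `Segment` is a simplified stand-in for `parakeet::StreamingSegment`, and `piece_is_word_start` and `join_segments` are hypothetical helpers rather than the library's API. The marker check assumes UTF-8 encoded pieces.

```cpp
#include <string>
#include <vector>

// Simplified stand-in for parakeet::StreamingSegment (only the two fields
// this example needs).
struct Segment {
    std::string text;
    bool starts_word;
};

// True when a SentencePiece piece begins with the U+2581 word-boundary
// marker, encoded in UTF-8 as the bytes E2 96 81.
bool piece_is_word_start(const std::string& piece) {
    static const std::string marker = "\xE2\x96\x81";
    return piece.compare(0, marker.size(), marker) == 0;
}

// Joins segments, inserting a space only before segments that open a word.
std::string join_segments(const std::vector<Segment>& segments) {
    std::string out;
    for (const Segment& seg : segments) {
        if (!out.empty() && seg.starts_word) {
            out += ' ';
        }
        out += seg.text;
    }
    return out;
}
```

With this rule, the segments "see" and "if" (both word starts) join as "see if", while "pun" followed by the continuation "ctuation" joins as "punctuation".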

…_token

Mirrors src/chatterbox_cli.cpp's MTL tokenisation path and the Python ChatterboxMultilingualTTS.generate reference (chatterbox-ref/src/chatterbox/mtl_tts.py:288-291).

The MTL T3 prompt graph anchors position 0 on start_text_token (255); without it the autoregressive decode drops the first speech tokens, audible as a missing leading syllable ("Hello" -> "lo from the multilingual"). Turbo (gpt2_bpe) is unaffected and keeps the existing single-line tokenise + punc_norm path.

Co-authored-by: Cursor <cursoragent@cursor.com>
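As a rough sketch of the fix, the text token ids get the start token placed at position 0 before they reach the T3 prompt graph. The value 255 comes from the commit message. `with_start_text_token` is a hypothetical helper, and any further framing the real MTL path applies to the sequence is not shown here.

```cpp
#include <cstdint>
#include <vector>

// Value stated in the commit message.
constexpr std::int32_t kStartTextToken = 255;

// Hypothetical helper: returns the token ids with start_text_token placed at
// position 0, which is where the MTL T3 prompt graph expects it.
std::vector<std::int32_t> with_start_text_token(std::vector<std::int32_t> ids) {
    ids.insert(ids.begin(), kStartTextToken);
    return ids;
}
```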

…gine::run_t3

MTL T3 occasionally emits a plausible end-of-speech silence cadence (three identical tokens in a row) mid-utterance and then hallucinates low-energy content -- silence, hissing, garbage tokens -- until n_predict (1000) is reached, producing ~40 s of trailing junk on a short input.

chatterbox_cli.cpp already guards against this via the AlignmentStreamAnalyzer token_repetition port, but Engine::run_t3 was missing the same check, so the addon path (which doesn't go through the CLI) saw the regression on whichever language/seed combinations happen to hit the cadence (most reliably reproduced on German with the default seed=42).

Mirrors the CLI's existing guard 1:1, gated on is_mtl since the Turbo codebook has a different cadence signature.

Co-authored-by: Cursor <cursoragent@cursor.com>
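A minimal sketch of a repetition guard of this kind, assuming the stop condition is simply "the last three sampled tokens are identical". The actual AlignmentStreamAnalyzer port may apply further conditions. `ends_with_repeated_token` and `decode_with_guard` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// True when the last `run` tokens are all identical.
bool ends_with_repeated_token(const std::vector<std::int32_t>& tokens,
                              std::size_t run = 3) {
    if (run == 0 || tokens.size() < run) {
        return false;
    }
    const std::int32_t last = tokens.back();
    for (std::size_t i = tokens.size() - run; i < tokens.size(); ++i) {
        if (tokens[i] != last) {
            return false;
        }
    }
    return true;
}

// Decode loop sketch: `next` is any callable returning the next sampled
// token. The guard is applied only when is_mtl is true, as in the commit.
template <typename NextToken>
std::vector<std::int32_t> decode_with_guard(NextToken next, int n_predict,
                                            bool is_mtl) {
    std::vector<std::int32_t> out;
    for (int i = 0; i < n_predict; ++i) {
        out.push_back(next());
        if (is_mtl && ends_with_repeated_token(out)) {
            break;  // treat the cadence as end of speech
        }
    }
    return out;
}
```

Stopping at the cadence keeps the decode from running on to n_predict, which is where the trailing junk described above comes from.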

Add tts-cpp/ subtree (Chatterbox Turbo + Multilingual + Supertonic TTS) + integration fixes

…in init_gpu_backend

On Adreno + PR #14/#15 the policy correctly picks OpenCL and Chatterbox runs to completion. On Vulkan-on-Mali (Google Pixel 9 Pro XL / Tensor G4) ggml_backend_dev_init throws an unhandled C++ exception during pipeline init, which bubbles up to libc++abi::terminate() and SIGABRT crashes the host process before the caller can react.

Wrap the call in try-catch inside try_init: on any exception, log verbosely and 'continue' to the next candidate; if every candidate in a bucket throws or returns null, the lambda returns nullptr and the policy proceeds to the next bucket. After all buckets fail, init_gpu_backend returns nullptr and the caller falls back to CPU -- which is exactly what 'no usable GPU available' should mean.

This is a defensive layer that handles any future bad-GPU vendor (it is not Mali-specific): SIGABRT during GPU init is never an acceptable failure mode for a TTS engine that has a working CPU path.

Validated against Pixel 9 Pro XL on AWS Device Farm via the QVAC-19254 [DO NOT MERGE] test PR (tetherto/qvac#2320).

QVAC-19254
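The control flow described in this commit can be sketched as follows. `Backend` and `InitFn` are hypothetical stand-ins for the ggml backend handle and the per-device init call, and in the PR try_init is a lambda rather than a free function. The point is only the shape: any exception or null result moves on to the next candidate, and a null return from `init_gpu_backend` signals the CPU fallback.

```cpp
#include <cstdio>
#include <exception>
#include <functional>
#include <vector>

// Hypothetical stand-in for a ggml backend handle.
struct Backend {
    int id;
};

// Hypothetical stand-in for one candidate's init call; it may throw.
using InitFn = std::function<Backend*()>;

// Tries each candidate in a bucket; a throw or a null result means
// "skip to the next candidate" instead of aborting the process.
Backend* try_init(const std::vector<InitFn>& candidates) {
    for (const InitFn& init : candidates) {
        try {
            if (Backend* backend = init()) {
                return backend;
            }
        } catch (const std::exception& e) {
            std::fprintf(stderr, "backend init threw: %s\n", e.what());
            continue;
        } catch (...) {
            std::fprintf(stderr, "backend init threw an unknown exception\n");
            continue;
        }
    }
    return nullptr;
}

// Walks the buckets in priority order; nullptr means "fall back to CPU".
Backend* init_gpu_backend(const std::vector<std::vector<InitFn>>& buckets) {
    for (const std::vector<InitFn>& bucket : buckets) {
        if (Backend* backend = try_init(bucket)) {
            return backend;
        }
    }
    return nullptr;
}
```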

GustavoA1604 requested review from a team as code owners May 6, 2026 16:34

GustavoA1604 force-pushed the tts-cpp branch 5 times, most recently from fe33e3b to 99188e5 Compare May 6, 2026 23:37

Add tts-cpp files

ef840d5

GustavoA1604 force-pushed the tts-cpp branch from 99188e5 to ef840d5 Compare May 6, 2026 23:43

GustavoA1604 and others added 15 commits May 6, 2026 20:57

GustavoA1604 changed the title ~~Add tts-cpp files~~ Add tts-cpp/ subtree (Chatterbox Turbo + Multilingual + Supertonic TTS) + integration fixes May 7, 2026

GustavoA1604 merged commit be913c8 into tetherto:master May 7, 2026
58 of 66 checks passed

GustavoA1604 mentioned this pull request May 7, 2026

Add tts-cpp port + bump parakeet-cpp / ggml-speech to port-version 1 tetherto/qvac-registry-vcpkg#137

Merged

10 tasks

Zbig9000 mentioned this pull request May 11, 2026

Qvac 18607 tts ggml add and optimize open cl for supertonic #16

Merged

5 tasks

pratiknarola-t mentioned this pull request May 28, 2026

QVAC-19213 tts-cpp: Supertonic + Chatterbox/S3Gen GPU sched for Adreno OpenCL #35

Closed

gianni-cor pushed a commit that referenced this pull request May 28, 2026

Merge pull request #14 from GustavoA1604/tts-cpp

8c03ba0

Add tts-cpp/ subtree (Chatterbox Turbo + Multilingual + Supertonic TTS) + integration fixes

pratiknarola-t mentioned this pull request Jun 9, 2026

QVAC-19253 tts-cpp: Supertonic + Chatterbox on Adreno-Vulkan #41

Merged

GustavoA1604 merged 16 commits into tetherto:master from GustavoA1604:tts-cpp


Summary

What's in the PR

1. tts-cpp/ subtree drop (commit ef840d5)

2. Top-level README.md pointer (commit a2f2dd6)

3. Integration deltas vs the standalone repo (commits fa0d490, ae34c58, 8ba10a6, e673182)

4. Round-3 review fixes mirrored from standalone (commits 4b5d2d7, 28ef67d, e8f6065, 04b87ea, 64abb81, 1963f9f, 942686d)

5. MTL Chatterbox correctness fixes (commits db87f42, 0b44674)

6. parakeet-cpp/ SentencePiece word-start signal (commit 761eca0)

Design notes (preempting review questions)

Why is TTS_CPP_USE_SYSTEM_GGML=ON the default in this subtree?

Why mirror commits from chatterbox.cpp instead of squashing into the initial port?

Why is the parakeet-cpp/ change in this PR?

Why is OpenMP defaulted OFF on Windows non-MinGW?

Why ~57 K added lines?

Why doesn't this PR bump qvac-registry-vcpkg/ports/tts-cpp?

Why are some chatterbox.cpp references kept verbatim in tts-cpp/README.md?

Why is BackendDevice shaped exactly like parakeet::Engine::backend_device()?

Test plan

Related
