feat: add @qvac/tts-ggml package (Chatterbox + Supertonic on tts-cpp) #1983

Merged: GustavoA1604 merged 29 commits into main from feat/tts-ggml on May 11, 2026

@GustavoA1604 GustavoA1604 commented May 11, 2026

Summary

  • Adds a new @qvac/tts-ggml addon (v0.1.0) that wraps the tts-cpp
    library (GGML backend) and exposes both tts_cpp::chatterbox::Engine
    and tts_cpp::supertonic::Engine behind a single engine-agnostic JS
    surface — intended as a substitute for @qvac/tts-onnx.
  • Engine auto-detection from files (chatterbox-* gguf vs
    supertonic.gguf), with explicit override via the
    engine: 'chatterbox' | 'supertonic' option. Static
    TTSGgml.ENGINE_CHATTERBOX / ENGINE_SUPERTONIC constants and a
    getEngineType() accessor.
  • Chatterbox supports English and multilingual GGUFs, voice cloning from
    a reference wav path, baked voice profiles via voicesDir /
    voiceDir, and a GPU backend cascade (Metal / CUDA / Vulkan / OpenCL)
    with nGpuLayers offload. Supertonic runs CPU-only today with
    per-voice voice / voiceName selection and .npy initial-noise
    reproduction.
  • Streaming APIs aligned with @qvac/tts-onnx
    (run({ streamOutput: true }), runStream, runStreaming) plus
    Chatterbox-only native streaming knobs (streamChunkTokens,
    streamFirstChunkTokens, cfmSteps) and cross-compat aliases
    (voiceName -> voice, numInferenceSteps -> steps).
  • Output sample-rate control via runtimeConfig.outputSampleRate and
    per-job TTSRunInput.outputSampleRate (8000–192000 Hz);
    TTSOutputChunk.sampleRate reported on every chunk; new
    RuntimeStats.backendDevice / backendId fields surface the active
    compute device.
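
The auto-detection rule in the second bullet can be sketched roughly as below. This is a minimal illustration, not the package's implementation: `detectEngine` and its arguments are hypothetical, and only the file-name patterns (`chatterbox-*` GGUFs vs `supertonic.gguf`) and the explicit `engine` override come from the summary above.

```javascript
'use strict'

const path = require('path')

const ENGINE_CHATTERBOX = 'chatterbox'
const ENGINE_SUPERTONIC = 'supertonic'

// Hypothetical helper: pick an engine from GGUF file names, unless the
// caller passes an explicit `engine` override.
function detectEngine (filePaths, engine) {
  if (engine !== undefined) {
    if (engine !== ENGINE_CHATTERBOX && engine !== ENGINE_SUPERTONIC) {
      throw new Error(`Unknown engine: ${engine}`)
    }
    return engine
  }
  const names = filePaths.map((p) => path.basename(p))
  if (names.includes('supertonic.gguf')) return ENGINE_SUPERTONIC
  if (names.some((n) => n.startsWith('chatterbox-') && n.endsWith('.gguf'))) {
    return ENGINE_CHATTERBOX
  }
  throw new Error('Cannot detect engine from files; pass `engine` explicitly')
}
```

The override is checked first so that a caller can force an engine even when the file names are non-standard.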

Differences vs @qvac/tts-onnx

For callers migrating from @qvac/tts-onnx, the following surface
differences are intentional and documented in
packages/tts-ggml/CHANGELOG.md:

  • No LavaSR enhancer. EnhancerConfig / LavaSREnhancerConfig,
    constructor enhancer, and per-job TTSRunInput.enhancer are not
    ported — the GGML backend has no neural bandwidth-extension or
    denoiser path today.
  • referenceAudio is a path string, not Float32Array | number[].
  • numThreads -> threads.
  • supertonicMultilingual is removed. Multilingual is driven by the
    loaded GGUF + engine selection.
  • Supertonic rejects useGPU: true / non-zero nGpuLayers at
    construction time (CPU-only).
  • No ONNX-style *Path file aliases. The GGML file set is much
    smaller (single GGUF per component).
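
For callers who want to see the differences as code, here is a rough migration shim. `migrateOnnxOptions` is a hypothetical helper and not part of either package; the option names (`numThreads`, `threads`, `enhancer`, `supertonicMultilingual`, `referenceAudio`, `useGPU`, `nGpuLayers`) come from the list above, while the error messages and the choice to throw on `enhancer` are assumptions.

```javascript
'use strict'

// Hypothetical shim: translate @qvac/tts-onnx-style options into the
// @qvac/tts-ggml surface described above.
function migrateOnnxOptions (onnxOpts, engine) {
  const {
    numThreads,
    enhancer,
    supertonicMultilingual, // dropped: the loaded GGUF decides
    referenceAudio,
    ...rest
  } = onnxOpts
  const out = { ...rest }

  // numThreads -> threads
  if (numThreads !== undefined) out.threads = numThreads

  // referenceAudio must be a wav path, not decoded samples.
  if (referenceAudio !== undefined) {
    if (typeof referenceAudio !== 'string') {
      throw new TypeError('referenceAudio must be a path string')
    }
    out.referenceAudio = referenceAudio
  }

  // There is no LavaSR enhancer in the GGML backend (assumed: fail loud).
  if (enhancer !== undefined) {
    throw new Error('enhancer is not supported')
  }

  // Supertonic is CPU-only.
  if (engine === 'supertonic' &&
      (out.useGPU === true || (out.nGpuLayers ?? 0) !== 0)) {
    throw new Error('Supertonic does not support useGPU / nGpuLayers')
  }
  return out
}
```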

What's included

  • New package directory packages/tts-ggml/ — addon (C++ + CMake +
    vcpkg), JS surface (index.js, index.d.ts, tts.js, lib/),
    unit + integration tests (brittle-bare), C++ unit tests
    (GoogleTest) with coverage:cpp (llvm-cov) target, mobile
    integration test generator, examples for each engine + streaming
    mode, README, and CHANGELOG.md (0.1.0).
  • vcpkg manifest pointing at the tetherto registry, consuming the
    renamed tts-cpp port; baseline bumped to the
    2026-05-07 registry commit.
  • Linux clang toolchain uses unversioned clang/clang++ (no more
    clang-19 pin); matches the monorepo's setup-llvm action default.
  • Mobile bundle ships Supertonic only — Chatterbox tokenizer exceeds
    the Metro V8 string limit, so it's excluded from the mobile wrapper.
  • CI integration tests unblocked on every desktop runner; useGPU
    semantics tightened to fail loud rather than silently fall back.

Test plan

Tested here

GustavoA1604 and others added 25 commits April 21, 2026 16:52
New Bare addon wrapping the `qvac-tts::qvac-tts` static library (backed
by the `tts-cpp` port added in tetherto/qvac-registry-vcpkg).  API-compatible
with the Chatterbox engine exposed by `@qvac/tts-onnx` so downstream
consumers can swap backends without touching orchestration code.

## Scope

* First iteration.  Supports Chatterbox **English** only.  Chatterbox
  multilingual, LavaSR enhancer, Supertonic engine, and streaming are
  out of scope and remain in `@qvac/tts-onnx`.  They'll land alongside
  the evolution of qvac-tts.cpp.
* Native backend is the static `qvac-tts` library from the QVAC vcpkg
  registry (`ports/tts-cpp`, baseline `2026-04-21`).  No ONNX Runtime
  dependency.

## JS surface

* `@qvac/tts-ggml` exports `TTSGgml` with the same method shape as
  `ONNXTTS`:  `run` / `runStream` / `runStreaming` / `reload` /
  `unload` / `destroy`.
* `files: { modelDir }` looks for `chatterbox-t3-turbo.gguf` +
  `chatterbox-s3gen.gguf` side-by-side; `files.t3Model` /
  `files.s3genModel` override the defaults.
* Options: `referenceAudio`, `voiceDir` (baked profile), `seed`,
  `nGpuLayers`, `threads`, `outputSampleRate`, plus placeholders for
  the upcoming streaming flags (`streamChunkTokens`,
  `streamFirstChunkTokens`, `cfmSteps`).
* Shared reusable lib code (`lib/textChunker.js`,
  `lib/textStreamAccumulator.js`, `addonLogging.*`) is copied verbatim
  from `@qvac/tts-onnx`.
* New error class `QvacErrorAddonTTSGgml` uses codes **13001–14000**
  to avoid collisions with `@qvac/tts-onnx` (7001–7011) when both
  packages are loaded in the same Bare process.
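
A minimal sketch of the `files` resolution described above, assuming the two default file names from this commit message. `resolveChatterboxFiles` is a hypothetical name and the error for incomplete input is an assumption.

```javascript
'use strict'

const path = require('path')

// Default file names taken from the commit message.
const DEFAULT_T3 = 'chatterbox-t3-turbo.gguf'
const DEFAULT_S3GEN = 'chatterbox-s3gen.gguf'

// Hypothetical helper: derive both GGUF paths from `modelDir`, letting
// explicit `t3Model` / `s3genModel` entries override the defaults.
function resolveChatterboxFiles (files) {
  const { modelDir, t3Model, s3genModel } = files
  if (!modelDir && !(t3Model && s3genModel)) {
    throw new Error('Provide files.modelDir or both files.t3Model and files.s3genModel')
  }
  return {
    t3Model: t3Model || path.join(modelDir, DEFAULT_T3),
    s3genModel: s3genModel || path.join(modelDir, DEFAULT_S3GEN)
  }
}
```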

## Native addon

* `addon/src/model-interface/chatterbox/ChatterboxModel.{hpp,cpp}` —
  `IModel` + `IModelCancel` implementation.  First-iteration strategy:
  assemble argv for `qvac_tts_cli_main` with a scratch `.wav` output
  path, call it synchronously, then parse the resulting 16-bit mono
  PCM wav back into `std::vector<int16_t>` for the JS handler.
  Consequences: every job re-loads the model (~700 ms + inference
  time), no mid-synthesis cancellation, no streaming.  The follow-up
  milestone replaces this with a persistent, struct-based API once
  qvac-tts.cpp exposes one.
* `addon/src/js-interface/{JSAdapter.{hpp,cpp}, binding.cpp}` — JS-to-C++
  config bridging (same string-map pattern as `@qvac/tts-onnx`) and the
  `BARE_MODULE(qvac_tts_ggml, ...)` registration exposing
  `createInstance` / `runJob` / `reload` / `activate` / `cancel` /
  `destroyInstance` / `loadWeights` / `setLogger` / `releaseLogger`.
* `addon/src/addon/AddonJs.hpp` — JS-facing `createInstance` / `runJob`
  / `reload` wrappers that register a `JsAudioOutputHandler` emitting
  `{ outputArray: Int16Array, sampleRate: number }` to JS.
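
The "parse the scratch wav back into samples" step is done in C++ in the addon. As a rough JS analogue only, the sketch below walks a RIFF/WAVE buffer and returns the same `{ outputArray, sampleRate }` shape. `parsePcm16MonoWav` is a hypothetical name, and a Node.js-style `Buffer` is assumed.

```javascript
'use strict'

// Parse a RIFF/WAVE buffer holding 16-bit mono PCM and return its samples.
function parsePcm16MonoWav (buf) {
  if (buf.toString('ascii', 0, 4) !== 'RIFF' ||
      buf.toString('ascii', 8, 12) !== 'WAVE') {
    throw new Error('Not a RIFF/WAVE file')
  }
  let offset = 12
  let sampleRate = null
  let samples = null
  // Each chunk is a 4-byte id, a 4-byte little-endian size, then the
  // payload, padded to an even length.
  while (offset + 8 <= buf.length) {
    const id = buf.toString('ascii', offset, offset + 4)
    const size = buf.readUInt32LE(offset + 4)
    const body = offset + 8
    if (id === 'fmt ') {
      const format = buf.readUInt16LE(body)
      const channels = buf.readUInt16LE(body + 2)
      const bits = buf.readUInt16LE(body + 14)
      if (format !== 1 || channels !== 1 || bits !== 16) {
        throw new Error('Expected 16-bit mono PCM')
      }
      sampleRate = buf.readUInt32LE(body + 4)
    } else if (id === 'data') {
      const end = Math.min(body + size, buf.length)
      const count = Math.floor((end - body) / 2)
      samples = new Int16Array(count)
      for (let i = 0; i < count; i++) {
        samples[i] = buf.readInt16LE(body + 2 * i)
      }
    }
    offset = body + size + (size % 2)
  }
  if (sampleRate === null || samples === null) {
    throw new Error('Missing fmt or data chunk')
  }
  return { outputArray: samples, sampleRate }
}
```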

## Build / registry

* `CMakeLists.txt` uses `find_package(qvac-tts-cpp CONFIG REQUIRED)`
  and the standard `cmake-bare` + `cmake-vcpkg` scaffolding (shape
  matches `@qvac/transcription-whispercpp`).
* `vcpkg.json` depends on `tts-cpp` (with a `vulkan` feature passthrough)
  plus `qvac-lib-inference-addon-cpp`, `qvac-lint-cpp`, and `gtest`.
* `vcpkg-configuration.json` points at tetherto/qvac-registry-vcpkg.
  NOTE: the baseline pin here is inherited from
  `@qvac/transcription-whispercpp` and **must be bumped** to a commit
  that contains the `tts-cpp` port once that registry PR lands.  A
  follow-up commit will update it.

## Tests & examples

* Integration + unit test files for Chatterbox English are copied
  verbatim from `@qvac/tts-onnx` with only mechanical renames
  (`ONNXTTS` -> `TTSGgml`, `QvacErrorAddonTTS` -> `QvacErrorAddonTTSGgml`,
  `@qvac/tts-onnx/text-chunker` -> `../../lib/textChunker.js`).  Some
  paths in `test/integration/addon.test.js` still import Supertonic /
  LavaSR helpers that don't exist in this package — those test blocks
  will fail fast when the file loads, which is expected until those
  backends get their own ggml packages.
* Examples: `chatterbox-tts.js`, `chatterbox-streaming-tts.js`, plus
  shared `wav-helper.js` + `pcm-chunk-player.js`.

## What's not in this PR (known gaps)

* No docs: README, NOTICE, CHANGELOG, PULL_REQUEST_TEMPLATE changes
  will land in a single documentation pass once the registry + fork
  commits have merged upstream.
* `vcpkg-configuration.json` baseline needs to point at a
  qvac-registry-vcpkg commit that ships `tts-cpp` (pending the
  registry PR).
* Actual `npm run build` requires the registry and fork commits to be
  on `main` of their respective upstream repos.
…commit

Bumps `vcpkg-configuration.json` to GustavoA1604/qvac-registry-vcpkg
at commit 1e2839680b6be8d8ffff889a9c29b966c176098c — the commit that
adds the `tts-cpp` port.  Paired with the `qvac-tts` library already
pinned in the port's `portfile.cmake` (GustavoA1604/chatterbox.cpp
@ 0fe4a521618cc30358040b29d75d4261b31cbb60).

Will be re-pointed at tetherto/qvac-registry-vcpkg once the registry
PR lands upstream.
… mobile wrapper

Second pass over @qvac/tts-ggml after the build started passing: prune
everything that only made sense for the ONNX-era multi-engine scope and
adapt the remaining Chatterbox-English bits to the GGUF + file-path
reference-audio contract.  Restores `test/mobile/` so the Android build
has something to point at.

## C++

* `ChatterboxModel.cpp`: the `ArgvBuilder::buildArgv` doc comment
  contained `**/` which closed the block comment early and broke the
  build.  Rewrote as a `//` comment.

## Examples

* `examples/chatterbox-tts.js` — rewrite for v0 contract: single
  `<text>` argv, `files: { modelDir }` pointing at the two GGUFs,
  `referenceAudio` is now a wav **path** (addon passes it to
  `--reference-audio`) instead of a Float32Array.  Drops
  english/multilingual arg and the CHATTERBOX_VARIANT switch that
  picked which `.onnx` files to load.
* Removed `examples/chatterbox-streaming-tts.js` +
  `examples/pcm-chunk-player.js`.  The v0 addon re-loads the model
  per `run()` call — exposing streaming would mislead.  Both come
  back alongside the persistent-engine milestone.
* `package.json`: `npm run example` now passes a default text so it
  runs without extra args.

## Tests

### Kept as-is (engine-agnostic)

* `test/unit/textChunker.test.js`
* `test/mock/{MockedBinding,utils}.js`
* `test/utils/{wav-helper,pcmConcatenator,loader.fake,runWhisper,runTTS}.js`
* `test/reference-audio/jfk.wav`, `test/data/sentences-*.js`

### Mechanical fixes

* `test/unit/tts.error.test.js` — fix error-code assertions to the
  tts-ggml range (`13001–14000`); was still checking the
  `@qvac/tts-onnx` range (`7001–7011`).
* `test/unit/tts-ggml.lifecycle.test.js` — fix stale
  `QvacErrorAddonTTS` import to `QvacErrorAddonTTSGgml`; switch the
  stubbed model to `{ t3Model, s3genModel }` GGUFs and drop the
  non-existent `engine: 'chatterbox'` option.
* `test/unit/tts-ggml.sentence-stream.test.js` — same GGUF/engine
  cleanup.

### Rewritten

* `test/unit/chatterbox.inference.test.js` — drop tests that asserted
  the old ONNX file shape (`tokenizer / speechEncoder / embedTokens /
  conditionalDecoder / languageModel`), the removed `engine` detection
  and the wrong `getModelKey` return value (`'onnx-tts'` -> `'tts-ggml'`).
  New tests cover: `modelDir` derives the two GGUF paths; explicit
  `t3Model` / `s3genModel` override the defaults.  The mocked-binding
  run/reload/cancel flow stays.
* `test/integration/addon.test.js` — fresh, ~180 LoC, Chatterbox-English
  only.  Ensures the GGUFs are present, runs the short sentence set
  through `loadChatterboxTTS` + `runChatterboxTTS[WithSplit]`, and
  (on darwin only) runs a whisper-based WER check via the existing
  `runWhisper` util.  Drops the Chatterbox-multilingual block + every
  Supertonic + LavaSR block that doesn't apply to this package.
* `test/utils/runChatterboxTTS.js` — rewrite for the GGUF contract:
  `files: { modelDir, t3Model, s3genModel }`, `referenceAudio` as a
  file path that falls back to `test/reference-audio/jfk.wav` (or the
  mobile test-asset when `global.assetPaths` is present).  No more
  WAV decode / resample on the JS side.
* `test/utils/downloadModel.js` — trim from 1007 LoC to 280.  Drops
  the Supertonic + LavaSR + Chatterbox-multilingual + Cangjie
  downloaders.  Keeps the shared HTTP/curl infrastructure and
  `ensureWhisperModel` (still used by the integration WER check).
  `ensureChatterboxModels` is now **check-only**: it verifies
  `chatterbox-t3-turbo.gguf` + `chatterbox-s3gen.gguf` exist locally
  and, if missing, prints the exact commands for generating them
  from the qvac-tts.cpp (née chatterbox.cpp) conversion scripts.
  Once the GGUFs land on a canonical HuggingFace repo we'll wire up
  download URLs here.
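
The check-only behaviour of `ensureChatterboxModels` could look roughly like the sketch below. `checkChatterboxModels` is a hypothetical stand-in, the log text is a placeholder rather than the real conversion commands, and the `{ success, missing }` return shape is an assumption.

```javascript
'use strict'

const fs = require('fs')
const path = require('path')

const REQUIRED_GGUFS = ['chatterbox-t3-turbo.gguf', 'chatterbox-s3gen.gguf']

// Hypothetical check-only helper: never downloads, only reports.
function checkChatterboxModels (modelDir, log = console.error) {
  const missing = REQUIRED_GGUFS.filter(
    (name) => !fs.existsSync(path.join(modelDir, name))
  )
  if (missing.length > 0) {
    log(`Missing in ${modelDir}: ${missing.join(', ')}`)
    // Placeholder text; the real helper prints the conversion commands.
    log('Generate them with the conversion scripts, then re-run.')
    return { success: false, missing }
  }
  return { success: true, missing: [] }
}
```

Returning a result object instead of throwing lets a test skip cleanly when the GGUFs are not on disk.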

## Scripts

* `scripts/ensure-chatterbox.js` — simplify to a single invocation
  against `./models/`.  Drops the variant / language matrix that the
  ONNX downloader needed.
* `scripts/ensure-models.js` — now a thin alias to
  `ensure-chatterbox.js`.  Drops the Supertonic + LavaSR orchestration.

## Mobile

* Restored `test/mobile/{integration.auto.cjs, integration-runtime.cjs,
  testAssets/jfk.wav}` so the Android build has a wrapper to point at.
* `package.json`: re-added `test/mobile` to the `files` list.

## Gitignore

* Ignore generated `.clang-format` / `.clang-tidy` / `.valgrind.supp`
  (produced by the top-level `configure_file(...)` calls) and
  `build_*/` dirs (bare-make convention).

## Verified locally

* `npx standard "test/**/*.js" "*.js" "lib/*.js"` — clean.
* `npm run test:unit` — 38/38 pass (105/105 asserts).
* `npm run build && bare examples/chatterbox-tts.js "Hello from qvac tts ggml."`
  produces a 24 kHz wav as expected.
Upstream chatterbox.cpp renamed the package + namespace + target from
qvac-tts to tts-cpp and tightened the library boundary; pick up the
new artefacts here:

- find_package(qvac-tts-cpp CONFIG REQUIRED)
    -> find_package(tts-cpp CONFIG REQUIRED)
- qvac-tts::qvac-tts  -> tts-cpp::tts-cpp
- qvac_tts::chatterbox -> tts_cpp::chatterbox (engine ptrs, EngineOptions,
  SynthesisResult, forward-decls in ChatterboxModel.hpp)
- #include <qvac-tts/chatterbox/engine.h>
    -> #include <tts-cpp/chatterbox/engine.h>
- Doxygen / inline doc references to the old names refreshed alongside
  the code changes.

vcpkg wiring:
- vcpkg-configuration.json baseline bumped to qvac-registry-vcpkg
  commit bc30b0b (ports/tts-cpp renamed and repointed at
  chatterbox.cpp@f8f9145).
- vcpkg.json tts-cpp constraint bumped to 2026-04-24#1 (the port that
  carries the rename + namespace + install(EXPORT) changes).

Verified with a cold bare-make generate + bare-make build against the
new port, and the addon's existing unit + integration test suites.

Made-with: Cursor
Picks up the round-3 review-fix wave landed on the tts-cpp port:

  e673182  scrub stale patches/ refs from README                (N10)
  8ba10a6  drop unreachable TTS_CPP_GGML_LIB_PREFIX block        (N8)
  4b5d2d7  mirror N1-N7 fixes from chatterbox.cpp source-of-truth
            - N1 supertonic alive-registry guard against freed-backend
              gallocr_free assert on hot-swap (Vulkan/Metal/CUDA)
            - N2 drop dead g_sink_* state, soften log_set docstring
            - N3 Turbo BPE try/catch (exception-safe Engine ctor)
            - N4 STFT cancel checkpoint + tighter Engine::cancel() doc
            - N5 document s3gen_preload/unload refcount semantics
            - N6 drop dead cached_text_lc Supertonic shim
            - N7 fix misleading "no copy" view-vs-copy log wording

Plus the integrated-port-only round-2 fixes that landed earlier:

  fa0d490  close patches/-deleted regression: TTS_CPP_USE_SYSTEM_GGML
            now defaults ON; bundled-without-patches hard-errors at
            configure time with a pointer at the ggml-speech vcpkg
            port.
  ae34c58  README rewritten for integrated/vcpkg context.
  a2f2dd6  top-level qvac-ext-lib-whisper.cpp README points at the
            tts-cpp/ subtree (alongside parakeet-cpp/).

Public API used by ChatterboxModel (tts_cpp::chatterbox::Engine /
EngineOptions / SynthesisResult / s3gen_preload / s3gen_unload) is
backward-compatible: the new port adds Engine::backend_name(),
MTL-variant fields on EngineOptions (language / cfg_weight / min_p /
exaggeration), and a separate tts_cpp::supertonic::Engine class, but
nothing this consumer was already calling has changed.

Edits:

  packages/tts-ggml/vcpkg.json
    - tts-cpp dep: version>=2026-04-24#1 -> version>=2026-05-07.

  packages/tts-ggml/vcpkg-configuration.json
    - default-registry baseline: bc30b0b (April 2026 fork-only state)
      -> 16b91afdcfd59baea60e81f3da94f49311ef2a97.  The new baseline
      pulls in the post-tetherto-merge state (parakeet-cpp port at
      932d5d9, ggml-speech port-version 1 at f07bdd0) plus the new
      tts-cpp port (16b91af) on the developer's GustavoA1604
      registry fork.

Smoke-test plan: after running `vcpkg install` against the new
baseline, the tts-cpp port's vcpkg_from_github resolves at
GustavoA1604/qvac-ext-lib-whisper.cpp@e673182 (tts-cpp branch) until the
upstream PR merges.  ChatterboxModel should build and synthesize
identically; expanding to Multilingual + Supertonic flows is the
follow-up commit on the package side.

Co-authored-by: Cursor <cursoragent@cursor.com>
The toolchain hardcoded `clang-19` / `clang++-19` (versioned binary
names) since the package's first commit (0a2c978).  Linux CI hadn't
exercised this path before — the new on-pr-tts-ggml.yml -> integration
matrix is the first time it does, and it fails on every linux runner
(ai-run-ubuntu-22.04, ai-run-linux-gpu, ubuntu-24.04-arm) at vcpkg's
"detect_compiler" step because none of the GH-hosted images ship a
`clang-19` symlink:

  Detecting compiler hash for triplet x64-linux...
  error: while detecting compiler information:
  ...
  CMake Error at scripts/cmake/vcpkg_execute_required_process.cmake:127
  (message): Command failed: ... -DVCPKG_CHAINLOAD_TOOLCHAIN_FILE=
  .../tts-ggml/vcpkg/triplets/../toolchains/linux-clang.cmake ...

Match parakeet's working pattern (qvac-lib-infer-parakeet/vcpkg/
toolchains/linux-clang.cmake): use unversioned `clang` / `clang++` so
each runner picks up its image's default clang (clang-15 on
ubuntu-22.04, clang-18 on ubuntu-24.04, whatever the AI runners ship).
The `-stdlib=libc++` flag added by x64-linux.cmake / arm64-linux.cmake
is honoured by every reasonable clang version.

Co-authored-by: Cursor <cursoragent@cursor.com>
Bundle of correctness, hygiene, and CI-doc fixes from the recent code
review.  Each item below has its own paragraph in the diff comments.

- #1 files-array: add test/utils/runSupertonicTTS.js + test/data/sentences-{medium,long}.js
  to package.json so consumers running the integration tests from the
  npm tarball don't crash with `Cannot find module ../utils/runSupertonicTTS`.
- #2 deps: move @qvac/langdetect-text from runtime dependencies to
  devDependencies (it's only referenced from examples/, which aren't in
  the published files list).
- #3 race-fix: ChatterboxModel::process()'s post-synthesize streaming
  detection used to read engine_->options() outside engineMu_, racing
  with reload().  synthesize() now returns SynthesizeResult { pcm,
  wasStreaming } where wasStreaming is captured under the engine lock
  against the local shared_ptr so process() doesn't have to touch
  engine_ again.
- #4 deferred-load: ChatterboxModel + SupertonicModel constructors
  used to call load() eagerly, so JsInterface::createInstance() (sync
  on the JS thread) was parsing ~370 MB of GGUF on the Bare event loop.
  Both models now implement IModelAsyncLoad: constructors validate +
  return; the actual load is deferred to waitForLoadInitialization(),
  which the new addon_js::activate wraps inside JsAsyncTask::run so the
  parse runs on a worker thread.  binding.cpp registers
  addon_js::activate in place of JsInterface::activate; tts.js now
  awaits the resulting promise.
- #5 dead code: drop _resolvePath (unused), drop the (void)inputObj
  read in AddonJs.hpp::runJob, document FAILED_TO_PAUSE /
  FAILED_TO_STOP / JOB_ALREADY_RUNNING in lib/error.js as reserved-but-
  not-thrown so future maintainers don't delete them blindly (the unit
  suite asserts the values).
- #6 cancel-reset: SupertonicModel grew Chatterbox's cancelRequested_
  reset pattern: cancel() sets it, synthesize() fast-fails on it,
  process() resets it per call so a stale cancel doesn't poison the
  next run.
- #7 useGPU comment: explain in JSAdapter::buildChatterboxConfig that
  the JS layer is the source of truth for useGPU and nGpuLayers wins
  downstream; left a pointer to std::optional<bool> if a future caller
  ever needs to distinguish "absent" from "explicit false".
- #10 fork pointers: README.md and test/utils/downloadModel.js no
  longer point at GustavoA1604/chatterbox.cpp; both reference the
  upstream tetherto/qvac-ext-lib-whisper.cpp/tts-cpp tree now.
- #9 doc: integration-mobile-test-tts-ggml.yml gained a header comment
  on the build-and-test job documenting that continue-on-error is the
  early-days landing posture (merge-guard treats success || skipped as
  pass), with a pointer to tighten once Device Farm provisioning is
  stable.
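
The cancel-reset pattern from item #6 lives in C++; the sketch below is only a single-threaded JS analogue to show the three touch points (set in `cancel()`, fast-fail in the synthesis loop, reset at the top of `process()`). The class and its per-word step callback are hypothetical.

```javascript
'use strict'

// Illustrative analogue of the cancel-reset pattern; not the addon's code.
class CancellableModel {
  constructor (synthesizeStep) {
    this._cancelRequested = false
    this._synthesizeStep = synthesizeStep
  }

  cancel () {
    this._cancelRequested = true
  }

  _synthesize (text) {
    const out = []
    for (const word of text.split(' ')) {
      // Fast-fail as soon as a cancel is observed.
      if (this._cancelRequested) throw new Error('cancelled')
      out.push(this._synthesizeStep(word, this))
    }
    return out
  }

  process (text) {
    // Reset per call so a cancel() left over from an earlier job does
    // not abort this one.
    this._cancelRequested = false
    return this._synthesize(text)
  }
}
```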

Nits:
- 'use strict' added to addonLogging.js (matches every other .js).
- node-vs-bare runtime banners on
  scripts/{generate,validate}-mobile-integration-tests.js.
- ttsOutputDebugString no longer JSON.stringify's the full PCM
  Int16Array on every chunk-streaming event; emits a tiny summary
  ({sampleRate, chunkIndex, isLast, sentenceChunk, outputArrayLen})
  instead.
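
A minimal sketch of the summary-instead-of-payload idea, assuming a chunk object that carries the five fields named above plus an `outputArray`. `summarizeChunk` is a hypothetical name, not the real `ttsOutputDebugString`.

```javascript
'use strict'

// Hypothetical debug formatter: describe a chunk without serializing
// its PCM samples.
function summarizeChunk (chunk) {
  return JSON.stringify({
    sampleRate: chunk.sampleRate,
    chunkIndex: chunk.chunkIndex,
    isLast: chunk.isLast,
    sentenceChunk: chunk.sentenceChunk,
    outputArrayLen: chunk.outputArray ? chunk.outputArray.length : 0
  })
}
```

The output size stays roughly constant per event instead of growing with the number of samples in the chunk.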

Tests: 35 passing (33 -> 35; two new assertions cover the deferred-load
contract); 4 skipped real-GGUF tests behind the existing
QVAC_TEST_CHATTERBOX_T3_GGUF / QVAC_TEST_CHATTERBOX_S3GEN_GGUF /
QVAC_TEST_SUPERTONIC_GGUF env-var gates.  Lint clean.

Co-authored-by: Cursor <cursoragent@cursor.com>
Four independent failures, one per platform:

1. linux-x64 / linux-arm64: addon load crashed at
   `libomp.so.5: cannot open shared object file`.  tts-cpp's binary is
   built with clang under the linux-clang toolchain and links against
   libomp (LLVM OpenMP runtime); only `libgomp1` (GNU OpenMP) was being
   apt-installed.  Add `libomp5` so libomp.so.5 is on the loader path.

2. darwin-arm64: convert-models.sh aborted at line 200 with
   `hf_args[@]: unbound variable`.  macOS's system bash is 3.2, which
   under `set -u` treats expanding an empty array via `"${arr[@]}"` as
   an unbound-variable error; with HF_TOKEN unset we hit it on every
   fresh runner.  Use the `${arr[@]+"${arr[@]}"}` idiom (expand only
   if set) at all six call sites and add a header comment so the next
   maintainer doesn't accidentally regress.

3. darwin-x64: pip install bombed building `llvmlite` from source
   because the macos-15-large runner has no LLVM 15 development
   install.  Root cause: librosa pulls in numba 0.65+, which stopped
   shipping darwin-x86_64 wheels for Python 3.12.  Pin Python to 3.11
   in the Setup Python step; 3.11 has prebuilt wheels for the entire
   numba/llvmlite/librosa stack on darwin-x64 and is fine for every
   other converter dependency.

4. windows-2022: ChatterboxModel::load threw
   `vk::createInstance: ErrorIncompatibleDriver`.  Root cause: the
   addon's index.js::_validateConfig defaults `useGPU = true` when
   neither useGPU nor nGpuLayers is specified, so the test ran with
   n_gpu_layers=99 -> ggml_backend_vk_init -> vk::createInstance ->
   ErrorIncompatibleDriver on the runner's no-Vulkan-driver image.
   runChatterboxTTS.js now honours `process.env.NO_GPU === 'true'`
   (set on the no-GPU matrix entries) and forces useGPU=false on
   exactly those runners; the other test runners (chatterbox-mtl,
   gpu-smoke, multiple-runs) already had this guard.

Also documents the `mesa-vulkan-drivers` apt package (already pulled
in) as the software ICD that lets the Vulkan-built prebuild's runtime
backend probe enumerate at least one device on linux runners.
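
The test-side guard from item 4 can be sketched as below. `resolveTestUseGPU` is a hypothetical helper; the `NO_GPU === 'true'` check and the "GPU on by default" rule come from the text above, while deriving the value from `nGpuLayers` when only that option is set is an assumption.

```javascript
'use strict'

// Hypothetical helper: decide the useGPU value a test passes to the addon.
function resolveTestUseGPU (opts = {}, env = process.env) {
  // No-GPU matrix entries export NO_GPU=true; force CPU there.
  if (env.NO_GPU === 'true') return false
  if (opts.useGPU !== undefined) return opts.useGPU
  // Mirror the addon default: GPU on when neither useGPU nor
  // nGpuLayers is specified.
  if (opts.nGpuLayers === undefined) return true
  // Assumption: with only nGpuLayers given, any positive value means GPU.
  return opts.nGpuLayers > 0
}
```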

Co-authored-by: Cursor <cursoragent@cursor.com>
Mobile build failed at `:app:createBundleReleaseJsAndAssets` with:

  SyntaxError: assets/testAssets/chatterbox-s3gen.gguf:
    Cannot create a string longer than 0x1fffffe8 characters

Root cause: Metro's bundler reads every asset under
`test/mobile/testAssets/` via `Buffer.toString()`.  V8's max string
length is 0x1fffffe8 (~512 MiB).  chatterbox-s3gen.gguf is ~1 GiB even
with --quant q4_0 because the s3gen converter only quantizes attention
weights and leaves the bulk of the s3gen graph in fp16 ("0/291 weight
tensors quantized" in the converter log).

Fix: bundle ONLY supertonic.gguf (~125 MiB, comfortably under the
limit) on mobile.  Mobile Chatterbox tests degrade cleanly to
`t.pass('Skipped: Chatterbox GGUFs not available')` via the existing
`ensureChatterboxModels` helper -- it already returns
{ success: false } when the GGUFs aren't on disk.
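
The size arithmetic behind the fix, as a quick check. The limit is the value quoted in the error message; treating one byte of the asset as one string character is an assumption, since the commit does not state which encoding Metro uses.

```javascript
'use strict'

// V8's maximum string length, as quoted in the Metro error message.
const V8_MAX_STRING_LENGTH = 0x1fffffe8 // 536870888

const MiB = 1024 * 1024

// Assumes one string character per byte read from the asset.
function fitsInV8String (sizeBytes) {
  return sizeBytes <= V8_MAX_STRING_LENGTH
}

console.log(fitsInV8String(125 * MiB)) // supertonic.gguf, ~125 MiB: true
console.log(fitsInV8String(1024 * MiB)) // chatterbox-s3gen.gguf, ~1 GiB: false
```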

Cache key bumped to v2 so existing v1 cache entries (which include
the chatterbox files) are evicted on the next run.

Bundling Chatterbox on mobile requires either:
  - adding `gguf` to qvac-test-addon-mobile's metro `assetExts` so the
    JS-string read is skipped (then the s3gen file can flow through the
    bundle as a raw asset), or
  - pushing the chatterbox GGUFs to the device via `adb push` outside
    the bundle and surfacing the path through downloadModel.js's
    existing ANDROID_CANDIDATE_DIRS fallback.

Both are outside the scope of this PR; documented inline above the
cache step for the next maintainer.

Co-authored-by: Cursor <cursoragent@cursor.com>
* fix[api]: make useGPU flag actually force CPU/GPU and reject useGPU/nGpuLayers conflicts

* add gpu smoke test

* resolve comments

---------

Co-authored-by: Ishan Vohra <ishanvohra@Ishans-MacBook-Air.local>
@GustavoA1604 GustavoA1604 merged commit b4324a5 into main May 11, 2026
16 of 26 checks passed
@GustavoA1604 GustavoA1604 deleted the feat/tts-ggml branch May 11, 2026 21:05
Proletter pushed a commit that referenced this pull request May 24, 2026
…#1983)

* feat: add @qvac/tts-ggml package (Chatterbox English on qvac-tts.cpp)

* chore: point tts-ggml vcpkg baseline at the tts-cpp-bearing registry commit

* chore: tts-ggml: trim tests + examples to Chatterbox English, restore mobile wrapper

  the Supertonic + LavaSR + Chatterbox-multilingual + Cangjie
  downloaders.  Keeps the shared HTTP/curl infrastructure and
  `ensureWhisperModel` (still used by the integration WER check).
  `ensureChatterboxModels` is now **check-only**: it verifies
  `chatterbox-t3-turbo.gguf` + `chatterbox-s3gen.gguf` exist locally
  and, if missing, prints the exact commands for generating them
  from the qvac-tts.cpp (née chatterbox.cpp) conversion scripts.
  Once the GGUFs land on a canonical HuggingFace repo we'll wire up
  download URLs here.
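
The `modelDir` / explicit-override behaviour that the rewritten `chatterbox.inference.test.js` covers can be sketched roughly as below. This is a minimal illustration, not the addon's actual code: `resolveChatterboxFiles` is a hypothetical helper name, the two default filenames are the ones `ensureChatterboxModels` checks for, and Node's `path` module is assumed (under Bare an equivalent path module would take its place).

```javascript
'use strict'
const path = require('path')

// Default GGUF filenames, as checked by ensureChatterboxModels.
const DEFAULT_T3 = 'chatterbox-t3-turbo.gguf'
const DEFAULT_S3GEN = 'chatterbox-s3gen.gguf'

// Hypothetical helper: derive both model paths from a `files` object.
// Explicit t3Model / s3genModel entries win over the modelDir defaults.
function resolveChatterboxFiles (files = {}) {
  const { modelDir, t3Model, s3genModel } = files
  if (!modelDir && !(t3Model && s3genModel)) {
    throw new Error('Provide modelDir, or both t3Model and s3genModel')
  }
  return {
    t3Model: t3Model || path.join(modelDir, DEFAULT_T3),
    s3genModel: s3genModel || path.join(modelDir, DEFAULT_S3GEN)
  }
}
```

For example, `resolveChatterboxFiles({ modelDir: './models', t3Model: '/tmp/custom-t3.gguf' })` would keep the explicit T3 path and fall back to `./models/chatterbox-s3gen.gguf` for the second file.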

## Scripts

* `scripts/ensure-chatterbox.js` — simplify to a single invocation
  against `./models/`.  Drops the variant / language matrix that the
  ONNX downloader needed.
* `scripts/ensure-models.js` — now a thin alias to
  `ensure-chatterbox.js`.  Drops the Supertonic + LavaSR orchestration.

## Mobile

* Restored `test/mobile/{integration.auto.cjs, integration-runtime.cjs,
  testAssets/jfk.wav}` so the Android build has a wrapper to point at.
* `package.json`: re-added `test/mobile` to the `files` list.

## Gitignore

* Ignore generated `.clang-format` / `.clang-tidy` / `.valgrind.supp`
  (produced by the top-level `configure_file(...)` calls) and
  `build_*/` dirs (bare-make convention).

## Verified locally

* `npx standard "test/**/*.js" "*.js" "lib/*.js"` — clean.
* `npm run test:unit` — 38/38 pass (105/105 asserts).
* `npm run build && bare examples/chatterbox-tts.js "Hello from qvac tts ggml."`
  produces a 24 kHz wav as expected.

* Add streaming support

* Update ggml backend to use separate ggml repo

* tts-ggml: consume renamed tts-cpp library (2026-04-24#1)

Upstream chatterbox.cpp renamed the package + namespace + target from
qvac-tts to tts-cpp and tightened the library boundary; pick up the
new artefacts here:

- find_package(qvac-tts-cpp CONFIG REQUIRED)
    -> find_package(tts-cpp CONFIG REQUIRED)
- qvac-tts::qvac-tts  -> tts-cpp::tts-cpp
- qvac_tts::chatterbox -> tts_cpp::chatterbox (engine ptrs, EngineOptions,
  SynthesisResult, forward-decls in ChatterboxModel.hpp)
- #include <qvac-tts/chatterbox/engine.h>
    -> #include <tts-cpp/chatterbox/engine.h>
- Doxygen / inline doc references to the old names refreshed alongside
  the code changes.

vcpkg wiring:
- vcpkg-configuration.json baseline bumped to qvac-registry-vcpkg
  commit bc30b0b (ports/tts-cpp renamed and repointed at
  chatterbox.cpp@f8f9145).
- vcpkg.json tts-cpp constraint bumped to 2026-04-24#1 (the port that
  carries the rename + namespace + install(EXPORT) changes).

Verified with a cold bare-make generate + bare-make build against the
new port, and the addon's existing unit + integration test suites.

Made-with: Cursor

* tts-ggml: bump tts-cpp port to 2026-05-07 + registry baseline

Picks up the round-3 review-fix wave landed on the tts-cpp port:

  e673182  scrub stale patches/ refs from README                (N10)
  8ba10a6  drop unreachable TTS_CPP_GGML_LIB_PREFIX block        (N8)
  4b5d2d7  mirror N1-N7 fixes from chatterbox.cpp source-of-truth
            - N1 supertonic alive-registry guard against freed-backend
              gallocr_free assert on hot-swap (Vulkan/Metal/CUDA)
            - N2 drop dead g_sink_* state, soften log_set docstring
            - N3 Turbo BPE try/catch (exception-safe Engine ctor)
            - N4 STFT cancel checkpoint + tighter Engine::cancel() doc
            - N5 document s3gen_preload/unload refcount semantics
            - N6 drop dead cached_text_lc Supertonic shim
            - N7 fix misleading "no copy" view-vs-copy log wording

Plus the integrated-port-only round-2 fixes that landed earlier:

  fa0d490  close patches/-deleted regression: TTS_CPP_USE_SYSTEM_GGML
            now defaults ON; bundled-without-patches hard-errors at
            configure time with a pointer at the ggml-speech vcpkg
            port.
  ae34c58  README rewritten for integrated/vcpkg context.
  a2f2dd6  top-level qvac-ext-lib-whisper.cpp README points at the
            tts-cpp/ subtree (alongside parakeet-cpp/).

Public API used by ChatterboxModel (tts_cpp::chatterbox::Engine /
EngineOptions / SynthesisResult / s3gen_preload / s3gen_unload) is
backward-compatible: the new port adds Engine::backend_name(),
MTL-variant fields on EngineOptions (language / cfg_weight / min_p /
exaggeration), and a separate tts_cpp::supertonic::Engine class, but
nothing this consumer was already calling has changed.

Edits:

  packages/tts-ggml/vcpkg.json
    - tts-cpp dep: version>=2026-04-24#1 -> version>=2026-05-07.

  packages/tts-ggml/vcpkg-configuration.json
    - default-registry baseline: bc30b0b (April 2026 fork-only state)
      -> 16b91afdcfd59baea60e81f3da94f49311ef2a97.  The new baseline
      pulls in the post-tetherto-merge state (parakeet-cpp port at
      932d5d9, ggml-speech port-version 1 at f07bdd0) plus the new
      tts-cpp port (16b91af) on the developer's GustavoA1604
      registry fork.

Smoke-test plan: after running `vcpkg install` against the new
baseline, the tts-cpp port's vcpkg_from_github resolves at
GustavoA1604/qvac-ext-lib-whisper.cpp@e673182 (tts-cpp branch) until the
upstream PR merges.  ChatterboxModel should build and synthesize
identically; expanding to Multilingual + Supertonic flows is the
follow-up commit on the package side.

Co-authored-by: Cursor <cursoragent@cursor.com>

* Add chatterbox multilingual and supertonic

* Add mobile integration tests

* tts-ggml: drop clang-19 pin in linux-clang toolchain

The toolchain hardcoded `clang-19` / `clang++-19` (versioned binary
names) since the package's first commit (0a2c978).  Linux CI hadn't
exercised this path before — the new on-pr-tts-ggml.yml -> integration
matrix is the first time it does, and it fails on every linux runner
(ai-run-ubuntu-22.04, ai-run-linux-gpu, ubuntu-24.04-arm) at vcpkg's
"detect_compiler" step because none of the GH-hosted images ship a
`clang-19` symlink:

  Detecting compiler hash for triplet x64-linux...
  error: while detecting compiler information:
  ...
  CMake Error at scripts/cmake/vcpkg_execute_required_process.cmake:127
  (message): Command failed: ... -DVCPKG_CHAINLOAD_TOOLCHAIN_FILE=
  .../tts-ggml/vcpkg/triplets/../toolchains/linux-clang.cmake ...

Match parakeet's working pattern (qvac-lib-infer-parakeet/vcpkg/
toolchains/linux-clang.cmake): use unversioned `clang` / `clang++` so
each runner picks up its image's default clang (clang-15 on
ubuntu-22.04, clang-18 on ubuntu-24.04, whatever the AI runners ship).
The `-stdlib=libc++` flag added by x64-linux.cmake / arm64-linux.cmake
is honoured by every reasonable clang version.

Co-authored-by: Cursor <cursoragent@cursor.com>

* Add C++ tests and coverage; fix linux build

* tts-ggml: address PR review feedback

Bundle of correctness, hygiene, and CI-doc fixes from the recent code
review.  Each item below has its own paragraph in the diff comments.

- #1 files-array: add test/utils/runSupertonicTTS.js + test/data/sentences-{medium,long}.js
  to package.json so consumers running the integration tests from the
  npm tarball don't crash with `Cannot find module ../utils/runSupertonicTTS`.
- #2 deps: move @qvac/langdetect-text from runtime dependencies to
  devDependencies (it's only referenced from examples/, which aren't in
  the published files list).
- #3 race-fix: ChatterboxModel::process()'s post-synthesize streaming
  detection used to read engine_->options() outside engineMu_, racing
  with reload().  synthesize() now returns SynthesizeResult { pcm,
  wasStreaming } where wasStreaming is captured under the engine lock
  against the local shared_ptr so process() doesn't have to touch
  engine_ again.
- #4 deferred-load: ChatterboxModel + SupertonicModel constructors
  used to call load() eagerly, so JsInterface::createInstance() (sync
  on the JS thread) was parsing ~370 MB of GGUF on the Bare event loop.
  Both models now implement IModelAsyncLoad: constructors validate +
  return; the actual load is deferred to waitForLoadInitialization(),
  which the new addon_js::activate wraps inside JsAsyncTask::run so the
  parse runs on a worker thread.  binding.cpp registers
  addon_js::activate in place of JsInterface::activate; tts.js now
  awaits the resulting promise.
- #5 dead code: drop _resolvePath (unused), drop the (void)inputObj
  read in AddonJs.hpp::runJob, document FAILED_TO_PAUSE /
  FAILED_TO_STOP / JOB_ALREADY_RUNNING in lib/error.js as reserved-but-
  not-thrown so future maintainers don't delete them blindly (the unit
  suite asserts the values).
- #6 cancel-reset: SupertonicModel grew Chatterbox's cancelRequested_
  reset pattern: cancel() sets it, synthesize() fast-fails on it,
  process() resets it per call so a stale cancel doesn't poison the
  next run.
- #7 useGPU comment: explain in JSAdapter::buildChatterboxConfig that
  the JS layer is the source of truth for useGPU and that nGpuLayers
  wins downstream; leave a pointer to std::optional<bool> in case a
  future caller needs to distinguish "absent" from "explicit false".
- #10 fork pointers: README.md and test/utils/downloadModel.js no
  longer point at GustavoA1604/chatterbox.cpp; both reference the
  upstream tetherto/qvac-ext-lib-whisper.cpp/tts-cpp tree now.
- #9 doc: integration-mobile-test-tts-ggml.yml gained a header comment
  on the build-and-test job documenting that continue-on-error is the
  early-days landing posture (merge-guard treats success || skipped as
  pass), with a pointer to tighten once Device Farm provisioning is
  stable.
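
The cancel-reset pattern from item #6 lives in C++ (`cancelRequested_` on the model classes). Purely as an illustration, a JavaScript analogue might look like the sketch below. `CancellableModel` and its methods are hypothetical names, the uppercase transform is a stand-in for real synthesis, and resetting at the very start of `process()` is an assumption about where the reset sits.

```javascript
'use strict'

// Hypothetical JS analogue of the cancel flag described for SupertonicModel.
class CancellableModel {
  constructor () {
    this.cancelRequested = false
  }

  // cancel() only raises the flag; the running job notices it on its own.
  cancel () {
    this.cancelRequested = true
  }

  // Stand-in for synthesis: fails fast as soon as the flag is seen.
  synthesize (chunks) {
    const out = []
    for (const chunk of chunks) {
      if (this.cancelRequested) throw new Error('cancelled')
      out.push(chunk.toUpperCase())
    }
    return out
  }

  // Reset per call so a cancel() left over from an earlier run is ignored.
  process (chunks) {
    this.cancelRequested = false
    return this.synthesize(chunks)
  }
}
```

Without the reset in `process()`, a `cancel()` issued after one job finished would make the next job fail immediately, which is the "stale cancel" case the item describes.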

Nits:
- 'use strict' added to addonLogging.js (matches every other .js).
- node-vs-bare runtime banners on
  scripts/{generate,validate}-mobile-integration-tests.js.
- ttsOutputDebugString no longer JSON.stringify's the full PCM
  Int16Array on every chunk-streaming event; emits a tiny summary
  ({sampleRate, chunkIndex, isLast, sentenceChunk, outputArrayLen})
  instead.
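
A rough sketch of the summary-only debug string from the last nit, assuming a chunk object that carries the PCM in an `outputArray` field (the field name and `summarizeChunk` are hypothetical; the summary keys are the ones listed above):

```javascript
'use strict'

// Hypothetical chunk shape:
// { outputArray: Int16Array, sampleRate, chunkIndex, isLast, sentenceChunk }
function summarizeChunk (chunk) {
  // Report only the length of the PCM buffer, never its samples.
  return JSON.stringify({
    sampleRate: chunk.sampleRate,
    chunkIndex: chunk.chunkIndex,
    isLast: chunk.isLast,
    sentenceChunk: chunk.sentenceChunk,
    outputArrayLen: chunk.outputArray ? chunk.outputArray.length : 0
  })
}
```

Calling `JSON.stringify` on the typed array itself would serialize every sample, so one second of 24 kHz audio turns into tens of thousands of entries per logged event; the summary stays around a hundred characters regardless of chunk size.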

Tests: 35 passing (33 -> 35; two new assertions cover the deferred-load
contract); 4 skipped real-GGUF tests behind the existing
QVAC_TEST_CHATTERBOX_T3_GGUF / QVAC_TEST_CHATTERBOX_S3GEN_GGUF /
QVAC_TEST_SUPERTONIC_GGUF env-var gates.  Lint clean.

Co-authored-by: Cursor <cursoragent@cursor.com>

* tts-ggml: unblock CI integration tests on every desktop runner

Four independent failures, one per platform group:

1. linux-x64 / linux-arm64: addon load crashed at
   `libomp.so.5: cannot open shared object file`.  tts-cpp's binary is
   built with clang under the linux-clang toolchain and links against
   libomp (LLVM OpenMP runtime); only `libgomp1` (GNU OpenMP) was being
   apt-installed.  Add `libomp5` so libomp.so.5 is on the loader path.

2. darwin-arm64: convert-models.sh aborted at line 200 with
   `hf_args[@]: unbound variable`.  macOS's system bash is 3.2, which
   under `set -u` treats expanding an empty array via `"${arr[@]}"` as
   an unbound-variable error; with HF_TOKEN unset we hit it on every
   fresh runner.  Use the `${arr[@]+"${arr[@]}"}` idiom
   (defined-or-nothing) at all six call sites and add a header comment
   so the next maintainer doesn't accidentally regress.

3. darwin-x64: pip install failed while building `llvmlite` from source
   because the macos-15-large runner has no LLVM 15 development
   install.  Root cause: librosa pulls in numba 0.65+, which stopped
   shipping darwin-x86_64 wheels for Python 3.12.  Pin Python to 3.11
   in the Setup Python step; 3.11 has prebuilt wheels for the entire
   numba/llvmlite/librosa stack on darwin-x64 and is fine for every
   other converter dependency.

4. windows-2022: ChatterboxModel::load threw
   `vk::createInstance: ErrorIncompatibleDriver`.  Root cause: the
   addon's index.js::_validateConfig defaults `useGPU = true` when
   neither useGPU nor nGpuLayers is specified, so the test ran with
   n_gpu_layers=99 -> ggml_backend_vk_init -> vk::createInstance ->
   ErrorIncompatibleDriver on the runner's no-Vulkan-driver image.
   runChatterboxTTS.js now honours `process.env.NO_GPU === 'true'`
   (set on the no-GPU matrix entries) and forces useGPU=false on
   exactly those runners; the other test runners (chatterbox-mtl,
   gpu-smoke, multiple-runs) already had this guard.
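
The `NO_GPU` guard from item 4 might look roughly like this. It is only a sketch: `applyNoGpuGuard` is a hypothetical name, and the real `runChatterboxTTS.js` may apply the override differently.

```javascript
'use strict'

// Hypothetical helper mirroring the guard described for runChatterboxTTS.js.
// `env` defaults to process.env but can be injected for testing.
function applyNoGpuGuard (config = {}, env = process.env) {
  if (env.NO_GPU === 'true') {
    // Only the exact string 'true' forces CPU, matching the matrix entries.
    return { ...config, useGPU: false }
  }
  // Otherwise leave the caller's config untouched (copied, not mutated).
  return { ...config }
}
```

On a runner where `NO_GPU` is unset, the config passes through unchanged, so the addon's own `useGPU` default still applies there.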

Also documents the `mesa-vulkan-drivers` apt package (already pulled
in) as the software ICD that lets the Vulkan-built prebuild's runtime
backend probe enumerate at least one device on linux runners.

Co-authored-by: Cursor <cursoragent@cursor.com>

* tts-ggml: drop Chatterbox from mobile bundle (Metro V8 string limit)

Mobile build failed at `:app:createBundleReleaseJsAndAssets` with:

  SyntaxError: assets/testAssets/chatterbox-s3gen.gguf:
    Cannot create a string longer than 0x1fffffe8 characters

Root cause: Metro's bundler reads every asset under
`test/mobile/testAssets/` via `Buffer.toString()`.  V8's max string
length is 0x1fffffe8 (~512 MiB).  chatterbox-s3gen.gguf is ~1 GiB even
with --quant q4_0 because the s3gen converter only quantizes attention
weights and leaves the bulk of the s3gen graph in fp16 ("0/291 weight
tensors quantized" in the converter log).
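
The size arithmetic can be checked with a few lines. This is a back-of-the-envelope sketch: the limit is the value from the error message, the file sizes are the approximate figures quoted in this commit, and comparing bytes against a character limit assumes a decoded string has at most one character per input byte.

```javascript
'use strict'

const V8_MAX_STRING_LENGTH = 0x1fffffe8 // limit quoted in the Metro error
const MiB = 1024 * 1024

// A file can only be read into a single JS string if it fits under the limit.
function fitsInOneString (sizeBytes) {
  return sizeBytes <= V8_MAX_STRING_LENGTH
}

console.log(V8_MAX_STRING_LENGTH) // 536870888
console.log(512 * MiB - V8_MAX_STRING_LENGTH) // 24, i.e. just under 512 MiB
console.log(fitsInOneString(125 * MiB)) // true  (supertonic.gguf, approx.)
console.log(fitsInOneString(1024 * MiB)) // false (chatterbox-s3gen.gguf, approx.)
```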

Fix: bundle ONLY supertonic.gguf (~125 MiB, comfortably under the
limit) on mobile.  Mobile Chatterbox tests degrade cleanly to
`t.pass('Skipped: Chatterbox GGUFs not available')` via the existing
`ensureChatterboxModels` helper -- it already returns
{ success: false } when the GGUFs aren't on disk.

Cache key bumped to v2 so existing v1 cache entries (which include
the chatterbox files) are evicted on the next run.

Bundling Chatterbox on mobile requires either:
  - adding `gguf` to qvac-test-addon-mobile's metro `assetExts` so the
    JS-string read is skipped (then the s3gen file can flow through the
    bundle as a raw asset), or
  - pushing the chatterbox GGUFs to the device via `adb push` outside
    the bundle and surfacing the path through downloadModel.js's
    existing ANDROID_CANDIDATE_DIRS fallback.

Both are outside the scope of this PR; documented inline above the
cache step for the next maintainer.

Co-authored-by: Cursor <cursoragent@cursor.com>

* Bump hash of vcpkg

* Consume vcpkg from tetherto repository

* Fix integration test failures on all platforms

* Further fix tests

* fix: Make useGPU flag more meaningful (#1953)

* fix[api]: make useGPU flag actually force CPU/GPU and reject useGPU/nGpuLayers conflicts

* add gpu smoke test

* resolve comments
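
One way the "force CPU/GPU and reject conflicts" behaviour could be expressed is sketched below. Everything here is an assumption made for illustration: `resolveGpuConfig` is a hypothetical name, the exact conflict rules and error messages are guesses, and the default of 99 layers is taken from the `n_gpu_layers=99` value seen in the CI failure rather than from the addon's source.

```javascript
'use strict'

const DEFAULT_GPU_LAYERS = 99 // assumed default, from the CI log value

// Hypothetical validator; the real _validateConfig may differ.
function resolveGpuConfig ({ useGPU, nGpuLayers } = {}) {
  if (useGPU === false && nGpuLayers > 0) {
    throw new Error('useGPU=false conflicts with nGpuLayers > 0')
  }
  if (useGPU === true && nGpuLayers === 0) {
    throw new Error('useGPU=true conflicts with nGpuLayers = 0')
  }
  // With no explicit flag, infer from nGpuLayers; default to GPU if both are absent.
  if (useGPU === undefined) {
    useGPU = nGpuLayers === undefined ? true : nGpuLayers > 0
  }
  return { useGPU, nGpuLayers: useGPU ? (nGpuLayers ?? DEFAULT_GPU_LAYERS) : 0 }
}
```

Under these assumed rules, `resolveGpuConfig({ useGPU: false })` yields a CPU-only config with zero offloaded layers, while `resolveGpuConfig({ useGPU: false, nGpuLayers: 10 })` throws instead of silently picking one of the two options.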

---------

Co-authored-by: Ishan Vohra <ishanvohra@Ishans-MacBook-Air.local>

* Update dependencies after monorepo directory changes

* Further drop qvac-lib- prefix

* Add CHANGELOG.md

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Ishan Vohra <ishanvohra2@gmail.com>
Co-authored-by: Ishan Vohra <ishanvohra@Ishans-MacBook-Air.local>