
feat(tts-ggml): Android dynamic ggml backends#2168

Merged
GustavoA1604 merged 9 commits into main from feat/tts-ggml-dynamic-backend
May 21, 2026


GustavoA1604 commented May 20, 2026


What problem does this PR solve?

  • No dynamic-backend packaging in @qvac/tts-ggml: Android builds need libqvac-speech-ggml-{vulkan,opencl,cpu-android_armv*_*}.so next to the .bare module so ggml_backend_load_all_from_path() can discover backends at runtime. The addon had no BACKENDS_SUBDIR staging, no backendsDir / openclCacheDir JS surface, and no forwarding into tts_cpp::EngineOptions, unlike @qvac/transcription-parakeet and @qvac/llm-llamacpp.
  • Upstream tts-cpp not consumable yet: Registry-based backend selection landed in qvac-ext-lib-whisper.cpp#29 (EngineOptions::backends_dir, Adreno tier policy, init_gpu_backend()). This package must pin tts-cpp >= 2026-05-20 and depend on qvac-registry-vcpkg#159 (ggml-speech#4 + tts-cpp port bump).
  • Chatterbox useGPU: true default OOMs on Android: GPU backends mirror large f16 S3Gen weights (~1 GB) on top of the mmap'd CPU copy; 8 GB devices hit lmkd SIGKILL. Default should be CPU with explicit opt-in on capable hosts.
  • Mobile / CI test friction: iOS bundled reference wav paths were not readable from native code; integration tests relied on local HF→GGUF conversion in CI instead of the QVAC model registry; Chatterbox mobile tests used oversized f16 GGUFs.

How does it solve it?

CMake / prebuilds (packages/tts-ggml/CMakeLists.txt)

  • find_package(ggml) for GGML_AVAILABLE_BACKENDS + loose .so glob from vcpkg lib/.
  • bare_target + bare_module_target → BACKENDS_SUBDIR (android-arm64/qvac__tts-ggml, …).
  • add_bare_module(... EXPORTS INSTALL TARGET ggml::<backend>) + install(FILES libqvac-speech-ggml-*.so) for Vulkan/OpenCL MODULE backends not exposed as IMPORTED targets.
  • Android 16 KB page-size link flags (-Wl,-z,max-page-size=16384) — same as parakeet (Pixel 9 class devices).
  • Apple compiler-rt force_load for @available → __isPlatformVersionAtLeast (iOS Metal stability on unload/reload).

JS + C++ wiring

  • backendsDir / openclCacheDir on constructor options and TTSGgmlRuntimeConfig; default backendsDir → path.join(__dirname, 'prebuilds').
  • ChatterboxModel / SupertonicModel compose backendsDir / BACKENDS_SUBDIR into opts.backends_dir; forward opencl_cache_dir.
  • JSAdapter reads both fields for Chatterbox and Supertonic configs.
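The wiring above can be sketched roughly as follows. This is a minimal illustration, not the addon's code: the helper names resolveBackendsDir and toNativeOptions are hypothetical, BACKENDS_SUBDIR is hard-coded here to the example value from the CMake notes, and the real composition may happen on the C++ side rather than in JS.

```javascript
const path = require('path')

// Assumed value for illustration; the PR gives 'android-arm64/qvac__tts-ggml'
// as one example of a BACKENDS_SUBDIR.
const BACKENDS_SUBDIR = 'android-arm64/qvac__tts-ggml'

// Hypothetical helper: fall back to a prebuilds/ folder next to the module
// when the caller does not pass backendsDir, then append the per-target subdir.
function resolveBackendsDir (options = {}, moduleDir = __dirname, subdir = BACKENDS_SUBDIR) {
  const base = options.backendsDir ?? path.join(moduleDir, 'prebuilds')
  return path.join(base, subdir)
}

// Hypothetical helper: build the two native option fields named in the PR
// (backends_dir and opencl_cache_dir). The cache dir is only set when given.
function toNativeOptions (options = {}, moduleDir = __dirname) {
  const native = { backends_dir: resolveBackendsDir(options, moduleDir) }
  if (options.openclCacheDir) native.opencl_cache_dir = options.openclCacheDir
  return native
}
```

A caller that ships backends in a custom location would pass backendsDir explicitly; otherwise the default resolves relative to the module directory.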

GPU policy

  • useGPU defaults to false for Chatterbox (was true in 0.1.1). Opt in with config: { useGPU: true } on Metal / Vulkan / OpenCL hosts.
  • #ifdef __ANDROID__ in loadLocked(): forces n_gpu_layers = 0 and logs a warning if the host requested GPU — Vulkan (Mali) and OpenCL (Adreno) paths for Chatterbox/Supertonic graphs are not validated yet. Dynamic backend .so staging still matters on Android for per-arch CPU dlopen even while GPU stays off.
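As a rough model of the policy above: the function below is a hypothetical JS restatement, whereas the actual Android guard is a compile-time #ifdef __ANDROID__ in C++ that sets n_gpu_layers = 0. The name resolveGpuPolicy, the returned shape, and the warning text are assumptions for illustration only.

```javascript
// Hypothetical model of the GPU policy described in this PR.
// - useGPU is opt-in: anything other than an explicit `true` means CPU.
// - On Android a GPU request is overridden to CPU and a warning is produced.
function resolveGpuPolicy (config = {}, platform = process.platform) {
  const requested = config.useGPU === true
  if (requested && platform === 'android') {
    return {
      useGPU: false,
      warning: 'GPU requested on Android; falling back to CPU'
    }
  }
  return { useGPU: requested, warning: null }
}
```

Under this model, callers that relied on the old implicit GPU default get CPU unless they pass useGPU: true, and Android hosts get CPU either way.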

Tests & CI

  • @qvac/registry-client devDependency + downloadModel.js registry fetch (q4_0 Chatterbox T3, f16 S3Gen, Supertonic q4_0) with min/max size bands to reject stale caches.
  • resolveRefWavPath() — mobile global.assetPaths / Library/Caches/jfk.wav before in-bundle path (fixes iOS ModelFileNotFound).
  • Remove iOS workflow HF→GGUF conversion step; models fetched via registry in mobile/integration runs.
  • GPU smoke tests skip Android; lifecycle tests use CPU defaults.
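Two of the test helpers above lend themselves to a short sketch: the size-band check that rejects stale cached models, and the ordered lookup behind resolveRefWavPath(). The function names, parameters, and candidate ordering shown here are assumptions; the PR does not show the real implementations or the actual byte thresholds.

```javascript
const fs = require('fs')

// Hypothetical size-band check: a cached model file is considered usable only
// if it exists and its size falls inside [minBytes, maxBytes]. A truncated
// download or an older, differently quantized file would fall outside the band.
function isWithinSizeBand (filePath, minBytes, maxBytes) {
  let size
  try {
    size = fs.statSync(filePath).size
  } catch {
    return false // missing or unreadable file
  }
  return size >= minBytes && size <= maxBytes
}

// Hypothetical ordered lookup: return the first candidate path that exists,
// or null when none do. Falsy entries (e.g. an unset asset path) are skipped.
function firstExistingPath (candidates, exists = fs.existsSync) {
  for (const candidate of candidates) {
    if (candidate && exists(candidate)) return candidate
  }
  return null
}
```

A resolver in this style would list the mobile asset path and the Library/Caches copy ahead of the in-bundle path, so native code receives a location it can actually read.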

Docs & version

  • README updated: CPU-by-default, useGPU table, backendsDir / openclCacheDir knobs (mirrors parakeet).
  • CHANGELOG 0.1.2, package.json 0.1.2.
  • vcpkg.json: tts-cpp >= 2026-05-20.

Breaking changes

| Change | Migration |
| --- | --- |
| Chatterbox useGPU default true → false | Pass config: { useGPU: true } where you previously relied on the implicit GPU default (macOS Metal, CUDA desktop, etc.). |
| Android GPU requests ignored at addon boundary | Expected until Vulkan/Mali + OpenCL/Adreno validation completes; CPU + dynamic CPU backends still work. |

Supertonic remains CPU-only even when useGPU: true is passed at construction time.

How was it tested

Manual CI run here

GustavoA1604 requested review from a team as code owners May 20, 2026 20:44
GustavoA1604 changed the title from "Feat/tts ggml dynamic backend" to "feat(tts-ggml): Android dynamic ggml backends" May 20, 2026

github-actions Bot commented May 21, 2026


Tier-based Approval Status

**PR Tier:** TIER1

**Current Status:** ✅ APPROVED

**Requirements:**
- 1 Team Member approval ✅ (1/1)
- 1 Team Lead OR Management approval ✅ (1/1)



---
*This comment is automatically updated when reviews change.*

@GustavoA1604


/review

GustavoA1604 merged commit ec70978 into main May 21, 2026
20 of 22 checks passed
GustavoA1604 deleted the feat/tts-ggml-dynamic-backend branch May 21, 2026 11:30
Proletter pushed a commit that referenced this pull request May 24, 2026
* Add dynamic backend loading for android and model download in integration tests

* Remove gguf bundling from mobile integration test

* Add missing registry-client dependency

* Remove non-working GPUs

* Fix failing test

* Point to tetherto repo

* Remove redundant comments

* Update readme