tts-cpp: publish 2026-06-12 - QVAC-19557 chatterbox memory (PR #43) + Android-safe symbols #188
Draft
ogad-tether wants to merge 2 commits into
… Android-safe symbols

Pins tetherto/qvac-ext-lib-whisper.cpp@8b012789 (PR #43 on top of master 1c75d6e9):

- Streamed GGUF tensor loads: no full-file host staging during chatterbox model loads (removes the +0.5..1 GB transient per load behind the iOS SDK jetsam kills).
- EngineOptions::kv_cache_type (f32|f16|q8_0): selectable T3 KV-cache dtype on a token-major slab; q8_0 stores the cache at ~27% of f32 and decodes 20-30% faster on Metal. Validated upstream: f32 is byte-identical to the previous layout, and Turbo greedy decoding is byte-identical across all three dtypes on CPU and Metal.
- Removes the last direct ggml_backend_is_cpu / ggml_get_type_traits_cpu references from tts-cpp (backend registry + ggml_quantize_chunk instead), so a static tts-cpp no longer leaves unresolvable UND symbols in Android GGML_BACKEND_DL=ON addon builds. That was the dlopen crash that forced the tts-ggml 0.2.2 revert and pinned everything back to 2026-06-03.

Consumed by @qvac/tts-ggml (kvCacheType knob + q8_0/nCtx=4096 chatterbox defaults, qvac PR #2527).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…bility probe)

Updates the 2026-06-12 publish (still in-PR, not yet on registry main) from 8b012789 to c8620cf9, which adds on top of the streaming-loads + q8_0-KV + Android-symbol work:

- chatterbox_resolve_kv_type: load-time ggml_backend_supports_op probe that falls back to f32 when a backend rejects the requested f16/q8_0 K/V (review follow-up on PR #43).
- Vulkan guard: quantized K/V forced to f32 on Vulkan, since ggml-vulkan's supports_op advertises q8_0 K/V FA but the NV_coopmat2 kernel faults at compute (toggle-confirmed q8_0 SIGSEGV vs f32 pass on NVIDIA coopmat2 CI runners). f16 + Metal/CPU q8_0 unaffected.

Same source repo/subfolder; new archive SHA512.
Publishes `tts-cpp@2026-06-12`, pinning `tetherto/qvac-ext-lib-whisper.cpp@8b012789`, the head of tetherto/qvac-ext-lib-whisper.cpp#43 (QVAC-19557), on top of master `1c75d6e9`.

What the pin brings:

- `EngineOptions::kv_cache_type` (`f32|f16|q8_0`): selectable T3 KV-cache dtype on a token-major slab; q8_0 stores the cache at ~27% of f32 and decodes 20-30% faster on Metal. Upstream-validated: f32 is byte-identical to the old layout; Turbo greedy decoding is byte-identical across all three dtypes (CPU + Metal).
- Removes the last direct `ggml_backend_is_cpu` / `ggml_get_type_traits_cpu` references from tts-cpp (backend registry + `ggml_quantize_chunk` instead). `nm -u libtts-cpp.a` is clean of both symbols, so a static tts-cpp no longer leaves unresolvable UND symbols in Android `GGML_BACKEND_DL=ON` addon builds. That was the dlopen crash that forced the tts-ggml 0.2.2 revert and pinned everything back to 2026-06-03.

Consumed by:

- `@qvac/tts-ggml` (tetherto/qvac#2527: `kvCacheType` knob + q8_0/nCtx=4096 chatterbox defaults). The addon's full gtest suite (42/42) passes when built against this revision via a local overlay port.

Draft until tetherto/qvac-ext-lib-whisper.cpp#43 merges. The pinned SHA is the PR-branch head (fetchable now and after the merge; happy to re-pin to the master merge commit instead once it lands if that's preferred).
🤖 Generated with Claude Code