
[TTS][SpeakerCacheManager] A global speaker cache manager for Voice Cloning #2630

Open
JuanPZuluaga wants to merge 19 commits into vllm-project:main from JuanPZuluaga:feat/general-speaker-cache-manager

Conversation


@JuanPZuluaga JuanPZuluaga commented Apr 9, 2026


Purpose

Consolidate speaker-embedding caching for all TTS backends (Qwen3TTS, FishSpeech, CosyVoice3, VoxCPM2, OmniVoice) behind one shared LRU cache, and make uploaded voices survive server restarts.

Changes

  • Single process-wide SpeakerEmbeddingCache (LRU, with byte and entry-count caps to bound aggregate memory) replaces 5 per-model caches. Deleting a voice invalidates every model's cache at once.
  • Uploaded voices persist as .safetensors in ~/.cache/vllm-omni/speakers/ (metadata in the header). Restored on server start.
  • Fish Speech / CosyVoice3 reject unknown voice names with 400.
  • Voxtral: inline ref_audio path restored.

New environment variables:

| Variable | Default |
| --- | --- |
| SPEAKER_SAMPLES_DIR | ~/.cache/vllm-omni/speakers |
| SPEAKER_MAX_UPLOADED | 1000 |
| SPEAKER_CACHE_MAX_BYTES | 512 MiB |
| SPEAKER_CACHE_MAX_ENTRIES | 1024 |
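A hypothetical helper showing how these variables might be read with the defaults above; the actual parsing in the PR may differ.

```python
import os


def speaker_cache_config() -> dict:
    """Read the speaker-cache environment variables with their documented defaults.

    Illustrative only; the helper name and return shape are assumptions.
    """
    return {
        "samples_dir": os.environ.get(
            "SPEAKER_SAMPLES_DIR",
            os.path.expanduser("~/.cache/vllm-omni/speakers"),
        ),
        "max_uploaded": int(os.environ.get("SPEAKER_MAX_UPLOADED", "1000")),
        "max_bytes": int(
            os.environ.get("SPEAKER_CACHE_MAX_BYTES", str(512 * 1024 * 1024))
        ),
        "max_entries": int(os.environ.get("SPEAKER_CACHE_MAX_ENTRIES", "1024")),
    }
```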

Test Plan

  • Unit tests — tests/test_speaker_cache.py (cache module, tuple keys, created_at isolation)
  • Integration tests — tests/test_speaker_cache_integration.py (end-to-end upload/cache/delete, stale-cache race)
  • Per-model cache tests — Fish Speech (tests/test_fish_speech_cache.py)
  • Pre-commit passes locally (pre-commit run --all-files)
  • Benchmark re-run on benchmarks/fish-speech/bench_speaker_cache.py pending

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan. Please provide the test scripts & test commands. Please state the reasons if your code doesn't require additional test scripts. For test file guidelines, please check the test style doc
  • The test results. Please paste the results comparison before and after, or the e2e results.
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model. Please run mkdocs serve to sync the documentation editions to ./docs.
  • (Optional) Release notes update. If your change is user-facing, please update the release notes draft.


@JuanPZuluaga changed the title from "[TTS][SpeakerCacheManager] Feat/general speaker cache manager" to "[TTS][SpeakerCacheManager] A global speaker cache manager for Voice Cloning" on Apr 9, 2026
@JuanPZuluaga
Contributor Author

things to do:

@chatgpt-codex-connector

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

@linyueqian
Collaborator

is it ready to be reviewed? I need to merge the voxcpm2 perf optimization (#2690) first, so this may need to wait for that one.

@JuanPZuluaga
Contributor Author

> is it ready to be reviewed? I need to merge the voxcpm2 perf optimization (#2690) first, so this may need to wait for that one.

True, thanks for the heads up

Collaborator

@lishunyang12 lishunyang12 left a comment


Review: [TTS][SpeakerCacheManager] A global speaker cache manager for Voice Cloning

Overall this is a solid improvement -- consolidating voice caching across all TTS backends (Fish Speech, CosyVoice3, OmniVoice, VoxCPM2, Qwen3 TTS) with a shared VoiceEmbeddingCache is the right direction. The LRU eviction, thread safety via lock, and per-voice clear() on delete are all welcome. However, I see several issues that should be addressed before merging.


Critical Issues

1. Stale cache on voice re-upload (regression)

The PR removes the created_at-based cache invalidation that previously prevented stale cache hits when a voice is deleted and re-uploaded with different audio. The new approach relies on clear(voice_name) being called on delete. However, there is no guarantee the cache is cleared on the model-side instances (CosyVoice3, Fish Speech, VoxCPM2, OmniVoice each create their own VoiceEmbeddingCache() in __init__). The serving_speech.py delete_voice() only calls self._voice_cache.clear(voice_name_lower) on its own cache instance -- it has no reference to the per-model caches. This means:

  • serving_speech._voice_cache gets cleared on delete (good)
  • cosyvoice3._voice_cache, fish_speech._voice_cache, voxcpm2._voice_cache, pipeline_omnivoice._voice_cache all retain stale entries (bug)

The PR title says "global" speaker cache manager, but the implementation creates 5+ independent instances. Either make it truly global (singleton or injected reference), or propagate invalidation to model-level caches. This is a correctness bug.

2. Each VoiceEmbeddingCache() defaults to 128 entries -- unbounded aggregate memory

With 5 independent caches (serving_speech, cosyvoice3, fish_speech, voxcpm2, omnivoice), the system can hold up to 640 cached voice entries. CosyVoice3 caches 4 tensors per voice (speech_feat, speech_token, speech_token_len, embedding). For long reference audio, speech_feat alone can be large. There is no aggregate memory limit, only an entry count limit. The memory_bytes() method exists but is never consulted for eviction decisions.

Consider adding a max_memory_bytes threshold that triggers eviction, or at minimum document the expected memory footprint per model type.

3. _resolve_uploaded_voice mutates request in-place -- surprising side effect

_resolve_uploaded_voice() modifies request.ref_audio and request.ref_text in place. This pattern is fragile: if the method is accidentally called twice (it is called in multiple code paths for omnivoice), the request state could become inconsistent. The guard request.ref_audio is not None at the top prevents double-injection, but it would be cleaner to return the resolved data rather than mutating.


Design Concerns

4. clear() with prefix matching is fragile

keys_to_remove = [k for k in self._cache if k.startswith(f"{voice_name}:")]

If a voice is named "alice" and another is "alice_v2", calling clear("alice") will NOT remove "alice_v2:default" (the : prevents it), so this is actually fine. But if voice names ever contain :, it would break. Consider validating voice names at upload time to reject colons.
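A small demonstration of this concern (cache contents here are hypothetical):

```python
# Hypothetical cache keyed as "{voice_name}:{suffix}", as in the review snippet.
cache = {"alice:default": 1, "alice_v2:default": 2, "a:b:default": 3}


def clear(voice_name: str) -> None:
    # Prefix match with a trailing ":" guard, mirroring the reviewed code.
    for k in [k for k in cache if k.startswith(f"{voice_name}:")]:
        del cache[k]


clear("alice")  # "alice_v2:default" survives: the ":" guards the prefix
assert "alice_v2:default" in cache and "alice:default" not in cache

clear("a")      # but a voice literally named "a" wrongly sweeps up "a:b:default"

# Tuple keys avoid the delimiter problem entirely: no string can make
# ("fish", "a:b") collide with ("fish:a", "b").
assert ("fish", "a:b") != ("fish:a", "b")
```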

5. serving_speech.py has duplicated VoxCPM2 handling

The _prepare_speech_generation method now has two separate VoxCPM2 branches:

  • Lines around the elif self._tts_model_type == "voxcpm2": in the non-diffusion path
  • A second elif self._tts_model_type == "voxcpm2": block further down in what appears to be the diffusion/fallback path

This duplication is confusing and error-prone. It's unclear which branch executes for a given request. Please consolidate or add clear comments explaining when each branch is reached.

6. Voxtral voice cloning support removed silently

The _build_voxtral_prompt change removes support for ref_audio entirely and now raises ValueError("Voxtral requires a voice name (preset voice).") if no voice is provided. This is a breaking change for users who were using inline ref_audio with Voxtral. Should be documented in the PR description.


Minor Issues

7. _init_voice_storage() uses /tmp/voice_samples default

This is fine for development but concerning for production. The default path should be documented, and ideally the directory should be configurable without environment variables (e.g., via server config).

8. stats() method calls memory_bytes() outside the lock, then acquires lock again

def stats(self) -> dict[str, Any]:
    memory = self.memory_bytes()  # acquires lock, releases
    with self._lock:              # acquires lock again
        return { ... }

This is not atomic -- entries could be added/removed between the two lock acquisitions, making memory_bytes inconsistent with entries. Consider computing everything under a single lock acquisition.
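A sketch of the suggested fix, computing everything under a single lock acquisition. The class and field names here are illustrative, not the PR's code.

```python
import threading
from typing import Any


class CacheStatsSketch:
    """Hypothetical fragment showing stats computed atomically under one lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._cache: dict[str, dict[str, Any]] = {}
        self._hits = 0
        self._misses = 0

    @staticmethod
    def _entry_bytes(entry: dict[str, Any]) -> int:
        # Assumption: cached artifacts expose nbytes (numpy arrays / tensors).
        return sum(getattr(v, "nbytes", 0) for v in entry.values())

    def stats(self) -> dict[str, Any]:
        # One lock acquisition: entry count and byte total cannot drift apart.
        with self._lock:
            return {
                "entries": len(self._cache),
                "memory_bytes": sum(
                    self._entry_bytes(e) for e in self._cache.values()
                ),
                "hits": self._hits,
                "misses": self._misses,
            }
```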

9. Tests removed for stale-cache protection

test_stale_cache_on_reupload, test_stale_cache_protection, test_make_cache_key_created_at_isolation, and test_created_at_zero_disables_cache are all removed. Since the new approach relies on explicit clear() on delete, there should be an integration test verifying that deleting and re-uploading a voice actually invalidates the model-side caches (not just the serving_speech cache). The new test_voice_cache_integration.py only tests a single cache instance.

10. _cosyvoice3_tokenizer attribute removal

The line self._cosyvoice3_tokenizer = None was removed from __init__. Verify this attribute isn't referenced elsewhere, as any lingering reference would raise AttributeError at runtime.


Summary

The core idea is good, but the "global" cache is not actually global -- each model creates its own instance, and invalidation on voice delete only reaches the serving layer's instance. This is the primary blocker. Please either make the cache a true singleton/shared instance, or add a mechanism to propagate invalidation to all model-level caches. The other issues (memory bounds, Voxtral breaking change, duplicated VoxCPM2 branches) should also be addressed.

@lishunyang12 lishunyang12 dismissed their stale review April 16, 2026 14:56

Replacing with inline comments

@JuanPZuluaga
Contributor Author

Thanks for the nice review @lishunyang12

#1 Stale cache on re-upload (regression): made VoiceEmbeddingCache a true singleton accessed via get_voice_cache(); all 5 model backends and the serving layer share one instance, and keys are namespaced as {model_type}|{voice_name} so clear(voice_name) on delete reaches every model slot.

#2 Aggregate memory unbounded: fixed as well. There is now a single global cache with both max_entries (default 1024) and max_bytes (default 512 MiB) eviction, configurable via VOICE_CACHE_MAX_ENTRIES / VOICE_CACHE_MAX_BYTES.

Q: should we keep only one of the two limits?

#3 _resolve_uploaded_voice mutation: refactored to a pure function returning (error, ref_audio, ref_text); the 3 call sites apply the values explicitly.

#4 Fragile prefix matching in clear(): it now uses exact second-segment comparison after split("|", 1), and voice names containing | are rejected at upload time.

#5 Duplicated VoxCPM2 branches: actually, there is only one VoxCPM2 branch; the second one is OmniVoice in the diffusion path.

#6 Voxtral ref_audio removal: restored. _build_voxtral_prompt now accepts both a preset voice and inline ref_audio, and the "not yet released" notes were removed from the two Voxtral docs.

#7 /tmp/voice_samples default: the default is now ~/.cache/vllm-omni/voices (survives reboots), configurable via SPEECH_VOICE_SAMPLES, with a 1000-voice cap via SPEECH_MAX_UPLOADED_VOICES.

Other changes: stats() now computes everything under a single with self._lock block; tests updated; no remaining references to _cosyvoice3_tokenizer.

Collaborator

@hsliuustc0106 hsliuustc0106 left a comment


Code Review: Global Speaker Cache Manager for Voice Cloning

CI Note: pre-commit fails (ruff check — import ordering in tests/conftest.py). Run pre-commit run --all-files locally to fix.


Critical

1. Loss of stale-cache protection (created_at) — regression risk

The old VoiceEmbeddingCache.make_cache_key() included created_at in the key so that deleting and re-uploading a voice with the same name but different audio would not hit stale cached embeddings. The new SpeakerEmbeddingCache.make_cache_key(speaker_name, model_type) drops this entirely.

The old tests explicitly tested this (test_stale_cache_on_reupload, test_stale_cache_protection, test_created_at_zero_disables_cache) — all removed in this PR. While delete_voice() now calls self._speaker_cache.clear(voice_name_lower), there's a gap: if a user deletes and re-uploads a voice between two concurrent requests, a race condition can serve the OLD cached artifacts for the NEW upload.

Recommendation: Either (a) add created_at back into the cache key, or (b) use a version counter atomically incremented on re-upload, or (c) document the known race and the trade-off.
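A sketch of option (a): fold created_at into the key so a re-uploaded voice can never hit entries cached for its deleted predecessor. The signature here is an assumption for illustration.

```python
def make_cache_key(
    speaker_name: str, model_type: str, created_at: float = 0
) -> tuple[str, str, int]:
    # created_at distinguishes a re-uploaded voice from a deleted one with the
    # same name, so stale artifacts keyed on the old timestamp are unreachable.
    return (model_type, speaker_name, int(created_at))


old = make_cache_key("alice", "fish_speech", created_at=1700000000)
new = make_cache_key("alice", "fish_speech", created_at=1700000042)
assert old != new  # re-upload gets a fresh key; old cached entries go cold
```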


Warnings

2. _get_uploaded_audio_data() re-encodes to WAV on every cache miss

_get_uploaded_audio_data() reads the safetensors, decodes to numpy, then re-encodes as WAV via sf.write(buf, ...) just to get a base64 data URL. Previously the raw audio bytes were stored and base64-encoded directly — much cheaper. Consider caching the data URL string or benchmarking the overhead for large audio files.
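A sketch of the suggested memoization. Here `encode_wav` stands in for the safetensors-decode plus sf.write path; the function name and the `_ref_audio_data_url` slot are hypothetical.

```python
import base64
from typing import Callable


def get_uploaded_audio_data_url(info: dict, encode_wav: Callable[[], bytes]) -> str:
    """Memoize the base64 WAV data URL on the speaker's info dict.

    The first call pays the decode/re-encode cost; later calls return the
    cached string. Illustrative only; not the PR's actual implementation.
    """
    url = info.get("_ref_audio_data_url")
    if url is None:
        wav_bytes = encode_wav()  # expensive: decode safetensors, re-encode WAV
        url = "data:audio/wav;base64," + base64.b64encode(wav_bytes).decode()
        info["_ref_audio_data_url"] = url
    return url
```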

3. Voice name validation is inconsistent across paths

_resolve_uploaded_speaker() returns an error string but doesn't reject unknown voices for non-CosyVoice3/Fish/OmniVoice models. Meanwhile _prepare_speech_generation() for voxcpm2 raises directly, and _create_diffusion_speech() returns a 400 Response. Three different error-handling patterns for the same logical validation. Recommend extracting a _validate_voice_name() helper.

4. CosyVoice3 dynamic token length change mixed into cache refactoring

The change from character-based (len(request.input)) to token-based (extract_text_token(...)) is a behavioral change that could significantly affect generated audio length for multilingual text. This should ideally be a separate PR. At minimum, add a test verifying token count differs from char count for CJK text.

5. _resolve_uploaded_speaker() called with near-duplicate code in 3 places

Called in _prepare_speech_generation() for fish_tts/cosyvoice3, again for omnivoice, and again in _create_diffusion_speech(). Each call site manually applies results. Consider having the method mutate the request directly (with clear docs) or use a shared helper.

6. f-string in logger.warning/error calls (multiple locations)

e.g., logger.warning(f"Failed to delete audio file for '{name}': {e}") — should use lazy %s formatting.
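For example (logger name and message are illustrative):

```python
import logging

logger = logging.getLogger("speaker_cache_demo")

name, err = "alice", OSError("disk full")

# Eager: the f-string is formatted even when WARNING is filtered out.
#   logger.warning(f"Failed to delete audio file for '{name}': {err}")

# Lazy: formatting is deferred until a handler actually emits the record.
logger.warning("Failed to delete audio file for '%s': %s", name, err)
```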


Suggestions

7. The | delimiter for cache keys is fragile. Consider using tuple keys (model_type, speaker_name) — no delimiter collision risk.

8. PR description checklist is all unchecked. The PR does add tests and docs, but the description should be filled in.

9. 42 commits is excessive. Consider squashing into 5-10 logical commits.

10. for_diffusion() sets _is_tts = False / _is_fish_speech = False but no _tts_stage. Could lead to AttributeError on diffusion-only instances.


Looks Good

  • Singleton pattern with double-checked locking is correct
  • fresh_speaker_cache fixture for test isolation is well-designed
  • Thread safety properly handled
  • Byte-budget + entry-count dual eviction strategy is sensible
  • Safetensors metadata round-trip for persistence is clean
  • Voice name | validation prevents cache key collisions
  • clear(speaker_name) cross-model-type invalidation solves the real stale-entry bug
  • Good test coverage
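The double-checked-locking singleton pattern noted above, in miniature. A plain dict stands in for the real cache class, and the module-level names are hypothetical; the actual details in vllm_omni/utils/speaker_cache.py may differ.

```python
import threading

_speaker_cache = None
_singleton_lock = threading.Lock()


def get_speaker_cache():
    """Return the process-wide cache, creating it at most once."""
    global _speaker_cache
    if _speaker_cache is None:            # first check: lock-free fast path
        with _singleton_lock:
            if _speaker_cache is None:    # second check: under the lock
                _speaker_cache = {}       # stand-in for SpeakerEmbeddingCache()
    return _speaker_cache
```

The second check matters: two threads can both pass the first check before either holds the lock, and without it the loser would overwrite the winner's instance.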

Reviewed by Hermes Agent

Comment thread on vllm_omni/utils/speaker_cache.py (Outdated):
return f"{model_type}|{speaker_name}"

def get(self, key: str) -> dict[str, Any] | None:
"""Return cached artifacts on hit. Promotes to MRU."""
Collaborator


🔴 Critical: The old make_cache_key() included created_at to prevent stale cache hits after a voice is deleted and re-uploaded with the same name but different audio. That component is removed here. While delete_voice() now calls clear() on the cache, there is a race window between delete and re-upload in concurrent scenarios.

Consider adding created_at (or a version counter) back into the cache key.

@JuanPZuluaga JuanPZuluaga force-pushed the feat/general-speaker-cache-manager branch from 1bee529 to 9571741 Compare April 17, 2026 13:08
@JuanPZuluaga
Contributor Author

JuanPZuluaga commented Apr 17, 2026

@hsliuustc0106 thanks for the review:

some replies to your comments:

  1. created_at in cache key is fixed. make_cache_key(speaker_name, model_type, created_at=0) now returns (model_type, speaker_name, int(created_at)). All 5 model backends pass created_at=int(info_dict.get("voice_created_at") or 0).

  2. WAV re-encode on every request is fixed. _get_uploaded_audio_data() now memoizes the base64 data URL under uploaded_speakers[name]["_ref_audio_data_url"]; first request pays the encode cost, subsequent requests return the cached string.

  3. Inconsistent voice-name validation: I agree it is a bit messy, but the three paths (fish/cosyvoice3, voxcpm2, diffusion) have different request shapes and error contracts. Refactoring to a shared _validate_voice_name() belongs in a follow-up to avoid bloating this PR.

  4. CosyVoice3 extract_text_token: this is already in main (pre-existing). No behavioral change introduced here.

  5. _resolve_uploaded_speaker duplication: same as point 3 above; the three call sites consume the result differently. I can address this in a follow-up PR.

  6. f-string loggers: fixed now; all 10 occurrences converted to lazy %s formatting.

  7. | delimiter vs. tuple keys: fixed. Keys are now tuple[str, str, int]. The | name-validation check and the related doc line were removed since the collision risk is gone.

  8. PR description — Updated. Checklist filled, summary reflects final scope.

  9. Squashed the commits down to a few logical ones.

  10. for_diffusion() missing _tts_stage: fixed now.

@linyueqian added the "ready label to trigger buildkite CI" label on Apr 21, 2026
Collaborator

@linyueqian linyueqian left a comment


LGTM. Re-reviewed at 20e90d7 and the blockers from the earlier rounds are resolved:

  • 🟢 True singleton via get_speaker_cache() in vllm_omni/utils/speaker_cache.py, used by serving_speech.py and all five model paths (Qwen3-TTS, Fish Speech, CosyVoice3, VoxCPM2, OmniVoice). test_singleton_shared_across_call_sites locks it in.
  • 🟢 Byte budget: single 512 MiB cap, LRU eviction on put(), oversize entries skipped. Covered by test_byte_budget_evicts / test_oversize_entry_skipped.
  • 🟢 Stale-cache protection: tuple key (model_type, speaker_name, created_at) + clear(speaker_name) scanning by position k[1] invalidates every model-type slot on delete. test_stale_cache_protection_delete_then_reupload and test_clear_matches_speaker_across_model_types cover both axes.
  • 🟢 Voxtral inline ref_audio restored in _build_voxtral_prompt.
  • 🟢 Duplicated VoxCPM2 branch collapsed; _apply_uploaded_speaker consolidates the three prior call sites with consistent raise ValueError(err) handling.
  • 🟢 Safetensors round-trip via _speaker_metadata_to_header / _speaker_metadata_from_header has unit coverage for ints, strings, None-stripping, malformed ints, and re-injected file_path.

Non-blocking nits for a follow-up if you feel like it:

  • 🟢 [nit] _apply_uploaded_speaker still mutates the request in place. Idempotency guard makes it safe, but a name like _apply_uploaded_speaker_in_place would advertise the side effect.
  • 🟢 [nit] For CosyVoice3 / Fish Speech, uploaded audio round-trips samples → WAV-base64 data URL → numpy. Memoized at the data-URL level so impact is bounded, but a direct _load_uploaded_audio shortcut that skips the re-encode would be cleaner; perf only.
  • 🟢 [nit] shutdown() calls self._speaker_cache.clear() which resets singleton hit/miss counters along with entries. Only matters if the serving instance is ever re-created in the same process.
  • 🟢 [nit] _estimate_tensor_bytes ignoring non-tensor metadata is intentional (big tensors dominate the budget) but worth a one-line comment for future readers.
  • 🟢 [nit] PR description checklist still unchecked (benchmark re-run).

Nice consolidation overall.

@lishunyang12
Collaborator

Resolve conflict and fix CI.

@JuanPZuluaga
Contributor Author

Thanks for the comments, everyone! It's been a long journey with this PR :) Please let me know if you'd like anything else added.
