[Voxtral TTS] Fix Voxtral TTS input with text and ref_audio #2750
Signed-off-by: Chen-Yo Sun <chenyo.sun@mistral.ai>
PR #2750 - [Voxtral TTS] Fix input with text and ref_audio
OVERALL: NO BLOCKERS
Correctness: PASS, Reliability: PASS, Breaking: PASS, Tests: PASS, Docs: PASS, Security: PASS
Summary: Bugfix that overrides _apply_hf_processor_mm_only to use dummy text for mm_input. 23 lines. Gates pass, tests pass. No blockers.
…ject#2750) Signed-off-by: Chen-Yo Sun <chenyo.sun@mistral.ai>
Hi @y123456y78, can you please tell me which open-source Voxtral TTS model you are using? I could not find voice cloning support on the official HF model page. Community discussions say the voice cloning weights have not been released: https://huggingface.co/mistralai/Voxtral-4B-TTS-2603/discussions/17
Purpose
VoxtralTTSMultiModalProcessor doesn't work with HF _apply_hf_processor_mm_only directly (the class does not inherit from Transformers ProcessorMixin, since it uses the Mistral tokenizer to handle preprocessing).
Test Plan
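The idea behind the override described under Purpose can be sketched without vLLM or the Mistral tokenizer. Everything below is a hypothetical stand-in, not the code from this PR: `FakePreprocessor`, `BaseProcessorSketch`, `TTSProcessorSketch`, `DUMMY_TEXT`, and the simplified one-argument signature of `_apply_hf_processor_mm_only` are all assumptions made for illustration. The sketch shows a processor whose preprocessing always needs text, so its multimodal-only path supplies a dummy prompt and reuses the text-plus-audio path.

```python
from typing import Any, Dict, List

# Hypothetical placeholder prompt; the real dummy text is not shown in this PR page.
DUMMY_TEXT = "dummy"


class FakePreprocessor:
    """Stand-in for a tokenizer-driven preprocessor that needs text AND audio."""

    def encode(self, text: str, ref_audio: List[float]) -> Dict[str, Any]:
        if not text:
            raise ValueError("text is required by this preprocessor")
        return {
            "input_ids": [ord(ch) for ch in text],  # toy tokenization
            "audio_features": list(ref_audio),      # toy audio features
        }


class BaseProcessorSketch:
    """Stand-in base class whose mm-only path assumes an HF-style processor."""

    def _apply_hf_processor_mm_only(self, mm_items: Dict[str, Any]) -> Dict[str, Any]:
        raise NotImplementedError("requires a ProcessorMixin-style processor")


class TTSProcessorSketch(BaseProcessorSketch):
    def __init__(self) -> None:
        self.preprocessor = FakePreprocessor()

    def _apply_text_mm(self, text: str, mm_items: Dict[str, Any]) -> Dict[str, Any]:
        # Joint text + reference-audio path.
        return self.preprocessor.encode(text, mm_items["ref_audio"])

    def _apply_hf_processor_mm_only(self, mm_items: Dict[str, Any]) -> Dict[str, Any]:
        # Override: pair the multimodal input with dummy text, then keep
        # only the multimodal outputs and discard the dummy token ids.
        processed = self._apply_text_mm(DUMMY_TEXT, mm_items)
        processed.pop("input_ids")
        return processed


processor = TTSProcessorSketch()
mm_only = processor._apply_hf_processor_mm_only({"ref_audio": [0.1, 0.2]})
both = processor._apply_text_mm("hello", {"ref_audio": [0.1, 0.2]})
```

In this toy version, `mm_only` carries just the audio features, while `both` carries token ids and audio features together; the real processor's inputs and outputs may differ.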
Test Result