Implements response_format to the /v1/audio/speech endpoint #2732

SuperPat45 · 2024-07-05T23:38:04Z

The LocalAI /v1/audio/speech endpoint returns a wav audio file.

OpenAI supports all these formats: mp3 (which is the default), opus, aac, flac, wav, and pcm.
To set the desired format, OpenAI has a response_format request parameter for /v1/audio/speech:
https://platform.openai.com/docs/api-reference/audio/createSpeech
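For illustration, here is a minimal sketch of the request body a client would send to select a format. The field names (model, input, voice, response_format) follow the OpenAI reference linked above; the helper name `build_speech_request` and the model and voice values in the usage line are only placeholders, not something LocalAI ships.

```python
import json

# Formats listed above as supported by OpenAI for /v1/audio/speech.
SUPPORTED_FORMATS = ("mp3", "opus", "aac", "flac", "wav", "pcm")


def build_speech_request(text, model, voice, response_format="mp3"):
    """Return the JSON body for a POST to /v1/audio/speech.

    Hypothetical helper: it only builds and validates the payload,
    it does not send anything over the network.
    """
    if response_format not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    return json.dumps(
        {
            "model": model,
            "input": text,
            "voice": voice,
            "response_format": response_format,
        }
    )


# Placeholder model and voice names, chosen only for the example.
body = build_speech_request("Hello from LocalAI", "tts-1", "alloy", "opus")
```

The resulting string can be posted with any HTTP client and a `Content-Type: application/json` header.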

The mp3 and opus formats produce smaller files than wav and are better suited to websites.
So, when ffmpeg is available, could LocalAI add support for the response_format parameter and automatically convert the generated wav audio file to the desired format?
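To make the idea concrete, here is a rough sketch of the conversion step I have in mind, written in Python for brevity. It is not LocalAI code: the function names `ffmpeg_command` and `convert` are hypothetical, and the raw 16-bit little-endian output chosen for pcm is an assumption. It relies on ffmpeg picking the encoder from the output file extension and falls back to the original wav file when ffmpeg is not installed.

```python
import shutil
import subprocess
from pathlib import Path

# Output file extension for each response_format value (assumed mapping).
EXTENSIONS = {
    "mp3": ".mp3",
    "opus": ".opus",
    "aac": ".aac",
    "flac": ".flac",
    "wav": ".wav",
    "pcm": ".pcm",
}


def ffmpeg_command(wav_path, response_format):
    """Build the ffmpeg argument list and the destination path."""
    if response_format not in EXTENSIONS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    dst = Path(wav_path).with_suffix(EXTENSIONS[response_format])
    cmd = ["ffmpeg", "-y", "-i", str(wav_path)]
    if response_format == "pcm":
        # Raw output has no container, so the sample format must be explicit.
        cmd += ["-f", "s16le", "-acodec", "pcm_s16le"]
    cmd.append(str(dst))
    return cmd, dst


def convert(wav_path, response_format):
    """Convert the generated wav file, or return it unchanged.

    The wav file is returned as-is when wav is requested or when
    ffmpeg is not found on PATH.
    """
    if response_format == "wav" or shutil.which("ffmpeg") is None:
        return Path(wav_path)
    cmd, dst = ffmpeg_command(wav_path, response_format)
    subprocess.run(cmd, check=True, capture_output=True)
    return dst
```

Returning the wav file when ffmpeg is missing keeps the endpoint working on installations without it, at the cost of silently ignoring the requested format; returning an error instead would be the stricter choice.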

SuperPat45 added the enhancement (New feature or request) label Jul 5, 2024

n-Arno mentioned this issue Nov 2, 2024

feat(tts): Implement naive response_format for tts endpoint #4035

Merged

mudler closed this as completed in #4035 Nov 2, 2024
