Clone a voice in 5 seconds to generate arbitrary speech in real-time
-
Updated
Aug 14, 2024 - Python
Clone a voice in 5 seconds to generate arbitrary speech in real-time
🤯 Lobe Chat - an open-source, modern-design AI chat framework. Supports Multi AI Providers( OpenAI / Claude 3 / Gemini / Ollama / Azure / DeepSeek), Knowledge Base (file upload / knowledge management / RAG ), Multi-Modals (Vision/TTS) and plugin system. One-click FREE deployment of your private ChatGPT/ Claude application.
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
🚀AI拟声: 5秒内克隆您的声音并生成任意语音内容 Clone a voice in 5 seconds to generate arbitrary speech in real-time
A generative speech model for daily dialogue.
Instant voice cloning by MIT and MyShell.
🤖 The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more models architectures. Features: Generate Text, Audio, Video, Images, Voice Cloning, Distributed, P2P inference
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io/vallex/
A sound cloning tool with a web interface, using your voice or any sound to record audio / 一个带web界面的声音克隆工具,使用你的音色或任意声音来录制音频
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
A fast, local neural text to speech system
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
Add a description, image, and links to the tts topic page so that developers can more easily learn about it.
To associate your repository with the tts topic, visit your repo's landing page and select "manage topics."