Multi Engine Manager
AllTalk Multi Engine Manager (MEM) is an experimental research tool designed to manage and test multiple instances of different Text-to-Speech (TTS) engines simultaneously. It aims to provide a centralized system capable of handling multiple TTS requests concurrently, distributing the workload across various engine instances.
MEM is currently in a demonstration, testing, and experimental phase. It is not intended for production environments and comes with several limitations:
- There may be bugs and unforeseen issues.
- Not all scenarios or TTS engines have been thoroughly tested.
- Performance, reliability, and stability have not been extensively evaluated.
- No official support is being offered for MEM at this time.
MEM's key features include:
- Multi-engine management: Create and control multiple TTS engine instances.
- Queue system: Handle multiple TTS requests across loaded engine instances.
- API compatibility: Mimics a standalone AllTalk server for client requests.
- Customizable settings: Adjust ports, engine counts, and queue management parameters.
- Built-in load tester: Test system performance with multiple simultaneous requests.
- Real-time queue monitoring: View and manage the request queue through the Gradio interface.
- Built-in documentation: Detailed usage documentation is available within the Gradio interface.
Before using MEM, ensure that:
- AllTalk is installed and configured correctly.
- You have set up and tested your desired TTS engine(s) in the main AllTalk Gradio interface.
- You understand that MEM runs separately from the main AllTalk server (AllTalk itself does not need to be running).
To get MEM up and running:
- Update AllTalk to the latest version that includes MEM.
- Start the Python environment for AllTalk.
- Run MEM with the command:
python tts_mem.py
- Access the MEM interface through the provided Gradio link (default: http://127.0.0.1:7500).
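Once tts_mem.py is running, one way to confirm that the API server is accepting requests is to query the ready endpoint listed further below. A minimal sketch, assuming the default API port of 7851 on the local machine and the Python requests library:

```python
# Minimal reachability check for MEM's API server.
# Assumes the default API port of 7851 on the local machine.
import requests

try:
    response = requests.get("http://127.0.0.1:7851/api/ready", timeout=5)
    print("MEM API responded:", response.status_code, response.text)
except requests.RequestException as exc:
    print("MEM API not reachable yet:", exc)
```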
The MEM interface is organized into the following components:
- Engine Management: Start, stop, and monitor multiple TTS engine instances.
- API Server: Handles client requests on the default port 7851 (customizable).
- Queue System: Manages and distributes incoming TTS requests to available engines.
- Settings Panel: Customize MEM's behavior, including port numbers, engine counts, and queue parameters.
- Load Tester: Built-in tool for performance testing.
- Queue Monitor: Real-time view of the request queue and engine statuses.
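As a rough mental model only (this is not MEM's actual implementation; the class, ports, and timings below are invented for illustration), the Queue System's job of distributing incoming requests across available engines can be pictured as a shared queue feeding several engine workers:

```python
# Conceptual sketch of a shared request queue feeding several TTS engine
# instances. Purely illustrative -- not MEM's actual code; the class, ports,
# and timings are invented for this example.
import queue
import threading
import time

class FakeEngine:
    """Stand-in for one TTS engine instance listening on its own port."""
    def __init__(self, port):
        self.port = port

    def generate(self, text):
        time.sleep(0.2)  # pretend to synthesise audio
        return f"[engine:{self.port}] generated audio for: {text!r}"

def engine_worker(engine, requests_queue):
    """Each engine pulls the next pending request from the shared queue."""
    while True:
        text = requests_queue.get()
        if text is None:              # sentinel value -> shut the worker down
            requests_queue.task_done()
            break
        print(engine.generate(text))
        requests_queue.task_done()

requests_queue = queue.Queue()
engines = [FakeEngine(port) for port in (7852, 7853, 7854)]  # arbitrary ports
workers = [threading.Thread(target=engine_worker, args=(e, requests_queue))
           for e in engines]
for w in workers:
    w.start()

# Six requests arrive; whichever engine is free takes the next one.
for i in range(6):
    requests_queue.put(f"sentence number {i}")

requests_queue.join()                 # wait until all requests are processed
for _ in engines:
    requests_queue.put(None)          # stop the workers
for w in workers:
    w.join()
```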
The following areas remain untested or only partially understood:
- Streaming TTS has not been extensively tested.
- Not all TTS engines have been verified to work with MEM.
- The impact of multiple simultaneous requests on GPU/CPU resources is not fully understood.
- MEM's performance and stability in high-load scenarios have not been thoroughly evaluated.
MEM emulates a standalone AllTalk server and responds to the following endpoints:
- Ready Endpoint: http://{ipaddress}:{port}/api/ready
- Voices Endpoint: http://{ipaddress}:{port}/api/voices
- RVC Voices Endpoint: http://{ipaddress}:{port}/api/rvcvoices
- Current Settings Endpoint: http://{ipaddress}:{port}/api/currentsettings
- TTS Generation Endpoint: http://{ipaddress}:{port}/api/tts-generate
- OpenAI-Compatible Endpoint: http://{ipaddress}:{port}/v1/audio/speech
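As an illustration of how a client might talk to these endpoints, the sketch below lists the available voices and then submits a generation request. It assumes MEM is on the default API port of 7851; the form field names (text_input, character_voice_gen, language, output_file_name) follow AllTalk's usual tts-generate request format but should be checked against your installed version, and the voice name is only a placeholder.

```python
# Example client for MEM's AllTalk-compatible endpoints.
# Assumptions: default API port 7851, and AllTalk's usual form fields for
# /api/tts-generate (verify field names against your AllTalk version).
import requests

BASE_URL = "http://127.0.0.1:7851"

# 1. List the voices exposed by the loaded engine instances.
voices = requests.get(f"{BASE_URL}/api/voices", timeout=10).json()
print("Available voices:", voices)

# 2. Submit a TTS generation request (field names assumed, voice is a placeholder).
payload = {
    "text_input": "Hello from the Multi Engine Manager.",
    "character_voice_gen": "female_01.wav",   # placeholder voice name
    "language": "en",
    "output_file_name": "mem_test",
}
result = requests.post(f"{BASE_URL}/api/tts-generate", data=payload, timeout=60)
print(result.status_code, result.text)
```

The built-in load tester follows the same pattern, submitting several such requests at once so that MEM's queue can distribute them across the loaded engine instances.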
While MEM showcases the potential for multi-engine TTS management, its future development is uncertain. Possible areas for expansion include:
- Enhanced control over individual engine settings.
- Improved API management features.
- Extended compatibility with various TTS engines.
- More comprehensive testing and optimization for production use.
While official support is not provided, feedback and contributions to MEM are welcome. If you encounter issues or have suggestions, you may submit them through the AllTalk GitHub repository. However, please note that responses or updates may be limited due to the experimental nature of this tool.
Remember: MEM is a research demonstration and should be used with caution in any non-testing environment.