Releases: erew123/alltalk_tts
DeepSpeed 14.2 versions for Linux
LINUX VERSION HERE (Not Windows)
Libaio Requirement
You need libaio-dev or libaio-devel (depending on your Linux flavour), otherwise the DeepSpeed installation will fail with an error at your terminal.
- Debian-based systems
sudo apt install libaio-dev
- RPM-based systems
sudo yum install libaio-devel
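To confirm the package is already present before compiling, a quick check along these lines can help (the package names below are the common ones; yours may differ by distro):

```shell
# Check for the libaio development package before compiling DeepSpeed.
if command -v dpkg >/dev/null 2>&1; then
    dpkg -s libaio-dev >/dev/null 2>&1 && echo "libaio-dev installed" || echo "libaio-dev missing"
elif command -v rpm >/dev/null 2>&1; then
    rpm -q libaio-devel >/dev/null 2>&1 && echo "libaio-devel installed" || echo "libaio-devel missing"
else
    echo "unknown package manager - check for libaio headers manually"
fi
```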
DeepSpeed setup/compilation
DeepSpeed is complicated at best. You have to install a DeepSpeed build that matches:
- Your Python version in your Python virtual environment, e.g. 3.10, 3.11, 3.12, etc.
- Your version of PyTorch within your Python virtual environment, e.g. 2.0.x, 2.1.x, 2.2.x, 2.3.x, etc.
- The version of CUDA that PyTorch was installed with in your Python virtual environment, e.g. 11.8 or 12.1.
If you change your version of Python, PyTorch, or the CUDA version PyTorch uses within that virtual environment, you will need to uninstall DeepSpeed with pip uninstall deepspeed
and then install the correct matching version.
To understand a filename such as deepspeed-0.14.2+cu121torch2.3-cp312-cp312-manylinux_2_24_x86_64.whl:
- deepspeed-0.14.2+ is the version of DeepSpeed.
- cu121 is the CUDA version of PyTorch; in this case cu121 means CUDA 12.1.
- torch2.3 is the version of PyTorch that it works with.
- cp312-cp312 is the version of Python it works with (CPython 3.12).
- manylinux_2_24_x86_64 states it's a Linux build.
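As a sketch, those tags can be pulled apart programmatically. The filename below is the example from above; nothing here inspects your system:

```python
# Sketch: split a DeepSpeed wheel filename into the version tags described above.
def parse_wheel_name(name: str) -> dict:
    stem = name.removesuffix(".whl")
    dist, build, py_tag, abi_tag, platform = stem.split("-")
    version, _, local = build.partition("+")   # "0.14.2", "cu121torch2.3"
    cuda, _, torch = local.partition("torch")  # "cu121", "2.3"
    return {
        "package": dist,
        "deepspeed": version,
        "cuda": cuda,          # cu121 -> CUDA 12.1
        "torch": torch,        # PyTorch 2.3
        "python": py_tag,      # cp312 -> CPython 3.12
        "platform": platform,  # manylinux = Linux
    }

info = parse_wheel_name(
    "deepspeed-0.14.2+cu121torch2.3-cp312-cp312-manylinux_2_24_x86_64.whl"
)
print(info)
```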
So you will start your Python virtual environment, then use something like AllTalk's diagnostics to find out which version of:
- Python is installed
- PyTorch is installed
- CUDA that PyTorch was installed with.
You will then hunt through the files below and download the correct file to your folder. Then, still loaded into your Python virtual environment, you will run pip install deepspeed-0.14.2+{version here}manylinux_2_24_x86_64.whl
where "{version here}" is the correct, matching version that you have downloaded.
To be clear, let's say your Python virtual environment is running Python 3.11.6 with PyTorch 2.2.1 with CUDA 12.1; you would want to download deepspeed-0.14.2+cu121torch2.2-cp311-cp311-manylinux_2_24_x86_64.whl
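That mapping can be sketched as a small helper that assembles the expected filename from your three versions (string assembly only; the manylinux tag is assumed fixed):

```python
# Sketch: build the expected wheel filename from your three versions,
# mirroring the worked example above.
def expected_wheel(deepspeed: str, cuda: str, torch: str, python: str) -> str:
    cu = "cu" + cuda.replace(".", "")            # "12.1"   -> "cu121"
    cp = "cp" + "".join(python.split(".")[:2])   # "3.11.6" -> "cp311"
    return (f"deepspeed-{deepspeed}+{cu}torch{torch}-"
            f"{cp}-{cp}-manylinux_2_24_x86_64.whl")

print(expected_wheel("0.14.2", "12.1", "2.2", "3.11.6"))
# deepspeed-0.14.2+cu121torch2.2-cp311-cp311-manylinux_2_24_x86_64.whl
```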
Note: You will need to install the Nvidia CUDA Development Toolkit. Version 12.1.0 has been tested with PyTorch 2.2.1 and confirmed to work, while version 12.4 has been tried and found to be problematic. In Conda Python virtual environments, you can start the virtual environment and install the toolkit using the following command: conda install nvidia/label/cuda-12.1.0::cuda-toolkit=12.1
To be absolutely clear, the Nvidia CUDA Development Toolkit is separate from:
- Your graphics card driver version.
- The version of CUDA used by your graphics driver.
- The version of PyTorch or Python on your system and their associated CUDA versions.
Think of the CUDA Development Toolkit like the engine diagnostics tools used by mechanics. These tools are necessary for the development, compilation and testing of CUDA applications (or in this case DeepSpeed). Just as a mechanic's tools are separate from the engine, car model, or the type of fuel the car uses, the CUDA Development Toolkit is separate from your graphics driver, the CUDA version your driver uses, and the versions of PyTorch or Python installed on your system. In other words, the versions do not all have to match exactly.
Also note, you will see this warning message when AllTalk starts up and DeepSpeed for Linux is installed. It is safe to ignore, as far as AllTalk is concerned.
AllTalk v1.9c
Quite a large update, in preparation for a more structured application and future possibilities.
- TTS Generator - Various interface bugs & filtering options cleaned up.
- TTS Generator - TTSDiff now scans generated text and TTS for errors.
- TTS Generator - TTSSRT now creates subtitle files for video production, e.g. a YouTube video.
- Finetune - Now uses a customised tokenizer to deal with Japanese.
- Finetune - Pre flight check and warning messages.
- Finetune - Extra documentation and warnings.
- Entire file structure has been re-organised to simplify management and future changes.
- Documentation (built in and Github) has been rewritten/tidied up.
- Requirements files have been cleaned up and simplified.
- ATsetup has been re-written as necessary with additional options.
- Diagnostics now performs some other checks.
- DeepSpeed moved up to version 14.
- Standalone Application moved to PyTorch 2.2.1.
- Nvidia CUDA Toolkit installation is NO LONGER needed (other than to compile DeepSpeed on Linux)
Tested on Linux and Windows.
65 changed files with 10,298 additions and 300 deletions.
If you download and use the ZIP file from here, it will NOT be linked to this Github repository and so CANNOT be automatically updated with a git pull in future.
DeepSpeed v14.0 for PyTorch 2.2.1 & Python 3.11
Before you install DeepSpeed, it's recommended you confirm AllTalk works without it.
This version has been built for PyTorch 2.2.1 and Python 3.11.x.
For CUDA v12.1 - WINDOWS - Download
For CUDA v11.8 - WINDOWS - Download
For CUDA v12.1 - LINUX - Download
For versions that support PyTorch 2.1.x please look at the main releases page
If you need to check your CUDA version within Text-generation-webui, run cmd_windows.bat and then: python --version to get the Python version, and pip show torch to get the CUDA version.
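The same information can be printed in one go from inside the environment; this sketch falls back gracefully if PyTorch is not installed:

```python
# Sketch: print the Python, PyTorch, and CUDA versions you need to match,
# from inside the active virtual environment.
import platform

print("Python:", platform.python_version())
try:
    import torch
    print("PyTorch:", torch.__version__)              # e.g. 2.2.1+cu121
    print("CUDA (torch build):", torch.version.cuda)  # e.g. 12.1
except ImportError:
    print("PyTorch: not installed in this environment")
```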
NOTE: You DO NOT need to set Text-generation-webUI's --deepspeed setting for AllTalk to be able to use DeepSpeed. These are two completely separate things and incorrectly setting that on Text-generation-webUI may cause other complications.
AllTalk v1.9
- Added Streaming endpoint API & Server Status API
- Added SillyTavern support
- Added ATSetup utility to help simplify Text-gen-webui and Standalone installations.
- Updated TTS API generation endpoint to correctly name _combined files
- Updated TTS API generation endpoint to correctly add a short uuid to timestamped files (non-narrator) to avoid dual file generation on the same tick.
- Cleaned up some console outputs.
- Additional documentation and documentation cleaning (along with the Github)
- Added cutlet and unidic-lite to help with Japanese support on non-Japanese enabled computers.
- Transformers requirements bumped to 4.37.1
- Kobold is also now supported thanks to @illtellyoulater
- Minor updates to Finetuning
- Minor updates to documentation
DeepSpeed v12.7 wheel file
Before you install DeepSpeed, it's recommended you confirm AllTalk works without it.
THESE ARE AS YET UNTESTED VERSIONS - DO NOT USE THESE
PLEASE USE DeepSpeed v11.2 here
Back to the DeepSpeed install instructions
Python 3.11.x and CUDA 12.1 COMPILED FOR PyTorch 2.1.x
DeepSpeed v12.7 for CUDA 12.1 and Python 3.11.x
If you are after DeepSpeed for CUDA 11.8 and/or Python 3.10.x please see DeepSpeed v11.2 here
If you need to check your CUDA version within Text-generation-webui, run cmd_windows.bat and then: python --version to get the Python version, and pip show torch to get the CUDA version.
NOTE: You DO NOT need to set Text-generation-webUI's --deepspeed setting for AllTalk to be able to use DeepSpeed. These are two completely separate things and incorrectly setting that on Text-generation-webUI may cause other complications.
AllTalk 1.8d
- Added the AllTalk TTS Generator, which is designed for creating TTS of any length from as large an amount of text as you want. You are able to individually edit/regenerate sections after all the TTS is produced and export out to a single WAV file. You can also stream TTS if you just want to play back text, or even push audio output to wherever AllTalk is currently running from at the terminal/command prompt.
- Added a greedy option to avoid apostrophes being removed, plus accentuated characters for foreign languages. (Thanks to @nicobubulle)
- Updated filtering to allow Hungarian ő and ű characters and Cyrillic characters to pass through correctly.
- MS Word Add-in: Added a proof-of-concept MS Word add-in to stream selected text to speech from within documents. This is purely POC and not production, hence support on this will be limited.
AllTalk 1.8c
AllTalk 1.8b
Tidied up the finetune interface.
Added multiple buttons to help with the finetune final stages.
Added a compaction routine into finetune to compact legacy finetuned models.
New API JSON return output_cache_url: the HTTP location for accessing the generated WAV file as a pushed download, with CORS support.
Updated documentation.
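As a sketch of how a client might consume the new field: the response dict below is a stand-in for a real API reply, and every field name except output_cache_url is illustrative rather than a documented schema.

```python
# Sketch: using the new output_cache_url field from a generation response.
# The dict is a stand-in; only output_cache_url comes from the notes above.
from urllib.parse import urlparse

response = {
    "status": "generate-success",                                        # illustrative
    "output_cache_url": "http://127.0.0.1:7851/audiocache/example.wav",  # illustrative URL
}

# The cache URL serves the WAV as a pushed download (with CORS support),
# so a browser or remote client can fetch it directly.
cache_url = response["output_cache_url"]
parsed = urlparse(cache_url)
print("Fetch generated audio from host:", parsed.netloc)
```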
NOTE: I probably will be doing more work on the Narrator function. So if you are using that, you may want to git pull an update to get the latest.
AllTalk 1.8a
- Adds 3x new API endpoints supporting Ready status, Voices (list all available voices), Preview voice (generate hard coded short voice preview).
- Playing of generated TTS via the API is now supported in the terminal/prompt where the script is running from.
- All documentation relevant to the above is updated.
- Adds a 4th model loader "XTTSv2 FT" for loading of finetuned models. The models have to be stored in /models/trainedmodel/ (which is the default location the finetune process will move a model to). On start-up, if a model is detected there, a 4th loader becomes available.
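A minimal client sketch for the new status and voices endpoints mentioned above; the base address and endpoint paths are assumptions based on AllTalk's defaults, so verify them against the built-in API documentation before relying on them:

```python
# Sketch of querying the new Ready and Voices endpoints.
# BASE and the paths are assumptions; check AllTalk's built-in docs.
import json
import urllib.request

BASE = "http://127.0.0.1:7851"

def get(path: str, timeout: float = 10.0) -> str:
    """Fetch a path from the AllTalk server and return the body as text."""
    with urllib.request.urlopen(BASE + path, timeout=timeout) as resp:
        return resp.read().decode()

# With the server running, the calls would look like:
# print(get("/api/ready"))               # readiness status
# print(json.loads(get("/api/voices")))  # list of available voices
```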
AllTalk 1.8
Finetuning has been made simpler at the final step (3x buttons now)
A compact.py script has been created for people who already have finetuned models.
Narrator function has been improved with its splitting, though there are still minor outlier situations to resolve.