Releases: PaddlePaddle/PaddleSpeech
PaddleSpeech r0.1.1
New Features
CLI:
- Add cli stats. #1274
- Add unit test. #1321
- ASR: Support English: Add transformer_librispeech model. #1297
- ASR: Support 4 decoding methods: ctc_greedy_search, ctc_beam_search, attention, attention_rescoring. #1297
- ASR & ST: Use the unified config. #1305 / #1312
- ASR: Refactor the code. #1260 by @AdamBear
- TTS: Support long input text by default. #1241
- TTS: Add Style MelGAN and HiFiGAN. #1241
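Of the four ASR decoding methods listed above, ctc_greedy_search is the simplest to illustrate. The following is a minimal, generic sketch of CTC greedy decoding, not PaddleSpeech's actual implementation: the function name `ctc_greedy_decode`, the per-frame score layout, and the choice of `blank_id = 0` are assumptions made for illustration only.

```python
from itertools import groupby
from typing import List, Sequence


def ctc_greedy_decode(
    frame_scores: Sequence[Sequence[float]], blank_id: int = 0
) -> List[int]:
    """Hypothetical helper: greedy (best-path) CTC decoding.

    frame_scores holds one row of per-token scores for each frame.
    blank_id is assumed to be 0 here; real models may use another index.
    """
    # Step 1: pick the highest-scoring token id for every frame.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    # Step 2: merge consecutive repeats of the same token id.
    collapsed = [token for token, _ in groupby(best_path)]
    # Step 3: drop the blank symbol.
    return [token for token in collapsed if token != blank_id]


# Toy input: 5 frames over a 3-token vocabulary (id 0 is the blank).
frames = [
    [0.1, 0.8, 0.1],
    [0.2, 0.7, 0.1],
    [0.9, 0.05, 0.05],
    [0.1, 0.6, 0.3],
    [0.1, 0.2, 0.7],
]
decoded = ctc_greedy_decode(frames)
print(decoded)  # [1, 1, 2]
```

The blank frame between the two runs of token 1 is what keeps them from being merged into a single token. The beam-search and attention-based methods trade this simplicity for accuracy by considering more than one hypothesis.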
ASR
- Refactor configs in examples. #1225
TTS
- Fix some frontend bugs. #1262 by @JiehangXie / #1310
- Add speaker embedding and speaker id for style fastspeech2 inference. #1197 by @jerryuhoo
- Add support for finetuning speedyspeech. #1302 by @jerryuhoo / #1322 / #1337
- Update VCTK Parallel WaveGAN. #1294
- Update Multi Band MelGAN. #1272
ST
- Refactor configs in examples. #1225
Text
- Refactor Punctuation Restoration example. #1215
Docs
Others
- Update released models and results. #1306
Acknowledgements
@zh794390558 @yt605155624 @Jackwaterveg @KPatr1ck @Mingxue-Xu @JiehangXie @grasswolfs @jerryuhoo @AdamBear @LittleChenCc @JamesLim-sy
PaddleSpeech r0.1.0
Features
CLI: New Feature
- Easy installation via pip:
pip install paddlespeech
- CLI to quickly explore ASR, TTS, audio classification, speech translation, and punctuation restoration.
ASR
- Join CTC LM decoder
- Transformer LM model
- Improve DeepSpeech2 online model
- Refactor some configs
TTS
- Merge Parakeet into PaddleSpeech
- Add FastSpeech2-Conformer
- paper links: fastspeech2, conformer
- example link
- Add Multi Band MelGAN
- Add HiFiGAN
- Add Style MelGAN
- Add FastSpeech2 Voice Cloning with GE2E (SV2TTS)
CLS
- Add audio classification example on ESC-50 and custom dataset.
- Add audio tagging demo based on PANNs and Audioset labels.
ST
- ST-MTL
- FAT-ST-MTL
Docs
- Add quick start
- Add Read the Docs documentation
- Improve installation documentation
- Add README for each example
Demos
- Audio_tagging
- Automatic_video_subtitiles
- Metaverse
- Punctuation_restoration
- Speech_recognition
- Speech_translation
- Story_talker
- Style_fs2
- Text_to_speech
Others
- Update released models and results
Acknowledgements
@zh794390558 @KPatr1ck @Jackwaterveg @yt605155624 @Mingxue-Xu @grasswolfs @jerryuhoo
DeepSpeech v2.1.1
- ctc alignment
- refactor data pipeline
- autolog for deepspeech test
- refactor checkpoint save/load
- deepspeech online model
- mfa alignment example
- add text normalization example
- TLG for aishell
- more datasets: thchs30, aidatatang, timit, etc.
- 8k speech example
- ted en-zh st example
- more utils
DeepSpeech v2.1.0
- Transformer/Conformer Offline/Online ASR
- Unified CTC Loss for DS2 model and Transformer Model
DeepSpeech v1.1.0
paddle 1.8.x with python2
DeepSpeech v1.0.0
master latest code