14 Jan 03:27

Jackwaterveg

PaddleSpeech r0.1.1

New Features

CLI :

Add cli stats. #1274
Add unit test. #1321
ASR: Support English: Add transformer_librispeech model. #1297
ASR: Support 4 decoding methods: ctc_greedy_search, ctc_beam_search, attention, attention_rescoring. #1297
ASR & ST: Use the unified config. #1305 / #1312
ASR: Refactor the code. #1260 by @AdamBear
TTS: Support long input text by default. #1241
TTS: Add Style MelGAN and HiFiGAN. #1241
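
Of the four decoding methods listed above, ctc_greedy_search is the simplest to illustrate. The following is a minimal sketch of the general CTC greedy technique, not PaddleSpeech's implementation: the function signature, the plain-list input format, and the choice of blank_id = 0 are assumptions made for this example. It takes the best token per frame, collapses consecutive repeats, and drops blanks.

```python
from typing import List, Optional, Sequence


def ctc_greedy_search(
    frame_scores: Sequence[Sequence[float]], blank_id: int = 0
) -> List[int]:
    """Greedy CTC decoding over per-frame token scores.

    frame_scores[t][v] is the score of token v at frame t.
    blank_id is the index of the CTC blank symbol (assumed to be 0 here).
    """
    decoded: List[int] = []
    previous: Optional[int] = None
    for scores in frame_scores:
        # Pick the highest-scoring token for this frame.
        best = max(range(len(scores)), key=scores.__getitem__)
        # Emit only when the token differs from the previous frame
        # and is not the blank symbol.
        if best != previous and best != blank_id:
            decoded.append(best)
        previous = best
    return decoded


if __name__ == "__main__":
    # Toy scores over a 3-token vocabulary: 0 = blank, 1 = "a", 2 = "b".
    toy = [
        [0.1, 0.8, 0.1],    # a
        [0.2, 0.7, 0.1],    # a again (repeat, collapsed)
        [0.9, 0.05, 0.05],  # blank
        [0.1, 0.8, 0.1],    # a (new emission after the blank)
        [0.1, 0.2, 0.7],    # b
    ]
    print(ctc_greedy_search(toy))  # [1, 1, 2]
```

The blank between the second and fourth frames is what lets the same token be emitted twice. ctc_beam_search keeps several candidate prefixes per frame instead of one, and attention_rescoring re-scores CTC candidates with the attention decoder.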

ASR

Refactor configs in examples. #1225

TTS

Fix some frontend bugs. #1262 by @JiehangXie / #1310
Add speaker embedding and speaker id for style fastspeech2 inference. #1197 by @jerryuhoo
Add support for finetuning speedyspeech. #1302 by @jerryuhoo / #1322 / #1337
Update VCTK Parallel WaveGAN. #1294
Update Multi Band MelGAN. #1272

ST

Refactor configs in examples. #1225

Text

Refactor Punctuation Restoration example. #1215

Docs

Add topic note for releasing Python packages.
Add TTS papers. #1330
Add Frontend G2P topic. #1254

Others

Update released models and results. #1306

Acknowledgements

@zh794390558 @yt605155624 @Jackwaterveg @KPatr1ck @Mingxue-Xu @JiehangXie @grasswolfs @jerryuhoo @AdamBear @LittleChenCc @JamesLim-sy

Contributors

AdamBear, zh794390558, and 9 other contributors

23 Dec 07:58

Jackwaterveg

PaddleSpeech r0.1.0

Features

CLI : New Feature

Easy installation with pip: pip install paddlespeech
CLI to quickly explore ASR, TTS, audio classification, speech translation, and punctuation restoration.
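
As a rough illustration of driving the CLI from a script, the sketch below builds an ASR command line and runs it only if the paddlespeech executable is on PATH. The asr subcommand and the --input and --lang flags are assumed from the project README rather than from these notes, the helper names build_asr_command and run_asr are hypothetical, and the exact text printed by the CLI is not guaranteed.

```python
import shutil
import subprocess
from typing import List, Optional


def build_asr_command(audio_path: str, lang: Optional[str] = None) -> List[str]:
    """Build the argument list for a `paddlespeech asr` call.

    The subcommand and flag names are assumptions taken from the README.
    """
    cmd = ["paddlespeech", "asr", "--input", audio_path]
    if lang is not None:
        cmd += ["--lang", lang]
    return cmd


def run_asr(audio_path: str, lang: Optional[str] = None) -> Optional[str]:
    """Run the CLI and return its stdout, or None if it is not installed."""
    if shutil.which("paddlespeech") is None:
        return None
    result = subprocess.run(
        build_asr_command(audio_path, lang),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # "zh.wav" is a placeholder path for a local audio file.
    print(build_asr_command("zh.wav", lang="zh"))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.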

ASR

Join CTC LM decoder
- paper link
Transformer LM model
Improve DeepSpeech2 online model
Refactor some configs

TTS

Merge Parakeet into PaddleSpeech
Add FastSpeech2-Conformer
- paper link: fastspeech2, conformer
- example link
Add Multi Band MelGAN
- paper link
- example link
Add HiFiGAN
- paper link
- example link
Add Style MelGAN
- paper link
- example link
Add FastSpeech2 Voice Cloning with GE2E (SV2TTS)
- paper link
- example link

CLS

Add audio classification example on ESC-50 and custom dataset.
Add audio tagging demo based on PANNs and Audioset labels.

ST

ST-MTL
FAT-ST-MTL

Docs

Add quick start
Add Read the Docs documentation
Improve installation documentation
Add README for each example

Demos

Audio_tagging
Automatic_video_subtitiles
Metaverse
Punctuation_restoration
Speech_recognition
Speech_translation
Story_talker
Style_fs2
Text_to_speech

Others

Update released models and results

Acknowledgements

@zh794390558 @KPatr1ck @Jackwaterveg @yt605155624 @Mingxue-Xu @grasswolfs @jerryuhoo

Contributors

zh794390558, KPatr1ck, and 5 other contributors

16 Aug 03:20

zh794390558

DeepSpeech v2.1.1

ctc alignment
refactor data pipeline
autolog for deepspeech test
refactor checkpoint save/load
deepspeech online model
mfa alignment example
add text normalization example
TLG for aishell
more datasets: thchs30, aidatatang, timit, etc.
8k speech example
ted en-zh st example
more utils

29 Jun 12:24

zh794390558

DeepSpeech v2.1.0

Transformer/Conformer Offline/Online ASR
Unified CTC Loss for DS2 model and Transformer Model

25 Feb 03:42

zh794390558

DeepSpeech v1.1.0

paddle 1.8.x with python2

25 Feb 03:41

zh794390558

DeepSpeech v1.0.0

latest code from master