-
Notifications
You must be signed in to change notification settings - Fork 0
pl-ghost/speech-transformer-pytorch_lightning
Folders and files
| Name | Name | Last commit message | Last commit date | |
|---|---|---|---|---|
Repository files navigation
End2End chinese-english code-switch speech recognition in pytorch
## This is a mixed project borrowing from many awesome projects opened recently.
With pytorch-lightning, experiments can be carried out easily.
And i will try to make evey calculation in a batched and cleaned way.
(such as add bos & eos into batched target and spec augment) Any ideas can be put into the issues,
and welcome for discussion. (This project is still being building and reorganizing)
project features:
joint attention & ctc beam search decode with rnn lm
multi dataset
using pytorch lightning for 16bit training
Chinese-char level & English-word level tokenizer
sentence piece tokenizer for english tokenizing
rnn_lm training
label smoothing
customized transformer encoder and decoder see: src/model/modules/transformer_encoder...
*rezero transformer for some converge problem with half precision and speed consideration
feature:
log fbank with sub sample
speed augment
a spec augment using gpu as a layer in model
customized feature filtering , see src/loader/utils/build_fbank remove_empty_line_2d
optimizer:
Ranger
model:
rezero transformer
restricted encoder field
better mask (may be a little slower than other project but effective)
loss:
lambda * ce loss + (1-lambda * ctc loss) + code switch loss
requirement:
see docker/
references:
https://github.com/ZhengkunTian/OpenTransformer
https://github.com/espnet/espnet
https://github.com/jadore801120/attention-is-all-you-need-pytorch
https://github.com/alphadl/lookahead.pytorch
https://github.com/LiyuanLucasLiu/RAdam
https://github.com/vahidk/tfrecord
https://github.com/kaituoxu/Speech-Transformer
https://github.com/majumderb/rezero
https://github.com/lessw2020/Ranger-Deep-Learning-Optimizer
data:
aishell1 170h
aishell2 1000h
magic data 750h
prime 100h not used
stcmd 100h not used
datatang 200h
datatang 500h
datatang mix 200h
librispeech 960h
train step
english -> eng(sub) + mix + chinese -> chinese + mix -> mixAbout
ASR project with pytorch-lightning
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published
Languages
- Python 100.0%