add hlg decode #1521
Conversation
wenet/transformer/asr_model.py
Outdated
@@ -20,6 +20,10 @@
 from torch.nn.utils.rnn import pad_sequence
+import k2
Move it into hlg_onebest & hlg_rescore, so we can run it without k2 & icefall.
To avoid duplicate code, we moved these imports into a try/except block.
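A minimal sketch of the guarded-import pattern being described (the function signature and flag name are illustrative, not the exact code in this PR):

# k2 stays an optional dependency; HLG decoding fails with a clear
# message only when it is actually used.
try:
    import k2
    K2_AVAILABLE = True
except ImportError:
    K2_AVAILABLE = False

def hlg_onebest(speech, speech_lengths):  # illustrative signature
    if not K2_AVAILABLE:
        raise ImportError("k2 is required for HLG decoding; "
                          "see https://github.com/k2-fsa/k2 to install it")
    # ... actual HLG one-best decoding goes here ...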
examples/aishell/s0/README.md
Outdated
@@ -28,6 +28,8 @@
 | ctc prefix beam search | 5.17 | 5.81 |
 | attention rescoring | 4.63 | 5.05 |
 | LM + attention rescoring | 4.40 | 4.75 |
+| HLG | 4.81 | 5.27 |
How about HLG (k2 LM), so it is easier for users to understand?
Done
Please see inline.
What is the motivation to add HLG from k2?
k2 can do batch decoding with CUDA, and it has a Python interface.
Thanks! Do you have any benchmarks to share?
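For context, a rough sketch of batched one-best HLG decoding with k2's Python API (the HLG path and beam values are illustrative; ctc_log_probs and supervision_segments would come from the wenet encoder):

import torch
import k2

# Load a precompiled HLG graph onto the GPU (file name is illustrative).
HLG = k2.Fsa.from_dict(torch.load("data/local/hlg/HLG.pt")).to("cuda")

# ctc_log_probs: (batch, time, vocab) CTC posteriors from the encoder;
# supervision_segments: one (utterance_index, start_frame, num_frames)
# row per utterance in the batch.
dense_fsa_vec = k2.DenseFsaVec(ctc_log_probs, supervision_segments)

# Batched, beam-pruned intersection of the posteriors with HLG.
lattice = k2.intersect_dense_pruned(
    HLG, dense_fsa_vec,
    search_beam=20.0, output_beam=8.0,
    min_active_states=30, max_active_states=10000)

# One best path per utterance in the batch.
best_paths = k2.shortest_path(lattice, use_double_scores=True)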
examples/aishell/s0/run.sh
Outdated
# Optionally, you can decode with k2 hlg
if [ ${stage} -le 8 ] && [ ${stop_stage} -ge 8 ]; then
  if [ ! -f data/local/lm/lm.arpa ]; then
    echo "Please run prepare dict and train lm in Stage 7"
Do we need to add exit 1 to stop processing?
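For reference, a minimal sketch of the suggested guard (paths and message copied from the diff above):

if [ ! -f data/local/lm/lm.arpa ]; then
  echo "Please run prepare dict and train lm in Stage 7"
  exit 1
fi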
Done
examples/aishell/s0/run.sh
Outdated
  fi

  # 8.1 Build decoding HLG
  tools/k2/make_hlg.sh data/local/dict/ data/local/lm/ data/local/hlg
Shall we skip this step if the file is already generated?
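For example, the step could be wrapped in a guard like this (using HLG.pt as the output file name is an assumption about what make_hlg.sh generates):

# 8.1 Build decoding HLG, skipping it if already generated
# (HLG.pt as the output file name is an assumption here)
if [ ! -f data/local/hlg/HLG.pt ]; then
  tools/k2/make_hlg.sh data/local/dict/ data/local/lm/ data/local/hlg
fi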
Done
The WER performance is already compared in the README. For a speed comparison, the wenet runtime defaults to batch size 1, which is not comparable with HLG decoding at batch size 16. I will do the speed benchmark later in a fair scenario.
It is more convenient to use FSA in Python than C++ in OpenFst, especially for beginners. And Dan thinks it is okay to support k2 in wenet.
Besides, we can use k2 with a GPU.
Thank you for sharing, but we would like to know: how can we access HLG decoding using C++?
It's impossible for us to load the dynamic libraries provided by k2 and wenet at the same time.
Why is it impossible? Have you used the same version of PyTorch and CUDA to compile k2 and wenet?
We add HLG decoding using k2. Now we can use k2 to compile an HLG graph and decode in Python with CUDA.
We provide two decoding algorithms. One is onebest, which is the same as ctc_prefix_beam_search plus an LM score; the other is attention rescore.
Notice that in attention rescore, we have 3 different scale parameters: lm_scale, decoder_scale and r_decoder_scale.
Special thanks to the K2 group's great work!
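To illustrate how the three scales combine, here is a sketch of the scoring idea (variable names and default values are illustrative, not the exact formula or tuned values in this PR):

def rescore_score(am_score, lm_score, decoder_score, r_decoder_score,
                  lm_scale=0.5, decoder_scale=0.5, r_decoder_scale=0.3):
    """Combined score for one n-best hypothesis (illustrative defaults).

    am_score:        acoustic (CTC) score of the lattice path
    lm_score:        graph/LM score contributed by HLG
    decoder_score:   left-to-right attention-decoder score
    r_decoder_score: right-to-left (reversed) decoder score
    """
    return (am_score
            + lm_scale * lm_score
            + decoder_scale * decoder_score
            + r_decoder_scale * r_decoder_score)

The hypothesis with the highest combined score is selected as the final result.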