add hlg decode #1521

Merged: 1 commit merged into wenet-e2e:main on Oct 27, 2022
Conversation

@aluminumbox (Collaborator) commented Oct 26, 2022

We add HLG decoding using k2. Now we can use k2 to compile an HLG, and decode in Python with CUDA.
We provide two decoding algorithms. One is onebest, which is the same as ctc_prefix_beam_search plus an LM score; the other is attention rescore. Note that in attention rescore we have three different scale parameters: lm_scale, decoder_scale, and r_decoder_scale.
Special thanks to the K2 group's great work!
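For readers unfamiliar with the k2 flow, here is a minimal sketch of the onebest path (file paths, tensor shapes, and beam values are illustrative placeholders, not wenet's exact code):

```python
import torch
import k2

# Load the compiled HLG graph (e.g. as produced by tools/k2/make_hlg.sh).
HLG = k2.Fsa.from_dict(torch.load("data/local/hlg/HLG.pt", map_location="cpu"))

# Placeholder emissions: one utterance of T frames over C tokens, where
# C must match the token vocabulary used to build HLG. In practice these
# are the (N, T, C) CTC log-probabilities from the encoder.
N, T, C = 1, 100, 4233
ctc_log_probs = torch.randn(N, T, C).log_softmax(dim=-1)
# Rows of [utterance_index, start_frame, num_frames].
supervision_segments = torch.tensor([[0, 0, T]], dtype=torch.int32)
dense_fsa_vec = k2.DenseFsaVec(ctc_log_probs, supervision_segments)

# Intersect the emissions with HLG to get a pruned decoding lattice.
lattice = k2.intersect_dense_pruned(
    HLG,
    dense_fsa_vec,
    search_beam=20.0,
    output_beam=7.0,
    min_active_states=30,
    max_active_states=10000,
)

# onebest: take the single best path; its aux_labels carry the word IDs.
best_path = k2.shortest_path(lattice, use_double_scores=True)
```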

@@ -20,6 +20,10 @@

from torch.nn.utils.rnn import pad_sequence

import k2
Collaborator:
move it into hlg_onebest & hlg_rescore, so we can run it without k2 & icefall.

Collaborator (Author):
To avoid duplicate code, we moved these imports into a try/except block.
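For reference, the deferred-import pattern looks something like this (a minimal sketch of the idea; the real function signature lives in wenet's code):

```python
# k2 becomes an optional dependency: the import is attempted once, and
# HLG decoding fails with a clear message only when it is actually used.
try:
    import k2
except ImportError:
    k2 = None


def hlg_onebest(*args, **kwargs):
    # Hypothetical wrapper; the real function takes the model outputs
    # and the compiled HLG graph.
    if k2 is None:
        raise ImportError(
            "k2 is required for HLG decoding; see https://github.com/k2-fsa/k2")
```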

@@ -28,6 +28,8 @@
| ctc prefix beam search | 5.17 | 5.81 |
| attention rescoring | 4.63 | 5.05 |
| LM + attention rescoring | 4.40 | 4.75 |
| HLG | 4.81 | 5.27 |
Collaborator:

How about HLG (k2 LM), so it is easier for users to understand?

Collaborator (Author):
Done

@robin1001 (Collaborator) left a comment:

Please see inline.

@csukuangfj

What is the motivation to add HLG from k2?

@aluminumbox (Collaborator, Author) commented Oct 27, 2022

> What is the motivation to add HLG from k2?

k2 can do batch decoding with CUDA, and it has a Python interface.
It is easier to use and faster than the wenet runtime TLG.
Also, it can extract the LM score from the total score, so we have more hyperparameters to tune when decoding. And the result on AISHELL is 4.32, which is better than the runtime TLG LM's 4.40.
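Purely as an illustration of what the extra hyperparameters buy (the function name and default values below are hypothetical, not wenet's actual implementation):

```python
def combine_scores(am_score: float, lm_score: float,
                   decoder_score: float, r_decoder_score: float,
                   lm_scale: float = 0.5, decoder_scale: float = 0.5,
                   r_decoder_scale: float = 0.3) -> float:
    # Hypothetical linear combination for attention rescoring: the
    # acoustic score and the LM score separated out of the lattice's
    # total score, plus forward and reverse attention-decoder scores.
    # The hypothesis with the highest combined score is selected.
    return (am_score
            + lm_scale * lm_score
            + decoder_scale * decoder_score
            + r_decoder_scale * r_decoder_score)
```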

@csukuangfj

> What is the motivation to add HLG from k2?
>
> k2 can do batch decoding with CUDA, and it has a Python interface. It is easier to use and faster than the wenet runtime TLG.

Thanks! Do you have some benchmarks to share?

# Optionally, you can decode with k2 hlg
if [ ${stage} -le 8 ] && [ ${stop_stage} -ge 8 ]; then
  if [ ! -f data/local/lm/lm.arpa ]; then
    echo "Please run prepare dict and train lm in Stage 7"


Do we need to add `exit 1` to stop processing?

Collaborator (Author):

Done

  fi

  # 8.1 Build decoding HLG
  tools/k2/make_hlg.sh data/local/dict/ data/local/lm/ data/local/hlg


Shall we skip this step if the file is already generated?

Collaborator (Author):

Done

@aluminumbox (Collaborator, Author)

> Thanks! Do you have some benchmarks to share?

The WER comparison is already in the README. For a speed comparison, the wenet runtime defaults to batch size 1, which is not comparable with HLG decoding at batch size 16. I will do a speed benchmark later in a fair scenario.

@robin1001 (Collaborator)

It is more convenient to use FSAs in Python than C++ OpenFst, especially for beginners. And Dan thinks it is okay to support k2 in wenet.

@robin1001 (Collaborator)

Besides, we can use k2 with a GPU.
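For instance (continuing the placeholder sketch from the description above, with `HLG` and `ctc_log_probs` as defined there), moving decoding to the GPU is a device move before building the lattice:

```python
import torch

device = torch.device("cuda", 0)
# k2.Fsa supports .to(device); once the graph and the emissions live on
# the GPU, the intersection and shortest-path run there as well.
HLG = HLG.to(device)
ctc_log_probs = ctc_log_probs.to(device)
```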

@robin1001 merged commit cd3fcb5 into wenet-e2e:main on Oct 27, 2022
@zw76859420

> We add HLG decoding using k2. Now we can use k2 to compile an HLG, and decode in Python with CUDA. […]

Thank you for sharing, but we would like to know how we can access HLG decoding from C++?

@zw76859420

It is impossible to load the dynamic libraries provided by k2 and wenet at the same time.
So, can we combine wenet and k2 in a simple way?

@csukuangfj commented Oct 27, 2022

> It is impossible to load the dynamic libraries provided by k2 and wenet at the same time.

Why is it impossible?

Have you used the same version of PyTorch and CUDA to compile k2 and wenet?
