add cuctc tutorial, change blank skip threshold into prob #3297

Closed
wants to merge 4 commits into from

Conversation

yuekaizhang
Contributor

Add a separate tutorial for cuctc.
Resolve #3096

@pytorch-bot

pytorch-bot bot commented May 3, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/audio/3297

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures, 3 Unrelated Failures

As of commit e8a870c:

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following jobs failed but were present on the merge base 84b1230:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@Adel-Moumen

Adel-Moumen commented May 17, 2023

Hello @yuekaizhang,

Many thanks for your CUDA implementation. Once it is released on the pip install channel, I will definitely make sure that we support it in SpeechBrain.

In the meantime, do you have any documentation that explains how your implementation works? A PyTorch version that does not involve custom CUDA kernels could be relevant to the community.

Indeed, it is easier for people to modify PyTorch code than CUDA code when they want to modify or implement research papers. Could you elaborate a little more on how we could re-implement what you did directly in PyTorch? Many thanks!

Adel

@yuekaizhang
Contributor Author

yuekaizhang commented May 17, 2023

Thanks. @mthrok, do you know when it will be available via pip install?

We expect to have detailed documentation on how the CUDA kernels were implemented in one or two months.

Unfortunately, it can't be reimplemented in pure PyTorch. (Or rather, a pure PyTorch version would perform similarly to or worse than the current CPU decoder in torchaudio.) You could try OpenAI Triton, which targets Python programmers, but I think it will be slower than CUDA.

@Adel-Moumen

Ok good!

Tbh I don't think Triton would be a good alternative to your implementation. The CPU kernel overhead hits really hard with Triton, especially in this case where there is a for loop over the time steps. Btw, did you try CUDA Graphs with your implementation? I guess it could improve the results a lot, since there is a for loop.

Also, are you planning to integrate support for KenLM?

Thanks again.

@yuekaizhang
Contributor Author

yuekaizhang commented May 17, 2023

We have not tried CUDA Graphs yet. We have implemented frame skipping based on blank-frame probabilities, which converts frame-synchronous decoding into label-synchronous decoding.
We would like to support an LM or hot words. However, nobody has claimed this task yet.
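
For readers unfamiliar with blank-frame skipping, here is a minimal pure-Python sketch of the idea, not the CUDA kernel from this PR. It assumes per-frame log-probabilities stored as nested lists and a threshold given as a probability (as the PR title suggests). The function names and the default of 0.95 are illustrative assumptions. Greedy best-path decoding stands in for the beam search.

```python
import math


def skip_blank_frames(log_probs, blank=0, blank_skip_threshold=0.95):
    """Drop frames whose blank probability exceeds the threshold.

    `blank_skip_threshold` is a probability in (0, 1]; it is converted to
    log space once so it can be compared with the log-probabilities.
    """
    log_threshold = math.log(blank_skip_threshold)
    return [frame for frame in log_probs if frame[blank] <= log_threshold]


def greedy_ctc_decode(log_probs, blank=0):
    """Best-path CTC decoding: argmax per frame, merge repeats, drop blanks."""
    tokens = []
    prev = blank
    for frame in log_probs:
        best = max(range(len(frame)), key=frame.__getitem__)
        if best != blank and best != prev:
            tokens.append(best)
        prev = best
    return tokens


# Toy posteriors over {0: blank, 1, 2} for six frames (illustrative data).
probs = [
    [0.99, 0.005, 0.005],
    [0.10, 0.85, 0.05],
    [0.98, 0.01, 0.01],
    [0.97, 0.02, 0.01],
    [0.05, 0.05, 0.90],
    [0.99, 0.005, 0.005],
]
log_probs = [[math.log(p) for p in frame] for frame in probs]

kept = skip_blank_frames(log_probs, blank=0, blank_skip_threshold=0.95)
print(len(log_probs), "frames ->", len(kept), "frames")
print(greedy_ctc_decode(kept))
```

After skipping, the decoder loops over roughly one frame per emitted label instead of one per input frame, which is the sense in which decoding becomes label-synchronous. One caveat of this simplified sketch: if the only blank frames between two identical labels are dropped, the repeat-merging step collapses them into one label.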

@Adel-Moumen

Thanks for the answers! Do you have a timeline in mind for when your code will be available via pip install?

@mthrok
Collaborator

mthrok commented Jun 12, 2023

Hi folks

Sorry for the belated response. The next PyTorch release is scheduled for Oct 2023. https://dev-discuss.pytorch.org/t/pytorch-release-2-1-0/1271

The CUCTC code is accessible through the nightly channel, so if you would like to try it out, please refer to https://pytorch.org and install the nightly build.

@yuekaizhang Sorry for not responding recently; I will review this soon.

@facebook-github-bot
Contributor

@mthrok has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@mthrok merged this pull request in 732c94a.

@github-actions

github-actions bot commented Aug 1, 2023

Hey @mthrok.
You merged this PR, but labels were not properly added. Please add a primary and secondary label (See https://github.com/pytorch/audio/blob/main/.github/process_commit.py).


Some guidance:

Use 'module: ops' for operations under 'torchaudio/{transforms, functional}', and ML-related components under 'torchaudio/csrc' (e.g. RNN-T loss).

Things in "examples" directory:

  • 'recipe' is applicable to training recipes under the 'examples' folder,
  • 'tutorial' is applicable to tutorials under the “examples/tutorials” folder
  • 'example' is applicable to everything else (e.g. C++ examples)
  • 'module: docs' is applicable to code documentations (not to tutorials).

Regarding examples in code documentations, please also use 'module: docs'.

Please use 'other' tag only when you’re sure the changes are not much relevant to users, or when all other tags are not applicable. Try not to use it often, in order to minimize efforts required when we prepare release notes.


When preparing release notes, please make sure 'documentation' and 'tutorials' occur as the last sub-categories under each primary category like 'new feature', 'improvements' or 'prototype'.

Things related to build are by default excluded from the release note, except when it impacts users. For example:
  • Drop support of Python 3.7.
  • Add support of Python 3.X.
  • Change the way a third party library is bound (so that user needs to install it separately).
