Enable ROCm RNN-T Loss #2485
Thanks for the contribution. Can you rebase on the latest main branch?
Sure thing
CircleCI is not running. Let me close and reopen to send the web hook again.
Please guard USE_ROCM with a CMake version check (TorchAudio should be buildable on systems with CMake older than 3.21).
Also, can you please explain why #ifdef __HIP_PLATFORM_AMD__ must be explicitly added to the .cu files? I.e., why can't the hipify module take care of those mechanical changes?
@@ -85,6 +85,10 @@ if(USE_CUDA)
  enable_language(CUDA)
endif()

if(USE_ROCM)
  enable_language(HIP)
The HIP language is only available since CMake 3.21 (see https://cmake.org/cmake/help/v3.21/command/enable_language.html), so please make USE_ROCM conditional on CMake version 3.21 or newer.
The need for enable_language(HIP) was removed in the latest commit.
Reverted the commit that refactored the HIP cmake integration -- will revisit in later PR.
torchaudio/csrc/rnnt/options.h
#ifdef USE_ROCM
  // the stream to launch kernels in when using GPU.
  hipStream_t stream_;
#endif
Hmm, can we just add #define cudaStream_t hipStream_t under the first USE_ROCM, so that we would not need to duplicate the code here?
typedef was added. Please re-review.
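For readers following this thread, here is a minimal sketch of the single-typedef approach discussed above. It is not the code that landed in the PR: the alias name `gpuStream_t`, the struct name `OptionsSketch`, and the stand-in branch for builds without a GPU toolkit are assumptions made purely for illustration. Only `USE_ROCM`, `USE_CUDA`, `hipStream_t`, `cudaStream_t`, and the `stream_` member come from the discussion.

```cpp
#include <cassert>      // only needed for the usage check shown below
#include <type_traits>

// Pick one stream type up front so the rest of the header needs no #ifdef.
#if defined(USE_ROCM)
#include <hip/hip_runtime.h>
typedef hipStream_t gpuStream_t;   // hypothetical alias name
#elif defined(USE_CUDA)
#include <cuda_runtime.h>
typedef cudaStream_t gpuStream_t;  // hypothetical alias name
#else
// Assumption: an opaque stand-in so the sketch also compiles without a GPU toolkit.
struct StandInStream;
typedef StandInStream* gpuStream_t;
#endif

// Hypothetical, trimmed-down stand-in for the struct in options.h.
struct OptionsSketch {
  // the stream to launch kernels in when using GPU.
  gpuStream_t stream_ = nullptr;
};

// Both CUDA and HIP stream handles are pointers to opaque structs.
static_assert(std::is_pointer<gpuStream_t>::value,
              "stream handle is expected to be an opaque pointer");
```

With this shape, the member is declared once, e.g. `OptionsSketch opts; assert(opts.stream_ == nullptr);`, and the platform choice is confined to the typedef block. A typedef (or `using`) is generally preferable to `#define cudaStream_t hipStream_t` because it obeys normal scoping rules and does not rewrite the CUDA name for every header included afterwards.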
1cbcda2 to eeb8367
332e298 to 45ec17a
Added check for ROCm language and CMake 3.21. Combined stream types between CUDA and HIP.
…nnt_merge Upstream audio rocm rnnt merge
@mthrok has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
cc @mthrok can you review this PR?
@mthrok has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Should we be concerned about the ROCm build job failures?
Yes, we need to fix those before we can land this PR.
@jeffdaily @jithunnair-amd Got it. Would you be able to look into them?
It seems there was a ROCm CI migration. Could you rebase or merge upstream? If possible, I would like to include this PR in the upcoming release.
Done. @mthrok
Note that the CMakeLists.txt refactor was reverted. Let's first focus on landing this PR, which adds RNN-T for ROCm, and then do a follow-up PR that modernizes the CMake files.
Would you be okay with accepting this PR as-is, followed by a PR that revisits the hipify strategy here? It would be nice to get the feature in. The hipify and CMakeLists.txt refactor was causing more trouble than expected, so it was backed out of this PR.
@jeffdaily Yes, the build CI jobs were green, so I will merge this.
@mthrok has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
@mthrok This PR introduces a dependency on
I'm investigating how to package the libomp.so with the torchaudio wheel for ROCm. If there are any existing instances where you do the same for any build of torchaudio (ROCm or otherwise), please point me to it, so I can re-use the same flow.
@jeffdaily @jithunnair-amd How critical is this PR? Maybe we can just revert it?
This reverts commit c593961.
Hi @jeffdaily @jithunnair-amd - This PR was reverted due to the packaging issue. I wanted to have this in the upcoming release, and I do welcome a re-attempt. To make the process faster in the future, I have invited @jeffdaily to have write permission. I hope we can continue and iterate on this feature. Regards
This reverts commit 49d3eec.
Added HIPIFY code and small changes for ROCm. Targeting RNN-T loss.