-
Notifications
You must be signed in to change notification settings - Fork 551
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Issue] Incorrect mapping nvcc
flags to hipcc
flags
#3764
Comments
Using the reproducer, with a few extra setup steps and a tweak to workaround a compiler minimum version check, I am able to produce the hipcc cmd containing ....-D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1.... as depicted in the screenshot above |
nvcc
flags to hipcc
flagsnvcc
flags to hipcc
flags
It is not a HIPIFY issue, as HIPIFY tools only transform the CUDA source without considering how the hipified source should be compiled further. And it also doesn't know anything about how to compile ("correctly") the CUDA source. |
Hi @Qubitium - can you share the output of clang --version? While I am able to reproduce the hipcc cmd line, I initially hit an error where a compiler version check in py_envs/lib/python3.12/site-packages/torch/utils/cpp_extension.py failed. My clang's version looks roughly like so: AMD clang version 19.0.0git ; the '0git' tripped up the check as it expected an integer value. I assume you did not hit this, hence the question. Thanks, |
-D__HIP_NO_HALF_OPERATORS__=1 is not mapped from the nvcc flag per se. It is explicitly added by https://github.com/pytorch/pytorch/blob/main/torch/utils/cpp_extension.py#L279 ; now, the next question is why is it explicitly added. I will loop in ROCm-Pytorch folks |
Fixes ROCm/hip#3764. A user convenience.
@searlmc1 Clang is not installed on my system.
OS: Ubuntu 22.04 |
Oh my. Now it appears I have hit a
After installing
|
I think there is confusion about the wording of
|
Correct, hipify does src -> src translation; it does not translate build flags; we currently do not provide a tool that translates the nvcc flags, however, we have started discussions around that. hipify is both a tool and a process . When referring to the tool, it refers to one of hipify-clang, hipify-perl, or hipify-torch. However, it is often referred to simply as hipify. When referring to the process, it references the process of taking CUDA code and translating - or hipify'ing it - into HIP code. hipcc is a thin compiler driver; it invokes clang or nvcc depending on the environment. It also may add include/library options for the target compiler . In the early days, hipcc did more than it does today. In the early days, clang itself was missing functionality and so it was baked into hipcc. By and large, that functionality has been lifted out of hipcc. I understand your confusion. I often see see hipcc referred to as a compiler. It is not a compiler. E.g., hipcc unto itself cannot compile anything; it formulates the clang, or nvcc, invocation and then invokes the compiler. You mentioned that you did not have clang installed. Because your environment is ROCm 6.3.3 + mi300x, I assumed that you would have clang via that ROCm installation. If interested |
For a little context, the kernel I was testing and used in the reproducer is a kernel from the vLLM project and in-directly SGLang, since SGlang imports vLLM kernels. I also contribute to these projects. This context is important becaue the kernel compiles correctly using vllm setup as they do not use pytorch's cudaExension hook for compilation but Now the answer is becoming clear. Beause I, and most pytorch projects use https://github.com/pytorch/pytorch/blob/main/torch/utils/cpp_extension.py#L279 I think it will great if default flags in the cmake vs pytorch build cpp extension are sycned, as much as possible for the rocm eco system. This way, any compiler issues can be correctly pushed down to the developers. Otherwise, us devs are like headless chickens since hunting for compiler flag difference is the last thing on our minds or frankly, good at. |
So installing Found that amd/rocm bundle it's own clang.
|
@searlmc1 I made PR to merge some changes I believe would improve the existing PR/fix by @naromero77amd. Please check and compile. I did not compile because I am not sure I am capable of handling the depend hell that it requires. But I did unit test the method with code of unit test in the PR |
K / thanks much / I'll take a look |
Thanks. BTW, ROCm jenken CI is broken. 😅 I was hoping it can compile the PR for me and run through the tests but...dependency hell gut punched that idea. |
Fixes ROCm/hip#3764. A user convenience.
…tensions (#149245) Fixes ROCm/hip#3764. Fixes and improvements to CUDA->HIP flag conversion for CPP extensions - Log flag conversion for debugging purposes. - Fix cases where it should not touch the -I flags or cases where CUDA appears more than once by replacing only the first instance. - Fix case where nvcc key may not exist - Fix case where hipify should ignore flag values and only touch the flag itself Pull Request resolved: #149245 Approved by: https://github.com/jeffdaily Co-authored-by: Qubitium-ModelCloud <[email protected]>
…tensions (#149245) Fixes ROCm/hip#3764. Fixes and improvements to CUDA->HIP flag conversion for CPP extensions - Log flag conversion for debugging purposes. - Fix cases where it should not touch the -I flags or cases where CUDA appears more than once by replacing only the first instance. - Fix case where nvcc key may not exist - Fix case where hipify should ignore flag values and only touch the flag itself Pull Request resolved: #149245 Approved by: https://github.com/jeffdaily Co-authored-by: Qubitium-ModelCloud <[email protected]> (cherry picked from commit c0566e0)
Attempting to get this into release/2.7. pytorch/pytorch#149044 (comment). |
This issue did not close automatically with the PR merged. Closing now. |
…tensions (#149432) [ROCm] Fixes and improvements to CUDA->HIP flag conversion for CPP extensions (#149245) Fixes ROCm/hip#3764. Fixes and improvements to CUDA->HIP flag conversion for CPP extensions - Log flag conversion for debugging purposes. - Fix cases where it should not touch the -I flags or cases where CUDA appears more than once by replacing only the first instance. - Fix case where nvcc key may not exist - Fix case where hipify should ignore flag values and only touch the flag itself Pull Request resolved: #149245 Approved by: https://github.com/jeffdaily Co-authored-by: Qubitium-ModelCloud <[email protected]> (cherry picked from commit c0566e0) Co-authored-by: Nichols A. Romero <[email protected]>
Problem Description
When there is an obvious
nvcc
flag that has 1-to-1 mapping tohipcc
flag,hipify
should do the right thing and convert the flags tohip
flag. I thinkhipify
does the correct job most of the time but I have found a case where it doesn't, causing cryptic and strange compile errors and putting the burden on the cuda devs who have no idea whathipify
is doing for the most part.nvcc
flags in question:Expectation from
hipify
Actual mapping: Not performed. Causing the following to be injected for
hipcc
Full Output
Operating System
Ubuntu 22.04
CPU
AMD ZEN5 EPYC
GPU
MI300X
ROCm Version
6.3.3
ROCm Component
HIP
Steps to Reproduce
I have pushed reproducing code into a repo: https://github.com/ModelCloud/rockthem
Fix is quite simple: Correctly map
nvcc
tohip
flags when 1-to-1 match is found or whenever applicable to alleviate cuda migration pain.(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support
No response
Additional Information
No response
The text was updated successfully, but these errors were encountered: