[rocThrust] Enable host-only rocThrust builds with g++/clang++ #2363
Merged
umfranzw merged 1 commit on Nov 28, 2025
memmett pushed a commit that referenced this pull request Oct 29, 2025
This PR implements `hipdnnEnginePluginExecuteOpGraph`, enabling graph/plan execution. We can now complete the initial plugin goal of an e2e run of a single-node ConvFprop graph. Note: a simple e2e integration test is added. This should be expanded to a parameterized matrix of tests once we're on the latest hipDNN [example](https://github.com/ROCm/rocm-libraries/blob/develop/projects/hipdnn/plugins/miopen_legacy_plugin/integration_tests/IntegrationGpuConvForward.cpp). Tickets: [#2363](nod-ai/amd-shark-ai#2363), [#2375](nod-ai/amd-shark-ai#2375).
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## develop #2363 +/- ##
============================================
+ Coverage 54.54% 84.07% +29.53%
============================================
Files 14 501 +487
Lines 3768 42472 +38704
Branches 578 5459 +4881
============================================
+ Hits 2055 35707 +33652
- Misses 1468 2499 +1031
- Partials 245 4266 +4021
umfranzw commented Nov 24, 2025
Currently, all rocThrust build configurations depend on rocPRIM. Since rocPRIM requires a GPU, this makes it tricky to use rocThrust for host-only computation. In addition, since building rocPRIM requires a HIP-aware clang (e.g. hipcc), so does building rocThrust. This means you can't use gcc or vanilla clang to build rocThrust, even if you're only doing host-side compute.

Upstream Thrust does allow users to build for host-only compute, and this is typically done with gcc. It turns out that there are some differences between the way that clang/hipcc and gcc interpret the C++ standard. This can lead to situations where applications that work when built with Thrust do not work when built with rocThrust. Addressing these situations creates additional load for the compiler team.

With this in mind, this change:
- Adds a new cmake option, `ROCTHRUST_DEVICE_SYSTEM`, which is similar to Thrust's `THRUST_DEVICE_SYSTEM` macro. It can be set to one of: `HIP`, `CUDA`, `TBB`, `OpenMP`, or `CPP`. If it's set to `HIP` or `CUDA`, then the code links against device dependencies (e.g. for HIP, it looks for rocPRIM). If it's set to `CPP`, then it does not link against device dependencies.
- Because the benchmarks, tests, and examples all currently depend on rocPRIM, I've added a cmake fatal error that triggers when rocThrust is built with device system `CPP` and benchmarks, tests, or examples are enabled. We can probably enable many of the tests going forward, but it'll require a bit of work, so it may be best to follow up on that in a separate change.
- The cmake dependencies file has been tweaked so we don't try to fetch rocPRIM when `ROCTHRUST_DEVICE_SYSTEM == CPP`.
- I've modified the cmake compiler verification and the rocThrust device_system.h header to allow gcc to be used as a host compiler again.
- I updated a number of cases where we were assuming that we could call into rocPRIM as a fallback. I've added a macro `_THRUST_USE_ROCPRIM` that's set to 1 in cases where we have access to it.
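The fallback guarding described in the last item can be sketched roughly as follows. This is only an illustration of the pattern, not code from the change: every `SKETCH_` name, `device_sum`, `sum_values`, and `uses_rocprim` are hypothetical, and the assumption that rocPRIM is available exactly when the device system is HIP is mine.

```cpp
#include <numeric>
#include <vector>

// Hypothetical stand-ins for device-system IDs; the real values live in
// rocThrust's device_system.h and may differ.
#define SKETCH_DEVICE_SYSTEM_HIP 1
#define SKETCH_DEVICE_SYSTEM_CPP 2

// Assumption: a host-only build selects the CPP system.
#ifndef SKETCH_DEVICE_SYSTEM
#define SKETCH_DEVICE_SYSTEM SKETCH_DEVICE_SYSTEM_CPP
#endif

// Plays the role of _THRUST_USE_ROCPRIM: 1 only when rocPRIM is reachable
// (assumed here to mean "the device system is HIP").
#if SKETCH_DEVICE_SYSTEM == SKETCH_DEVICE_SYSTEM_HIP
#define SKETCH_USE_ROCPRIM 1
#else
#define SKETCH_USE_ROCPRIM 0
#endif

#if SKETCH_USE_ROCPRIM
// Placeholder declaration for a rocPRIM-backed path; a real build would
// call into rocPRIM here.
int device_sum(const std::vector<int>& values);
#endif

// Sum the elements, taking the device path only when it was compiled in.
int sum_values(const std::vector<int>& values)
{
#if SKETCH_USE_ROCPRIM
    return device_sum(values);
#else
    return std::accumulate(values.begin(), values.end(), 0);
#endif
}

// Lets callers (and tests) check which path was compiled.
constexpr bool uses_rocprim() { return SKETCH_USE_ROCPRIM != 0; }
```

With no `SKETCH_DEVICE_SYSTEM` defined on the command line, the host path is the only one compiled, so the translation unit never mentions rocPRIM and an ordinary g++ can build it.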
stanleytsang-amd approved these changes Nov 25, 2025
Motivation
Currently, all rocThrust build configurations depend on rocPRIM. Since rocPRIM requires a GPU, this makes it tricky to use rocThrust for host-only computation.
In addition, since building rocPRIM requires a HIP-aware clang (e.g. hipcc), so does building rocThrust. This means you can't use gcc or vanilla clang to build rocThrust, even if you're only doing host-side compute.
Upstream thrust does allow users to build for host-only compute, and this is typically done with gcc. It turns out that there are some differences between the way that clang/hipcc and gcc interpret the C++ standard. This can lead to situations where applications that work when built with Thrust do not work when built with rocThrust. Addressing these situations creates additional load for the compiler team.
Technical Details
This change makes modifications so that rocThrust can be used in a host-only CPP mode that does not require rocPRIM. This allows you to use g++ or a non-HIP-enabled clang++ as a compiler. Changes include:
- Added a new cmake option, `ROCTHRUST_DEVICE_SYSTEM`, which is similar to Thrust's `THRUST_DEVICE_SYSTEM` macro. It can be set to one of: `HIP`, `CUDA`, `TBB`, `OpenMP`, or `CPP`. If it's set to `HIP` or `CUDA`, then the code links against device dependencies (e.g. for HIP, it looks for rocPRIM). If it's set to `CPP`, then it does not link against device dependencies.
- Because the benchmarks, tests, and examples all currently depend on rocPRIM, I've added a cmake fatal error that triggers when rocThrust is built with device system `CPP` and benchmarks, tests, or examples are enabled. We can probably enable many of the tests going forward, but it'll require a bit of work, so it may be best to follow up on that in a separate change.
- The cmake dependencies file has been tweaked so we don't try to fetch rocPRIM when `ROCTHRUST_DEVICE_SYSTEM == CPP`.
- I've modified the cmake compiler verification and the rocThrust device_system.h header to allow gcc to be used as a host compiler again.
- I updated a number of cases where we were assuming that we could call into rocPRIM as a fallback. I've added a macro `_THRUST_USE_ROCPRIM` that's set to 1 in cases where we have access to it.

Test Plan
Since we can't currently run the normal tests, you can verify that this works by building the simple example below using g++ and vanilla-clang++.
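A host-only example along these lines might look like the sketch below. It is a stand-in written for illustration, not the example from the PR. It assumes the CPP device system can be selected with the upstream Thrust macro `THRUST_DEVICE_SYSTEM`; the `__has_include` guard, `SKETCH_HAVE_THRUST`, and `sorted_copy` are hypothetical additions so the snippet still compiles where the Thrust headers are not on the include path.

```cpp
// Assumption: the CPP device system is selected the same way as in
// upstream Thrust. This must come before any Thrust header is included.
#define THRUST_DEVICE_SYSTEM THRUST_DEVICE_SYSTEM_CPP

#include <vector>

#if __has_include(<thrust/device_vector.h>)
#include <thrust/copy.h>
#include <thrust/device_vector.h>
#include <thrust/sort.h>
#define SKETCH_HAVE_THRUST 1
#else
#include <algorithm>
#define SKETCH_HAVE_THRUST 0
#endif

// Return a sorted copy of the input. With the CPP device system,
// thrust::device_vector is backed by host memory, so no GPU is involved.
std::vector<int> sorted_copy(const std::vector<int>& input)
{
#if SKETCH_HAVE_THRUST
    thrust::device_vector<int> d(input.begin(), input.end());
    thrust::sort(d.begin(), d.end());
    std::vector<int> out(d.size());
    thrust::copy(d.begin(), d.end(), out.begin());
    return out;
#else
    // Fallback used only when Thrust headers are unavailable.
    std::vector<int> out(input);
    std::sort(out.begin(), out.end());
    return out;
#endif
}
```

Saved as example.cpp with a small `main` that calls `sorted_copy` and prints the result, this would exercise the host-only path with either compiler.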
    cd rocm-libraries/projects/rocthrust
    ROCM_PATH=/opt/rocm CXX=g++ cmake -B build -DBUILD_BENCHMARK=OFF -DBUILD_TEST=OFF -DROCTHRUST_DEVICE_SYSTEM=CPP

Now build and run with:

    g++ -I /opt/rocm/include example.cpp
    ./a.out

Ensure that the example builds and runs successfully.
You can repeat this process using the vanilla clang++ located at /opt/rocm/llvm/bin/clang++. Note that you'll need to delete the build directory and start over, changing the CXX value in the second command to /opt/rocm/llvm/bin/clang++.
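When switching between compilers it can help to confirm which one actually produced the binary. A small sketch, assuming the usual predefined macros (`__clang__`, `__GNUC__`, and `__HIP__` for clang in HIP mode); the function name `compiler_name` is hypothetical:

```cpp
#include <string>

// Report which compiler family built this translation unit.
// Clang also defines __GNUC__, so the clang checks must come first.
std::string compiler_name()
{
#if defined(__HIP__)
    return "hip-clang";
#elif defined(__clang__)
    return "clang++";
#elif defined(__GNUC__)
    return "g++";
#else
    return "unknown";
#endif
}
```

Printing `compiler_name()` from the example's `main` makes it obvious if a stale build directory or `CXX` value picked up the wrong compiler.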
Test Result
The example compiles and runs correctly with both g++ and clang++.
Submission Checklist