find OpenMP config #517
Merged
daineAMD
approved these changes
Jul 7, 2025
amcamd
approved these changes
Jul 7, 2025
f798c63 to
879098b
Compare
stanleytsang-amd
pushed a commit
to stanleytsang-amd/rocm-libraries
that referenced
this pull request
Jul 8, 2025
* Structuring backend (ROCm#495)
  * split rocsparse_handle.hpp
  * split rocsparse_handle.cpp
  * structuring trm_info without behaviour change
  * fixing test failure related to copy_mat_info
  * Fixing default initialization of rocsparse_trm_info. Removing error prone cast when calling hipMalloc
* Fixing missing behavior check in spgeam (ROCm#514)
  * Fixing missing behavior check.
  * stage analysis and stage symbolic are not required for stage numeric
* spgeam final fix (ROCm#516)
  * spgeam fix

Co-authored-by: Yvan Mokwinski <yvan.mokwinski@gmail.com>
Contributor
Author
How do we ping rocm-math-lib-build-infra list for reviews?
davidd-amd
approved these changes
Jul 9, 2025
Contributor
davidd-amd
left a comment
Please add a description and include details on the problem and how the changes resolve it.
Contributor
Author
Replicates work in ROCm/hipBLAS#1038
56a58ba to
e44275c
Compare
e44275c to
1bec38e
Compare
TorreZuk
added a commit
that referenced
this pull request
Jul 9, 2025
Summary of proposed changes: First search for ROCm's libomp.so via openmp-config.cmake. This is what we would prefer instead of searching for a system libomp.so/libgomp.so and then manually adding in a ROCm lib path. This methodology should still be RHEL-10 RPATH compliant. Co-authored-by: estewart08 <ethan.stewart@amd.com> (cherry picked from commit c6baf77)
assistant-librarian Bot
pushed a commit
to ROCm/rocBLAS
that referenced
this pull request
Jul 9, 2025
find OpenMP config (#517) Summary of proposed changes: First search for ROCm's libomp.so via openmp-config.cmake. This is what we would prefer instead of searching for a system libomp.so/libgomp.so and then manually adding in a ROCm lib path. This methodology should still be RHEL-10 RPATH compliant. Co-authored-by: estewart08 <ethan.stewart@amd.com>
TorreZuk
added a commit
that referenced
this pull request
Jul 10, 2025
find OpenMP config (#517) First search for ROCm's libomp.so via openmp-config.cmake. This is what we would prefer instead of searching for a system libomp.so/libgomp.so and then manually adding in a ROCm lib path. This methodology should still be RHEL-10 RPATH compliant. Co-authored-by: estewart08 <ethan.stewart@amd.com>
TorreZuk
added a commit
that referenced
this pull request
Jul 14, 2025
find OpenMP config (#517) First search for ROCm's libomp.so via openmp-config.cmake. This is what we would prefer instead of searching for a system libomp.so/libgomp.so and then manually adding in a ROCm lib path. This methodology should still be RHEL-10 RPATH compliant. Co-authored-by: estewart08 <ethan.stewart@amd.com> (cherry picked from commit 408affb)
ammallya
pushed a commit
that referenced
this pull request
Jul 14, 2025
* Structuring backend (#495)
  * split rocsparse_handle.hpp
  * split rocsparse_handle.cpp
  * structuring trm_info without behaviour change
  * fixing test failure related to copy_mat_info
  * Fixing default initialization of rocsparse_trm_info. Removing error prone cast when calling hipMalloc
* Fixing missing behavior check in spgeam (#514)
  * Fixing missing behavior check.
  * stage analysis and stage symbolic are not required for stage numeric
* spgeam final fix (#516)
  * spgeam fix

Co-authored-by: Yvan Mokwinski <yvan.mokwinski@gmail.com>
[ROCm/rocSPARSE commit: 22d1de0]
SathiyarajRam
pushed a commit
that referenced
this pull request
Jul 15, 2025
find OpenMP config (#517) First search for ROCm's libomp.so via openmp-config.cmake. This is what we would prefer instead of searching for a system libomp.so/libgomp.so and then manually adding in a ROCm lib path. This methodology should still be RHEL-10 RPATH compliant. Co-authored-by: estewart08 <ethan.stewart@amd.com> (cherry picked from commit 408affb)
vamovsik
pushed a commit
that referenced
this pull request
Jul 23, 2025
find OpenMP config (#517) First search for ROCm's libomp.so via openmp-config.cmake. This is what we would prefer instead of searching for a system libomp.so/libgomp.so and then manually adding in a ROCm lib path. This methodology should still be RHEL-10 RPATH compliant. (cherry picked from commit 408affb) --------- Co-authored-by: estewart08 <ethan.stewart@amd.com>
assistant-librarian Bot
pushed a commit
that referenced
this pull request
Feb 3, 2026
Problem: When FloatAcc differs from FloatC (e.g., INT8×INT8→INT32 accumulator with FP32 output scaling), the CDE element op is invoked with wrong storage types.

The element op contract is (E& e, const C& c, const D& d...), where:
- E = FloatC (final output type, e.g., float)
- C = FloatAcc (accumulator type, e.g., int32_t)

Original code used generate_tie() returning the same c_thread_buf for both E& and C&, which:
1. Violates the element op signature when types differ
2. Causes compile errors with strictly-typed element ops
3. Results in undefined behavior during ThreadwiseTensorSliceTransfer

Solution: Introduce separate e_thread_buf<FloatC> for element op output, pass (E& e) from e_thread_buf and (const C& c) from c_thread_buf, then transfer e_thread_buf to global memory.

Bug has existed since the file was created in December 2022 (PR #517).
Summary of proposed changes:
First search for ROCm's libomp.so via openmp-config.cmake. This is what we would prefer instead of searching for a system libomp.so/libgomp.so and then manually adding in a ROCm lib path.
This methodology should still be RHEL-10 RPATH compliant.
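
A rough sketch of the lookup order described above, not the actual diff from this PR. The `ROCM_PATH` default, the hint directory under it, the `my_target` name, and the assumption that the config package exports `OpenMP::OpenMP_CXX` are all illustrative guesses and would need checking against a real ROCm install.

```cmake
# Sketch only: ROCM_PATH default and the hint directory are assumptions.
if(NOT DEFINED ROCM_PATH)
  set(ROCM_PATH "/opt/rocm")
endif()

# 1. Config mode first: CMake looks for openmp-config.cmake
#    (or OpenMPConfig.cmake), which ROCm's OpenMP is expected to ship.
find_package(OpenMP CONFIG QUIET
  PATHS "${ROCM_PATH}/lib/llvm/lib/cmake/openmp"  # hypothetical location
)

# 2. Fall back to CMake's bundled FindOpenMP module, which may pick up
#    a system libomp.so/libgomp.so instead.
if(NOT OpenMP_FOUND)
  find_package(OpenMP MODULE)
endif()

# 3. Link through the imported target rather than a hand-added lib path.
#    my_target is a placeholder for an existing library or executable.
if(TARGET OpenMP::OpenMP_CXX)
  target_link_libraries(my_target PRIVATE OpenMP::OpenMP_CXX)
endif()
```

Linking through an imported target lets CMake carry the library location itself, which is presumably why no manual ROCm lib path is needed afterwards.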