
find OpenMP config #517

Merged
TorreZuk merged 1 commit into develop from users/torrezuk/omp-config
Jul 9, 2025
Conversation

@TorreZuk
Contributor

@TorreZuk TorreZuk commented Jul 7, 2025

Summary of proposed changes:

Search for ROCm's libomp.so via openmp-config.cmake first. This is preferable to searching for a system libomp.so/libgomp.so and then manually adding a ROCm lib path.
This approach should still be RHEL-10 RPATH compliant.
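
For readers unfamiliar with the lookup order described above, here is a minimal sketch of the idea, not the actual diff from this PR. It relies on the fact that `find_package(OpenMP CONFIG)` looks for `OpenMPConfig.cmake` or `openmp-config.cmake`. The `ROCM_PATH` variable, its `/opt/rocm` default, and the two hint directories are assumptions made for illustration. The targets exported by ROCm's config file are not stated in this PR, so the sketch only reports which lookup succeeded.

```cmake
cmake_minimum_required(VERSION 3.16)
project(omp_config_sketch LANGUAGES CXX)

# ASSUMPTION: ROCM_PATH is the ROCm install prefix; /opt/rocm is only a guess.
set(ROCM_PATH "/opt/rocm" CACHE PATH "ROCm install prefix (assumed)")

# Step 1: config mode. CMake searches the hinted prefixes for
# OpenMPConfig.cmake or openmp-config.cmake (e.g. under lib/cmake/openmp).
# ASSUMPTION: the hint directories below are illustrative, not taken from the PR.
find_package(OpenMP CONFIG QUIET
  HINTS "${ROCM_PATH}" "${ROCM_PATH}/lib/llvm")

if(OpenMP_FOUND)
  # OpenMP_DIR is set by config mode to the directory holding the config file.
  message(STATUS "OpenMP found via package config in: ${OpenMP_DIR}")
else()
  # Step 2: fall back to CMake's bundled FindOpenMP module, which probes
  # the active compiler for a system libomp/libgomp.
  find_package(OpenMP MODULE)
  if(OpenMP_FOUND)
    message(STATUS "OpenMP found via the FindOpenMP module")
  else()
    message(STATUS "OpenMP not found")
  endif()
endif()
```

Trying config mode first means the library location comes from ROCm's own package file rather than from a manually appended lib path, which is the preference stated in the summary.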

@TorreZuk TorreZuk added the noTensile Run PR without Tensile label Jul 7, 2025
@TorreZuk TorreZuk requested a review from a team as a code owner July 7, 2025 21:07
@TorreZuk TorreZuk added TestLevel2Only Tests only Level 2 functions in this PR project: rocblas labels Jul 7, 2025
@TorreZuk TorreZuk force-pushed the users/torrezuk/omp-config branch from f798c63 to 879098b Compare July 8, 2025 00:55
stanleytsang-amd pushed a commit to stanleytsang-amd/rocm-libraries that referenced this pull request Jul 8, 2025
* Structuring backend (ROCm#495)

* split rocsparse_handle.hpp

* split rocsparse_handle.cpp

* structuring trm_info without behaviour change

* fixing test failure related to copy_mat_info

* Fixing default initialization of rocsparse_trm_info.
Removing error prone cast when calling hipMalloc

* Fixing missing behavior check in spgeam (ROCm#514)

* Fixing missing behavior check.

* stage analysis and stage symbolic are not required for stage numeric

* spgeam final fix (ROCm#516)

* spgeam fix

---------

Co-authored-by: Yvan Mokwinski <yvan.mokwinski@gmail.com>
@TorreZuk TorreZuk requested review from bstefanuk and davidd-amd July 8, 2025 15:24
@TorreZuk
Contributor Author

TorreZuk commented Jul 8, 2025

How do we ping the rocm-math-lib-build-infra list for reviews?

Contributor

@davidd-amd davidd-amd left a comment


Please add a description and include details on the problem and how the changes resolve it.

@TorreZuk
Contributor Author

TorreZuk commented Jul 9, 2025

Replicates work in ROCm/hipBLAS#1038

@TorreZuk TorreZuk force-pushed the users/torrezuk/omp-config branch from e44275c to 1bec38e Compare July 9, 2025 16:11
@TorreZuk TorreZuk merged commit c6baf77 into develop Jul 9, 2025
7 of 9 checks passed
@TorreZuk TorreZuk deleted the users/torrezuk/omp-config branch July 9, 2025 18:57
TorreZuk added a commit that referenced this pull request Jul 9, 2025
Summary of proposed changes:

First search for ROCm's libomp.so via openmp-config.cmake. This is what
we would prefer instead of searching for a system libomp.so/libgomp.so
and then manually adding in a ROCm lib path.
This methodology should still be RHEL-10 RPATH compliant.

Co-authored-by:  estewart08 <ethan.stewart@amd.com>
(cherry picked from commit c6baf77)
assistant-librarian Bot pushed a commit to ROCm/rocBLAS that referenced this pull request Jul 9, 2025
find OpenMP config (#517)

Summary of proposed changes:

First search for ROCm's libomp.so via openmp-config.cmake. This is what
we would prefer instead of searching for a system libomp.so/libgomp.so
and then manually adding in a ROCm lib path.
This methodology should still be RHEL-10 RPATH compliant.

Co-authored-by:  estewart08 <ethan.stewart@amd.com>
TorreZuk added a commit that referenced this pull request Jul 10, 2025
find OpenMP config (#517)
    
First search for ROCm's libomp.so via openmp-config.cmake. This is what
we would prefer instead of searching for a system libomp.so/libgomp.so
and then manually adding in a ROCm lib path.
This methodology should still be RHEL-10 RPATH compliant.
    

Co-authored-by: estewart08 <ethan.stewart@amd.com>
TorreZuk added a commit that referenced this pull request Jul 14, 2025
find OpenMP config (#517)

First search for ROCm's libomp.so via openmp-config.cmake. This is what
we would prefer instead of searching for a system libomp.so/libgomp.so
and then manually adding in a ROCm lib path.
This methodology should still be RHEL-10 RPATH compliant.

Co-authored-by: estewart08 <ethan.stewart@amd.com>
(cherry picked from commit 408affb)
ammallya pushed a commit that referenced this pull request Jul 14, 2025
* Structuring backend (#495)

* split rocsparse_handle.hpp

* split rocsparse_handle.cpp

* structuring trm_info without behaviour change

* fixing test failure related to copy_mat_info

* Fixing default initialization of rocsparse_trm_info.
Removing error prone cast when calling hipMalloc

* Fixing missing behavior check in spgeam (#514)

* Fixing missing behavior check.

* stage analysis and stage symbolic are not required for stage numeric

* spgeam final fix (#516)

* spgeam fix

---------

Co-authored-by: Yvan Mokwinski <yvan.mokwinski@gmail.com>

[ROCm/rocSPARSE commit: 22d1de0]
ammallya pushed a commit that referenced this pull request Jul 14, 2025
* Structuring backend (#495)

* split rocsparse_handle.hpp

* split rocsparse_handle.cpp

* structuring trm_info without behaviour change

* fixing test failure related to copy_mat_info

* Fixing default initialization of rocsparse_trm_info.
Removing error prone cast when calling hipMalloc

* Fixing missing behavior check in spgeam (#514)

* Fixing missing behavior check.

* stage analysis and stage symbolic are not required for stage numeric

* spgeam final fix (#516)

* spgeam fix

---------

Co-authored-by: Yvan Mokwinski <yvan.mokwinski@gmail.com>

[ROCm/rocSPARSE commit: 22d1de0]
SathiyarajRam pushed a commit that referenced this pull request Jul 15, 2025
find OpenMP config (#517)

First search for ROCm's libomp.so via openmp-config.cmake. This is what
we would prefer instead of searching for a system libomp.so/libgomp.so
and then manually adding in a ROCm lib path.
This methodology should still be RHEL-10 RPATH compliant.

Co-authored-by: estewart08 <ethan.stewart@amd.com>
(cherry picked from commit 408affb)
vamovsik pushed a commit that referenced this pull request Jul 23, 2025
find OpenMP config (#517)

First search for ROCm's libomp.so via openmp-config.cmake. This is what
we would prefer instead of searching for a system libomp.so/libgomp.so
and then manually adding in a ROCm lib path.
This methodology should still be RHEL-10 RPATH compliant.


(cherry picked from commit 408affb)

---------

Co-authored-by: estewart08 <ethan.stewart@amd.com>
assistant-librarian Bot pushed a commit that referenced this pull request Feb 3, 2026
Problem:
When FloatAcc differs from FloatC (e.g., INT8×INT8→INT32 accumulator with
FP32 output scaling), the CDE element op is invoked with wrong storage types.

The element op contract is: (E& e, const C& c, const D& d...) where:
- E = FloatC (final output type, e.g., float)
- C = FloatAcc (accumulator type, e.g., int32_t)

Original code used generate_tie() returning the same c_thread_buf for both
E& and C&, which:
1. Violates the element op signature when types differ
2. Causes compile errors with strictly-typed element ops
3. Results in undefined behavior during ThreadwiseTensorSliceTransfer

Solution:
Introduce separate e_thread_buf<FloatC> for element op output, pass
(E& e) from e_thread_buf and (const C& c) from c_thread_buf, then
transfer e_thread_buf to global memory.

Bug has existed since the file was created in December 2022 (PR #517).

Labels

noTensile Run PR without Tensile organization: ROCm project: rocblas TestLevel2Only Tests only Level 2 functions in this PR


5 participants