Op2dTensorLite kernel port to HIP#2295

Closed
ghost wants to merge 1 commit into release/rocm-rel-7.1 from unknown repository

Conversation

@ghost ghost commented Oct 27, 2025

# Motivation
Port the Op2dTensorLite kernel from OpenCL to HIP.
The goal is a HIP kernel with the same functionality and no loss of performance.

# Technical Details
struct PerfHelper (/gtest/perf_helper.hpp) is declared without a type template parameter, because converting the return value of GetKernelTime() to a type other than 'float' caused a problem.
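
As a rough illustration of the non-templated approach described above, here is a minimal sketch. It assumes GetKernelTime() returns `float`; `FakeHandle`, `PerfHelperSketch`, `Record` and `AverageMs` are hypothetical names invented for this example and are not the contents of /gtest/perf_helper.hpp.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a handle whose GetKernelTime() returns float.
// This is an assumption for illustration, not the real handle type.
struct FakeHandle
{
    float last_ms = 0.0f;
    float GetKernelTime() const { return last_ms; }
};

// Hypothetical non-templated helper: timings are always stored as float,
// so the value returned by GetKernelTime() never needs to be converted.
struct PerfHelperSketch
{
    std::vector<float> samples_ms;

    // Store the most recent kernel time reported by the handle.
    void Record(const FakeHandle& handle) { samples_ms.push_back(handle.GetKernelTime()); }

    // Mean of the recorded timings; 0 when nothing has been recorded.
    float AverageMs() const
    {
        if(samples_ms.empty())
            return 0.0f;
        float sum = 0.0f;
        for(float s : samples_ms)
            sum += s;
        return sum / static_cast<float>(samples_ms.size());
    }
};
```

Keeping the timing type fixed to `float` avoids any conversion at the point where the kernel time is read. The cost is that the helper cannot be instantiated for other numeric types.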

# Test Result
The test /gtest/tensor_2d_lite_ocl_hip.cpp passes.

@ghost ghost self-requested a review as a code owner October 27, 2025 17:38
@assistant-librarian assistant-librarian Bot added the external contribution Code contribution from users community.. label Oct 27, 2025
caio96 pushed a commit that referenced this pull request Oct 27, 2025
This PR adds a convenience feature for development. When switching
between `hipDNN` code and `fusilli-plugin` code it's nice to have one
location for sources.
@ghost ghost closed this by deleting the head repository Oct 28, 2025
qiangpan2 pushed a commit to qiangpan2/rocm-libraries that referenced this pull request May 29, 2026
)

* SWDEV-535598 - remove usage of 'warpSize' variable as it has been deprecated. Ideally get_warp_size() should not be constexpr but this is just a workaround

* SWDEV-535598 - remove comment from get_warp_size as constexpr is required for this repo

---------

Co-authored-by: Gerardo Hernandez <gerardo.hernandez@amd.com>

[ROCm/composable_kernel commit: 6635d1b]
This pull request was closed.

Labels

external contribution Code contribution from users community.. project: miopen
