conv:tf32:add all CK instances #2725
Pull Request Overview
This pull request adds comprehensive TF32 (TensorFloat-32) support to MIOpen's Composable Kernel (CK) instances for convolution operations on gfx942. The implementation enables TF32 compute type across all convolution directions (forward, backward, and weight gradient) for both 2D and 3D grouped operations.
Key Changes
- Added TF32 template specializations to all group convolution solvers (Fwd/Bwd/Wrw for 2D and 3D)
- Implemented TF32 compute type parameter throughout the CK device operation templates
- Added test coverage for TF32 operations across all affected solvers
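
As a rough illustration of what a TF32 compute-type template parameter changes numerically, the sketch below emulates TF32 on the host. TF32 keeps the 8-bit FP32 exponent but only 10 explicit mantissa bits, so the operands of each multiply are rounded while accumulation stays in FP32. The names `Tf32Tag`, `RoundToTf32`, `Operand`, and `Dot` are hypothetical and are not identifiers from MIOpen or CK; the actual solvers dispatch to CK device kernels instead.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical tag standing in for a TF32 compute type (not a CK identifier).
struct Tf32Tag
{
};

// Emulate TF32 precision on the host: keep 10 explicit mantissa bits of an
// FP32 value, rounding the 13 dropped bits to nearest, ties to even.
inline float RoundToTf32(float x)
{
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof(bits));
    if((bits & 0x7F800000u) == 0x7F800000u)
        return x; // Inf and NaN pass through unchanged.
    const std::uint32_t lsb = (bits >> 13) & 1u;
    bits += 0x0FFFu + lsb; // round half to even on the dropped bits
    bits &= ~0x1FFFu;      // clear the 13 low mantissa bits
    float out;
    std::memcpy(&out, &bits, sizeof(out));
    return out;
}

// Primary template: operands are used at full FP32 precision.
template <typename ComputeType>
struct Operand
{
    static float Load(float x) { return x; }
};

// Specialization selected by the compute-type template parameter.
template <>
struct Operand<Tf32Tag>
{
    static float Load(float x) { return RoundToTf32(x); }
};

// Dot product with FP32 accumulation; only the multiply inputs change.
template <typename ComputeType>
float Dot(const float* a, const float* b, int n)
{
    assert(n >= 0);
    float acc = 0.0f;
    for(int i = 0; i < n; ++i)
        acc += Operand<ComputeType>::Load(a[i]) * Operand<ComputeType>::Load(b[i]);
    return acc;
}
```

Calling `Dot<float>(a, b, n)` and `Dot<Tf32Tag>(a, b, n)` on the same data shows the precision lost in the TF32 path, which is why the verification tolerance has to depend on the compute type.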
Reviewed Changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 11 comments.
| File | Description |
|---|---|
| projects/miopen/test/gtest/unit_conv_solver_ConvHipImplicitGemmGroupWrwXdlops.cpp | New test file for 2D grouped weight gradient convolution with TF32 support |
| projects/miopen/test/gtest/unit_conv_solver_ConvHipImplicitGemmGroupFwdXdlops.cpp | New test file for 2D grouped forward convolution with TF32 support |
| projects/miopen/test/gtest/unit_conv_solver_ConvHipImplicitGemmGroupBwdXdlops.cpp | New test file for 2D grouped backward convolution with TF32 support |
| projects/miopen/test/gtest/unit_conv_solver_ConvHipImplicitGemm3DGroupWrwXdlops.cpp | New test file for 3D grouped weight gradient convolution with TF32 support |
| projects/miopen/test/gtest/unit_conv_solver_ConvHipImplicitGemm3DGroupBwdXdlops.cpp | New test file for 3D grouped backward convolution with TF32 support |
| projects/miopen/test/gtest/unit_conv_solver.hpp | Added type aliases for TF32 backward and weight gradient test fixtures |
| projects/miopen/test/gtest/unit_conv_solver.cpp | Updated verification calls to pass TF32 flag for tolerance adjustment |
| projects/miopen/src/solver/conv/conv_hip_implicit_gemm_grouped_wrw_xdlops.cpp | Extended 2D grouped WRW solver with TF32 compute type template parameter and conditional logic |
| projects/miopen/src/solver/conv/conv_hip_implicit_gemm_grouped_fwd_xdlops.cpp | Extended 2D grouped forward solver with TF32 compute type template parameter and conditional logic |
| projects/miopen/src/solver/conv/conv_hip_implicit_gemm_grouped_bwd_xdlops.cpp | Extended 2D grouped backward solver with TF32 compute type template parameter and conditional logic |
| projects/miopen/src/solver/conv/conv_hip_implicit_gemm_3d_grouped_wrw_xdlops.cpp | Extended 3D grouped WRW solver with TF32 support including alpha/beta handling for bilinear/scale operations |
| projects/miopen/src/solver/conv/conv_hip_implicit_gemm_3d_grouped_fwd_xdlops.cpp | Extended 3D grouped forward solver with TF32 support for bilinear/scale element-wise operations |
| projects/miopen/src/solver/conv/conv_hip_implicit_gemm_3d_grouped_bwd_xdlops.cpp | Extended 3D grouped backward solver with TF32 support for bilinear/scale element-wise operations |
| projects/miopen/src/ocl/convolutionocl.cpp | Added SetupComputeType calls for all convolution directions to enable TF32 detection |
| projects/miopen/src/include/miopen/solver/implicitgemm_ck_util.hpp | Updated device operation templates and factory definitions to support TF32 compute type parameter |
| projects/miopen/src/include/miopen/conv/solvers.hpp | Added UseTF32() accessor methods and mutable use_tf32 flags to performance config structures |
| projects/miopen/driver/conv_driver.hpp | Modified tolerance calculation logic for TF32 math type handling |
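
The tolerance adjustment mentioned for `unit_conv_solver.cpp` and `conv_driver.hpp` can be pictured with a small sketch. It assumes the check compares a relative RMS error against a threshold that scales with the precision of the compute type: 2^-23 for FP32 and 2^-10 for TF32. The function names and the `scale` factor are hypothetical and chosen only for illustration; they are not the values or helpers used by MIOpen.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Spacing of representable values just above 1.0: 2^-10 for TF32 operands,
// 2^-23 for FP32 operands.
inline double ComputeEpsilon(bool use_tf32) { return std::ldexp(1.0, use_tf32 ? -10 : -23); }

// Hypothetical threshold: `scale` is an illustrative safety factor,
// not a value taken from MIOpen.
inline double Tolerance(bool use_tf32, double scale = 4.0)
{
    return scale * ComputeEpsilon(use_tf32);
}

// Relative RMS error of `out` against the reference `ref`.
inline double RelativeRmsError(const float* ref, const float* out, std::size_t n)
{
    assert(n > 0);
    double err = 0.0;
    double mag = 0.0;
    for(std::size_t i = 0; i < n; ++i)
    {
        const double d = static_cast<double>(ref[i]) - static_cast<double>(out[i]);
        err += d * d;
        mag += static_cast<double>(ref[i]) * static_cast<double>(ref[i]);
    }
    // Fall back to the absolute error when the reference is all zeros.
    return mag > 0.0 ? std::sqrt(err / mag) : std::sqrt(err);
}

// The same output can pass with the TF32 flag set and fail without it.
inline bool Verify(const float* ref, const float* out, std::size_t n, bool use_tf32)
{
    return RelativeRmsError(ref, out, n) <= Tolerance(use_tf32);
}
```

Passing the flag explicitly, as the test changes in this PR do, keeps the FP32 path strict while letting TF32 results through with the looser bound.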
Copilot reviewed 18 out of 18 changed files in this pull request and generated 7 comments.
We need to consider the impact the TF32 changes have on our system DB as well as on the heuristics. If this has changed the key that is generated for shapes, it could have invalidated both, and they would need to be regenerated.

I believe our changes to the CK solvers are done now, so if you can please resolve the conflicts, I can review, approve, and get this merged.

Sure. @BradPepersAMD I resolved the merge conflicts. We can wait for CI to finish.
* Enable xdl in gfx11 & gfx12
* update cmake file
* fix all instance build (cmake)
* fix batched_gemm_gemm(cmake)
* rebase cmake files
* fix cmake build error
* remve CK_ENABLE_DYNAMIC_WARP_SIZE
* update cmake build error2
* fix gfx11 build CK_USE_XDL is enabled on gfx11 and gfx12
* fix gfx10 build
* fix gfx11 error

---------

Co-authored-by: Lin, Qun <Quentin.Lin+amdeng@amd.com>
[ROCm/composable_kernel commit: f22740d]
Motivation
All CK instances have been added for gfx942, and the earlier proof-of-concept MIOpen PR (#1414) has been merged. This PR enables all CK instances in the MIOpen algorithms, including forward, backward, wrw, scale, linear, etc.
Because CK has been bumped to the latest version, gfx950 is also enabled.
Technical Details
Test Plan
Added several solver unit tests.
Test Result
All tests pass.
The tests also pass on gfx950.

No undefined symbols were found on gfx950.

Submission Checklist