cpu: ppc64: enable build without MMA #3959
Currently some accelerated code requires the MMA engine to be enabled in the compiler, which limits the supported hardware to Power10+. Add a configure check and omit the problematic code when building for older CPUs.
The change is based on a workaround from Adam for the Fedora package: https://src.fedoraproject.org/rpms/onednn/blob/rawhide/f/0001-ppc64-nerf-the-cpu-code.patch
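For illustration only, here is a minimal sketch of the kind of source a configure check could try to compile. It is not the probe used in this PR. It assumes that GCC and Clang predefine `__MMA__` when MMA code generation is enabled (for example with `-mcpu=power10`), and it uses the `__builtin_mma_xxsetaccz` builtin on a `__vector_quad` accumulator as a smoke test. The function names `compiled_with_mma` and `touch_accumulator` are hypothetical.

```cpp
// Sketch of an MMA availability probe. Assumption: the compiler predefines
// __MMA__ when it is allowed to emit Power10 MMA instructions.

// Reports at compile time whether MMA code generation is enabled.
constexpr bool compiled_with_mma() {
#if defined(__MMA__)
    return true;
#else
    return false;
#endif
}

#if defined(__MMA__)
// Only compiled when MMA is enabled: zero one 512-bit accumulator.
// If this fails to compile, the toolchain cannot build the MMA kernels.
void touch_accumulator() {
    __vector_quad acc;
    __builtin_mma_xxsetaccz(&acc);
}
#endif
```

A build system could compile such a file with the target flags and define `DNNL_PPC64_HAS_MMA` only when compilation succeeds and `compiled_with_mma()` is true.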
    DNNL_AARCH64_ONLY(CPU_REORDER_INSTANCE(aarch64::jit_uni_reorder_t))
    #ifdef DNNL_PPC64_HAS_MMA
    DNNL_PPC64_ONLY(CPU_REORDER_INSTANCE(ppc64::ppc64_matrixA_reorder_t))
I wonder if DNNL_PPC64_ONLY can be re-qualified to include DNNL_PPC64_HAS_MMA...
If not, a new DNNL_PPC64_MMA_ONLY might be a better option.
Edit: it seems a new version of the macro would be needed anyway that would be coupled with build time changes related to DNNL_PPC64_HAS_MMA.
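The macro suggested above could look roughly like the following sketch. `DNNL_PPC64_MMA_ONLY` does not exist in the code base; it is the hypothetical name from the comment. The definition of `DNNL_PPC64_ONLY` is reproduced here as an assumption about its shape, and `ENTRY` and `reorder_impl_names` are illustrative stand-ins for the real implementation list machinery.

```cpp
#include <string>
#include <vector>

// Assumption: DNNL_PPC64 and DNNL_PPC64_HAS_MMA are compile definitions
// provided by the build system.
#if defined(DNNL_PPC64)
#define DNNL_PPC64_ONLY(...) __VA_ARGS__
#else
#define DNNL_PPC64_ONLY(...)
#endif

// Hypothetical macro: keeps its arguments only on PPC64 builds with MMA.
#if defined(DNNL_PPC64) && defined(DNNL_PPC64_HAS_MMA)
#define DNNL_PPC64_MMA_ONLY(...) __VA_ARGS__
#else
#define DNNL_PPC64_MMA_ONLY(...)
#endif

// Stand-in for an implementation list entry (note the trailing comma).
#define ENTRY(name) std::string(name),

// The MMA-only entry vanishes on builds without MMA, so no #ifdef is needed
// at the use site; the reference entry is always present.
std::vector<std::string> reorder_impl_names() {
    return {
        DNNL_PPC64_MMA_ONLY(ENTRY("ppc64_matrixA_reorder"))
        ENTRY("reference_reorder")
    };
}
```

The point of the design is that the list stays free of preprocessor blocks: the condition lives in one macro definition instead of being repeated around every MMA-dependent entry.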
        ${CMAKE_CURRENT_SOURCE_DIR}/gemm/*.[ch]pp
        )
    if(NOT UPPERCASE_CMAKE_BUILD_TYPE STREQUAL "DEBUG")
        set_source_files_properties(${FILES_REQUIRED_OPT}
Why are these flags now applied only for MMA systems?
    - add_subdirectory(ppc64)
    + if(DNNL_PPC64_HAS_MMA)
    +     add_subdirectory(ppc64)
    + endif()
It seems excluding the full directory here is not quite correct.
The CMake file that should change is src/cpu/ppc64/CMakeLists.txt. It should include the extra files only if DNNL_PPC64_HAS_MMA is defined, on the assumption that PPC support may consist of more than intrinsic-based code, or of more than just the latest version of the intrinsics (like v8 versus v10).
As we are getting into the weeds here, I opened #3968, reverting the guilty implementation and
Closing in favor of #3968.
Description
Fixes #3749
Checklist
General
make test and make test_benchdnn_*) pass locally for each commit? - I believe it doesn't cause any regression.
Performance improvements
New features
Bug fixes
RFC PR