Skip to content

cpu: ppc64: enable build without MMA#3959

Closed
sharkcz wants to merge 1 commit intouxlfoundation:mainfrom
sharkcz:ppc-no-mma
Closed

cpu: ppc64: enable build without MMA#3959
sharkcz wants to merge 1 commit intouxlfoundation:mainfrom
sharkcz:ppc-no-mma

Conversation

@sharkcz
Copy link

@sharkcz sharkcz commented Sep 17, 2025

Description

Currently some accelerated code requires the MMA engine to be enabled in the compiler, which limits the supported hardware to Power10+. Add a configure check and omit the problematic code when building for older CPUs.

Fixes #3749

Checklist

General

  • Do all unit and benchdnn tests (make test and make test_benchdnn_*) pass locally for each commit? - I believe it doesn't cause any regression.
  • [N/A] Have you formatted the code using clang-format? - No code added.

Performance improvements

  • [N/A] Have you submitted performance data that demonstrates performance improvements?

New features

  • [N/A] Have you published an RFC for the new feature?
  • [N/A] Was the RFC approved?
  • [N/A] Have you added relevant tests?

Bug fixes

  • Have you included information on how to reproduce the issue (either in a github issue or in this PR)?
  • [N/A] Have you added relevant regression tests?

RFC PR

  • [N/A] Does RFC document follow the template?
  • [N/A] Have you added a link to the rendered document?

Currently some accelerated code requires the MMA engine to be enabled in
the compiler, which limits the supported hardware to Power10+. Add a
configure check and omit the problematic code when building for older
CPUs.
@sharkcz sharkcz requested a review from a team as a code owner September 17, 2025 16:44
@sharkcz
Copy link
Author

sharkcz commented Sep 17, 2025

The change is based on a workaround from Adam for the Fedora package - https://src.fedoraproject.org/rpms/onednn/blob/rawhide/f/0001-ppc64-nerf-the-cpu-code.patch

DNNL_AARCH64_ONLY(CPU_REORDER_INSTANCE(aarch64::jit_uni_reorder_t))

#ifdef DNNL_PPC64_HAS_MMA
DNNL_PPC64_ONLY(CPU_REORDER_INSTANCE(ppc64::ppc64_matrixA_reorder_t))
Copy link
Contributor

@dzarukin dzarukin Sep 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if DNNL_PPC64_ONLY can be re-qualified to include DNNL_PPC64_HAS_MMA...
If not, a new DNNL_PPC64_MMA_ONLY might be a better option.

Edit: it seems a new version of the macro would be needed anyway that would be coupled with build time changes related to DNNL_PPC64_HAS_MMA.

${CMAKE_CURRENT_SOURCE_DIR}/gemm/*.[ch]pp
)
if(NOT UPPERCASE_CMAKE_BUILD_TYPE STREQUAL "DEBUG")
set_source_files_properties(${FILES_REQUIRED_OPT}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why these flags are now applied only for MMA systems?

add_subdirectory(ppc64)
if(DNNL_PPC64_HAS_MMA)
add_subdirectory(ppc64)
endif()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems excluding the full directory here is not quite correct.
The CMakefile that should get chnges is src/cpu/ppc64/CMakeLists.txt. It should include extra files if DNNL_PPC64_HAS_MMA is defined under assumption that PPC support can have not only intrinsic support, or not only the last version of intrinsic support (like v8 versus v10).

@vpirogov
Copy link
Contributor

As we are getting into the weeds here I opened #3968 reverting guilty implementation and -mcpu=power10 flag. This would restore PowerPC functionality and optimizations to v3.8 state.

@vpirogov
Copy link
Contributor

Closing in favor of #3968.

@vpirogov vpirogov closed this Sep 19, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[PPC64] '-mmma' requiring instructions not guarded by __MMA__

3 participants