Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix hardware optimization compilation - [MOD-6567] #438

Merged
merged 9 commits into from
Apr 7, 2024

Conversation

GuyAv46
Copy link
Collaborator

@GuyAv46 GuyAv46 commented Apr 3, 2024

Describe the changes in the pull request

We previously discovered that the way we add architecture optimization flags was wrong, and we "leaked" higher optimizations onto lower optimization level functions (e.g. an SSE function was compiled with the -mavx flag, and yielded a binary that an SSE only machine cannot use).

from GCC documentation:

GCC depresses SSEx instructions when -mavx is used. Instead, it generates new AVX instructions or AVX equivalence for all SSEx instructions when needed.

These options enable GCC to use these extended instructions in generated code, even without -mfpmath=sse. Applications that perform run-time CPU detection must compile separate files for each supported architecture, using the appropriate flags. In particular, the file containing the CPU detection code should be compiled without these options.

This PR fixes the ill-formed CMakeList.txt files and adds new .cpp files for choosing the optimization function, which will each be compiled with the right flags

Note that a regression is expected for the SSE functions, as they got "too optimized"

Mark if applicable

  • This PR introduces API changes
  • This PR introduces serialization changes

Copy link

codecov bot commented Apr 3, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 95.41%. Comparing base (0fca479) to head (764ffcc).

❗ Current head 764ffcc differs from pull request most recent head 02f3f07. Consider uploading reports for the commit 02f3f07 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #438      +/-   ##
==========================================
+ Coverage   95.39%   95.41%   +0.01%     
==========================================
  Files          66       69       +3     
  Lines        4148     4184      +36     
==========================================
+ Hits         3957     3992      +35     
- Misses        191      192       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Collaborator

@meiravgri meiravgri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great!
Consider to add this to the description:
https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html

GCC depresses SSEx instructions when -mavx is used. Instead, it generates new AVX instructions or AVX equivalence for all SSEx instructions when needed.

These options enable GCC to use these extended instructions in generated code, even without -mfpmath=sse. Applications that perform run-time CPU detection must compile separate files for each supported architecture, using the appropriate flags. In particular, the file containing the CPU detection code should be compiled without these options.

src/VecSim/spaces/CMakeLists.txt Show resolved Hide resolved
src/VecSim/spaces/CMakeLists.txt Show resolved Hide resolved
src/VecSim/spaces/IP/IP_SSE_FP32.h Show resolved Hide resolved
Copy link
Collaborator

@alonre24 alonre24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All good! just simplify a bit benchmarks where possible

@GuyAv46 GuyAv46 added this pull request to the merge queue Apr 7, 2024
@GuyAv46 GuyAv46 removed this pull request from the merge queue due to a manual request Apr 7, 2024
@GuyAv46 GuyAv46 added this pull request to the merge queue Apr 7, 2024
Merged via the queue into main with commit 9212165 Apr 7, 2024
15 checks passed
@GuyAv46 GuyAv46 deleted the guyav-fix_hardware_optimization_compilation branch April 7, 2024 14:09
Copy link

github-actions bot commented Apr 7, 2024

Backport failed for 0.6, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin 0.6
git worktree add -d .worktree/backport-438-to-0.6 origin/0.6
cd .worktree/backport-438-to-0.6
git switch --create backport-438-to-0.6
git cherry-pick -x 7e2adad94e7f24f04740bc6482d14abba07cae7c fd6b8478146602f6e4bd323c2133b567da72008f bf1f0240e88b4bc554790f1ecec2783317eef4f1 b1cce60c8585604dce9761832b94c8cc0ef8a227 ebab36dd0198d2bb494462779d8f764b992caa5a 89f8eb08e273807fa97897872d38eb008b55c5f7 b2b87e768e1da1926ca1bafd60a100e8049c4e9b 764ffccbd156e7ce6bb55a76fe8ae4541c7d5370 02f3f07298fe90312f072e131efabaa85b5d7ccb

Copy link

github-actions bot commented Apr 7, 2024

Successfully created backport PR for 0.7:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants