Use `_tpause` instead of `__builtin_ia32_tpause` #27607
Pull request overview
This PR improves portability of the x86_64 spin-wait implementation by using the standardized _tpause intrinsic (from waitpkgintrin.h via existing headers) instead of directly invoking the compiler-specific built-in __builtin_ia32_tpause, which differs across compilers.
Changes:
- Switch Linux `tpause` usage from `__builtin_ia32_tpause` to `_tpause`.
- Consolidate the Windows/Linux `tpause` path under a single preprocessor branch.
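The change summarized above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the names `SpinPauseSketch`, `CpuHasWaitPkg`, `TimedPause`, and the 1000-cycle delay are hypothetical, and the sketch assumes a reasonably recent GCC or Clang that ships `waitpkgintrin.h` through `<immintrin.h>`. WAITPKG support is detected at run time (CPUID leaf 7, subleaf 0, ECX bit 5) so the `tpause` instruction is only executed on CPUs that report it.

```cpp
#include <cstdint>
#include <thread>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <cpuid.h>
#include <immintrin.h>
#include <x86intrin.h>
#define SKETCH_X86_64 1
#endif

namespace {

#if defined(SKETCH_X86_64)
// Hypothetical helper: WAITPKG is reported in CPUID leaf 7, subleaf 0, ECX bit 5.
bool CpuHasWaitPkg() {
  unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
  if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)) {
    return false;
  }
  return (ecx & (1u << 5)) != 0;
}

// The target attribute lets this one function use WAITPKG without
// compiling the whole translation unit with -mwaitpkg.
__attribute__((target("waitpkg")))
void TimedPause(std::uint64_t delay_cycles) {
  // Same call on GCC and Clang: the header hides the builtin's signature.
  // Control value 0x0 requests the C0.2 state; the second argument is a TSC deadline.
  _tpause(0x0, __rdtsc() + delay_cycles);
}
#endif

}  // namespace

// Hypothetical spin-wait hint with a single tpause path.
void SpinPauseSketch() {
#if defined(SKETCH_X86_64)
  static const bool has_waitpkg = CpuHasWaitPkg();
  constexpr std::uint64_t kSpinDelayCycles = 1000;  // assumed value
  if (has_waitpkg) {
    TimedPause(kSpinDelayCycles);
  } else {
    _mm_pause();
  }
#else
  std::this_thread::yield();  // non-x86_64 fallback for this sketch only
#endif
}
```

A spin loop would call `SpinPauseSketch()` once per iteration while polling a flag. Because `_tpause` has the same signature under every compiler that provides it, no per-compiler branch is needed around the call itself.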
Previously, I investigated an issue with the clang build. Here is some information I got:
ed7fa32 to fe170b5
Thanks for the advice! I have changed the section to use However, I'm not sure if the
Additional info
- generated code from GCC v14.2.0
- generated code from LLVM v21.1.8

Regarding CMake C flags, since
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.
fe170b5 to 7cabbc7
7cabbc7 to 2d87f10
Rebased to include #27618.
/azp run Linux QNN CI Pipeline, Win_TRT_Minimal_CUDA_Test_CI, Windows ARM64 QNN CI Pipeline,
Azure Pipelines successfully started running 4 pipeline(s).
### Description

Use the `_tpause` function defined in `waitpkgintrin.h` instead of calling the compiler built-in function (`__builtin_ia32_tpause`) directly.

### Motivation and Context

The [`_tpause`][intel-intrinsics-guide] intrinsic is independent of the compiler, whereas the built-in function `__builtin_ia32_tpause` that implements it varies by compiler. It is therefore advisable not to call the built-in directly. For example, the [GCC][waitpkgintrin-gcc] and [LLVM][waitpkgintrin-llvm] versions take different arguments, which leads to portability issues.

[intel-intrinsics-guide]: https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#text=tpause&techs=Other&ig_expand=6888
[waitpkgintrin-gcc]: https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/i386/waitpkgintrin.h;h=42c6b0cd02866eccdfe3308f4792f17fe8c6ae38;hb=HEAD#l51
[waitpkgintrin-llvm]: https://github.com/llvm/llvm-project/blob/a682073ae7a49de4b95498ba01b9ea32e6b5f607/clang/lib/Headers/waitpkgintrin.h#L33-L38
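
To make the argument difference concrete, here is a hedged sketch of what a caller has to write when it bypasses `_tpause` and uses the built-in directly. It reflects how the two linked `waitpkgintrin.h` headers forward their arguments: GCC passes the 64-bit deadline as one value, while Clang splits it into high and low 32-bit halves. The names `High32`, `Low32`, and `TpauseViaBuiltin` are hypothetical, and the sketch assumes GCC or Clang on x86_64.

```cpp
#include <cstdint>

// Hypothetical helpers that split a 64-bit TSC deadline into two halves.
constexpr std::uint32_t High32(std::uint64_t v) {
  return static_cast<std::uint32_t>(v >> 32);
}
constexpr std::uint32_t Low32(std::uint64_t v) {
  return static_cast<std::uint32_t>(v);
}

#if defined(__x86_64__) && (defined(__clang__) || defined(__GNUC__))
// Illustration only: the per-compiler branch that _tpause makes unnecessary.
__attribute__((target("waitpkg")))
inline unsigned char TpauseViaBuiltin(unsigned int ctrl, std::uint64_t deadline) {
#if defined(__clang__)
  // Clang's builtin takes the deadline as (high, low) 32-bit values.
  return __builtin_ia32_tpause(ctrl, High32(deadline), Low32(deadline));
#else
  // GCC's builtin takes the deadline as a single 64-bit value.
  return __builtin_ia32_tpause(ctrl, deadline);
#endif
}
#endif
```

Calling `_tpause(ctrl, deadline)` from the header instead keeps this branching inside the compiler's own headers, which is the portability argument made above.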
This cherry-picks the following commits for the release:

| Commit ID | PR Number | Commit Title |
|-----------|-----------|-------------|
| eb23be8 | #27354 | Update python_requires |
| d626b56 | #27479 | [QNN EP] Enable offline x64 compilation with memhandle IO type |
| 60ce0e6 | #27607 | Use `_tpause` instead of `__builtin_ia32_tpause` |
| 69feb84 | #27591 | Add PCI bus fallback for Linux GPU device discovery in containerized environments |
| de92668 | #27650 | Revert "[QNN EP] Fix error messages being logged as VERBOSE instead o… |
| 0f66526 | #27644 | [Plugin EP] Check for nullptr before dereferencing |
| 929f73e | #27666 | Plugin EP: Fix bug that incorrectly assigned duplicate MetDef IDs to fused nodes in different GraphViews |

---------

Co-authored-by: XXXXRT666 <157766680+XXXXRT666@users.noreply.github.com>
Co-authored-by: derdeljan-msft <derdeljan@microsoft.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Shogo Yamazaki <f9ifphmiz7i8akhowc8l5t1x9qp0lfu4@mocknen.net>
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: baijumeswani <12852605+baijumeswani@users.noreply.github.com>
Co-authored-by: edgchen1 <18449977+edgchen1@users.noreply.github.com>
Co-authored-by: Baiju Meswani <bmeswani@microsoft.com>
Co-authored-by: Artur Wojcik <artur.wojcik@amd.com>
Co-authored-by: Adrian Lizarraga <adlizarraga@microsoft.com>