add specialization for 24MHz QueryPerformanceFrequency #3832

Merged
merged 9 commits into from
Jul 14, 2023
add specialization for 24MHz QueryPerformanceFrequency
Co-authored-by: Steven Noonan <steven@uplinklabs.net>
fsb4000 and tycho committed Jun 25, 2023

commit f7479ab4b1f55fd45ca6774e2f4d5ed7fb9e27d9
25 changes: 22 additions & 3 deletions stl/inc/__msvc_chrono.hpp
Contributor

Do we need to do the push_macro/undef/pop_macro magic incantation here for likely and unlikely?

Member

likely and unlikely are commonly defined as function-like macros, but not object-like macros. <xkeycheck.h> avoids rejecting them for that reason.

Unlike non-standard attribute names such as msvc, likely and unlikely are names that users are technically not supposed to define as macros, so we technically don't need to defend against that. We could, but I don't think it's necessary at the moment.
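
To make the exchange above concrete, here is a minimal sketch of both situations. It assumes hypothetical user code that defines likely and unlikely as function-like macros, and the function names are invented for illustration. The first function shows why such macros do not disturb the attribute; the second shows the push_macro/undef/pop_macro guard the question refers to. Nothing here is taken from the STL sources.

```cpp
// Hypothetical user macros (assumption for illustration only).
// The function-like form is the one the review comment says is common.
#define likely(x) (x)
#define unlikely(x) (x)

// In `[[likely]]` the name is not followed by '(', so a function-like
// macro called `likely` is not expanded and the attribute is left intact.
inline int fast_path_id(long long freq) noexcept {
    if (freq == 10'000'000) [[likely]] {
        return 1;
    }
    return 0;
}

// The guard asked about in the review: save the macro definition,
// remove it for the guarded region, then restore it.
#pragma push_macro("likely")
#undef likely
inline int guarded_fast_path_id(long long freq) noexcept {
    if (freq == 24'000'000) [[likely]] {
        return 2;
    }
    return 0;
}
#pragma pop_macro("likely")

// After pop_macro the user's macro is visible again.
inline int macro_still_works() noexcept {
    return likely(7);
}
```

The guard only becomes necessary if a user defines likely as an object-like macro, which is the case the reply argues the library does not need to defend against.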

@@ -666,18 +666,35 @@ namespace chrono {
         using time_point = _CHRONO time_point<steady_clock>;
         static constexpr bool is_steady = true;

+#if defined(_M_ARM) || defined(_M_ARM64)
+#define _LIKELY_ARM likely
+#define _LIKELY_X86 unlikely
+#elif defined(_M_IX86) || defined(_M_X64)
+#define _LIKELY_ARM unlikely
+#define _LIKELY_X86 likely
+#else
+#error Unknown architecture
+#endif
         _NODISCARD static time_point now() noexcept { // get current time
             const long long _Freq = _Query_perf_frequency(); // doesn't change after system boot
             const long long _Ctr = _Query_perf_counter();
             static_assert(period::num == 1, "This assumes period::num == 1.");
-            // 10 MHz is a very common QPC frequency on modern PCs. Optimizing for
+            // 10 MHz is a very common QPC frequency on modern X86 PCs. Optimizing for
             // this specific frequency can double the performance of this function by
             // avoiding the expensive frequency conversion path.
-            constexpr long long _TenMHz = 10'000'000;
-            if (_Freq == _TenMHz) {
+            constexpr long long _TwentyFourMHz = 24'000'000;
+            constexpr long long _TenMHz = 10'000'000;
+            if (_Freq == _TenMHz) [[_LIKELY_X86]] {
                 static_assert(period::den % _TenMHz == 0, "It should never fail.");
                 constexpr long long _Multiplier = period::den / _TenMHz;
                 return time_point(duration(_Ctr * _Multiplier));
+            } else if (_Freq == _TwentyFourMHz) [[_LIKELY_ARM]] {
+                // The compiler recognizes the constants for frequency and time period and uses shifts and multiplies
+                // instead of divides to calculate the nanosecond value. This frequency is common on ARM64 (Windows
+                // devices, and Apple Silicon Macs using Parallels Desktop)
+                const long long _Whole = (_Ctr / _TwentyFourMHz) * period::den;
+                const long long _Part = (_Ctr % _TwentyFourMHz) * period::den / _TwentyFourMHz;
+                return time_point(duration(_Whole + _Part));
             } else {
                 // Instead of just having "(_Ctr * period::den) / _Freq",
                 // the algorithm below prevents overflow when _Ctr is sufficiently large.
@@ -690,6 +707,8 @@ namespace chrono {
             }
         }
     };
+#undef _LIKELY_ARM
+#undef _LIKELY_X86

     _EXPORT_STD using high_resolution_clock = steady_clock;
 } // namespace chrono
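
The arithmetic in the two fast paths can be tried outside the STL. The following is a standalone sketch, not the library code: the function and constant names are hypothetical, and it assumes a nanosecond period (period::den == 1'000'000'000), which is consistent with the static_assert in the 10 MHz branch. It shows that 10 MHz reduces to a single multiply, and that the 24 MHz path splits the counter into whole seconds and a remainder so the intermediate product stays within long long.

```cpp
// Assumption: the clock period is nanoseconds, i.e. period::den == 1e9.
constexpr long long kNanosPerSecond = 1'000'000'000;
constexpr long long kTenMHz         = 10'000'000;
constexpr long long kTwentyFourMHz  = 24'000'000;

// 10 MHz: 1e9 is an exact multiple of 1e7, so one multiply by 100 suffices.
constexpr long long ticks_10mhz_to_ns(long long ctr) noexcept {
    return ctr * (kNanosPerSecond / kTenMHz);
}

// 24 MHz: 1e9 is not a multiple of 24e6, so convert whole seconds and the
// sub-second remainder separately. The remainder is below 24e6, so
// remainder * 1e9 stays below 2.4e16 and cannot overflow a 64-bit integer.
constexpr long long ticks_24mhz_to_ns(long long ctr) noexcept {
    const long long whole = (ctr / kTwentyFourMHz) * kNanosPerSecond;
    const long long part  = (ctr % kTwentyFourMHz) * kNanosPerSecond / kTwentyFourMHz;
    return whole + part;
}
```

For comparison, the naive form (ctr * 1'000'000'000) / 24'000'000 overflows a signed 64-bit integer once ctr exceeds roughly 9.2e9 ticks, which at 24 MHz is about 384 seconds after the counter starts. Because both divisors are compile-time constants, a compiler is free to replace the divisions with multiplies and shifts, which is the effect the comment in the diff describes.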