Gate ARM64-accelerated impl on hwcap() having the HWCAP_PMULL flag. #6
I tested that the accelerated version is still used on my Pixel C, which does have the |
src/crc32c_arm64_linux_check.h
inline bool CanUseArm64Linux() {
#if defined(HAVE_STRONG_GETAUXVAL) || defined(HAVE_WEAK_GETAUXVAL)
  // From 'arch/arm64/include/uapi/asm/hwcap.h' in Linux kernel source code.
  constexpr unsigned long kHwCapPmull = 1 << 4;
Very minor nit, but I think the standard (maybe just Android) constant names are HWCAP_PMULL and HWCAP_CRC32. What about adopting a constant naming convention similar to those? Maybe kHWCAP_PMULL and kHWCAP_CRC32?
Done.
Thank you! This was bothering me, but I didn't know a good way out. I really like your suggestion!
The hardware-accelerated CRC32C implementation that takes advantage of
ARM64 instructions is currently runtime-gated on hwcap() returning a
value that has the HWCAP_CRC32 flag set. This covers the
__crc32c{b,h,w,d} intrinsics, but does not cover the vmull_p64 call. The
latter should be gated on the presence of the HWCAP_PMULL flag.
This is a speculative fix for Chrome crashes observed at the first
vmull_p64 callsite on MSM8916-based boards.
@pwnall Yes, your fix is correct. Crypto instructions, including pmull, are optional on ARMv8 cores, so we should detect CPU capabilities before using them. For this crc32c implementation, both the crc32 and pmull capabilities should be detected via the AT_HWCAP flags. I will also update the corresponding patch in leveldb. Thank you for your findings.
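For readers following the thread, the combined check discussed above can be sketched roughly as below. This is a minimal illustration, not the code merged in this PR: the helper names `HasCrc32AndPmull` and `CanUseAcceleratedCrc32c` are hypothetical, the constants follow the `kHWCAP_*` naming suggested in the review, and the bit positions are assumed to match `arch/arm64/include/uapi/asm/hwcap.h`.

```cpp
#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>  // getauxval(), AT_HWCAP
#endif

// Bit values assumed from arch/arm64/include/uapi/asm/hwcap.h (Linux kernel).
constexpr unsigned long kHWCAP_PMULL = 1UL << 4;
constexpr unsigned long kHWCAP_CRC32 = 1UL << 7;

// Hypothetical helper: true only when BOTH capability bits are set.
// Checking CRC32 alone would cover the __crc32c{b,h,w,d} intrinsics
// but not the vmull_p64 call.
inline bool HasCrc32AndPmull(unsigned long hwcap) {
  return (hwcap & (kHWCAP_CRC32 | kHWCAP_PMULL)) ==
         (kHWCAP_CRC32 | kHWCAP_PMULL);
}

// Hypothetical wrapper: queries the kernel-provided capability bits on
// ARM64 Linux and reports "not available" on every other platform.
inline bool CanUseAcceleratedCrc32c() {
#if defined(__linux__) && defined(__aarch64__)
  return HasCrc32AndPmull(getauxval(AT_HWCAP));
#else
  return false;
#endif
}
```

Keeping the bit test in a separate function that takes the raw hwcap value makes the logic easy to exercise on any host, while only the thin wrapper depends on `getauxval`.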
Issue axboe#1239 shows a crash on a FUJITSU/A64FX ARM platform at the following line:

crc/crc32c-arm64.c: 64 t1 = (uint64_t)vmull_p64(crc1, k2);

On armv8 PMULL crypto instructions like vmull_p64 are defined as optional (see google/crc32c#6 (comment) and dotnet/runtime#35143 (comment) ). Avoid the crash by gating use of the hardware accelerated ARM crc32c path behind runtime detection of PMULL.

Fixes: axboe#1239
Signed-off-by: Sitsofe Wheeler <[email protected]>
Tested-by: Yi Zhang <[email protected]>