Optimize CRC32 calculation on arm64 #490
|
@zorrorffm Thank you very much for the contribution! Due to some infrastructure issues, I won't be able to act on this very quickly. That being said, I am very grateful for your PR, and I look forward to improving the ARM performance. |
|
@pwnall Thank you. If you need an ARM machine to run some tests, I would be pleased to help. |
|
@zorrorffm I ran tests on a Google Pixel C tablet, and the optimized version seems to be 4-10x faster. If this is reasonably representative hardware, I don't think I'll need another machine. I'll take you up on your offer if it turns out I really need a machine that can easily run a GNU/Linux system. |
|
@zorrorffm The LevelDB work hasn't happened yet, but I used your code in https://github.com/google/crc32c/blob/master/src/crc32c_arm64.cc -- thank you very much! |
|
@zorrorffm We deployed the version of this patch at https://github.com/google/crc32c/blob/master/src/crc32c_arm64.cc in Chrome, and we got crashes when trying to execute the [...]. I have a hunch that we should also be checking that the flags returned by [...] |
|
@zorrorffm On the issue above, can you please take a look and comment on google/crc32c#6? |
|
@pwnall Yes, your fix is correct. Thank you for your findings. |
|
Updated the patch to detect CPU capabilities. Acceleration is enabled only if the pmull and crc32c instructions are supported. |
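For readers following along, the capability check described above can be sketched roughly as below. This is a minimal sketch, not the code from the patch: it assumes Linux on aarch64, where `getauxval(AT_HWCAP)` reports the `HWCAP_CRC32` and `HWCAP_PMULL` bits, and it returns false on every other platform. The names `HasAllBits` and `CanAccelerateCrc32c` are hypothetical.

```cpp
#include <cstdint>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>   // getauxval, AT_HWCAP
#include <asm/hwcap.h>  // HWCAP_CRC32, HWCAP_PMULL
#endif

// Hypothetical helper: true only if every bit in `required` is set in `hwcap`.
constexpr bool HasAllBits(uint64_t hwcap, uint64_t required) {
  return (hwcap & required) == required;
}

// Hypothetical entry point: reports whether both the crc32c and pmull
// instructions are advertised by the kernel. Assumes Linux on aarch64;
// any other platform conservatively reports "not available".
bool CanAccelerateCrc32c() {
#if defined(__linux__) && defined(__aarch64__)
  const unsigned long hwcap = getauxval(AT_HWCAP);
  return HasAllBits(hwcap, HWCAP_CRC32 | HWCAP_PMULL);
#else
  return false;
#endif
}
```

A caller would typically evaluate `CanAccelerateCrc32c()` once at startup and fall back to the portable implementation when it returns false.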
|
@googlebot rescan |
|
Thx for the PR @zorrorffm. The optimized crc32c code was moved out into https://github.com/google/crc32c (which supports armv8-a+crc+crypto) in 5c39524. |
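As an illustration of what the crc32c instructions buy, here is a small sketch of a CRC32C routine that uses the ACLE intrinsics `__crc32cd` and `__crc32cb` when the compiler defines `__ARM_FEATURE_CRC32`, and a bit-at-a-time software loop otherwise. It is not the implementation from this PR or from google/crc32c: it leaves out the pmull-based parallel folding and the runtime capability check, and the function names are hypothetical. It also assumes a little-endian target on the accelerated path.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

#if defined(__ARM_FEATURE_CRC32)
#include <arm_acle.h>  // __crc32cb, __crc32cd
#endif

// Portable fallback: processes one byte, one bit at a time, using the
// reflected CRC32C (Castagnoli) polynomial 0x82F63B78.
inline uint32_t Crc32cByteSoftware(uint32_t crc, uint8_t byte) {
  crc ^= byte;
  for (int i = 0; i < 8; ++i) {
    crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
  }
  return crc;
}

// Hypothetical one-shot CRC32C over a buffer (initial value and final
// XOR are both 0xFFFFFFFF).
uint32_t Crc32c(const void* data, size_t size) {
  assert(data != nullptr || size == 0);
  const uint8_t* p = static_cast<const uint8_t*>(data);
  uint32_t crc = 0xFFFFFFFFu;
#if defined(__ARM_FEATURE_CRC32)
  // Accelerated path: 8 bytes per instruction (little-endian assumed).
  while (size >= 8) {
    uint64_t word;
    std::memcpy(&word, p, sizeof(word));  // safe unaligned load
    crc = __crc32cd(crc, word);
    p += 8;
    size -= 8;
  }
  while (size > 0) {
    crc = __crc32cb(crc, *p++);
    --size;
  }
#else
  while (size > 0) {
    crc = Crc32cByteSoftware(crc, *p++);
    --size;
  }
#endif
  return crc ^ 0xFFFFFFFFu;
}
```

Building with something like `-march=armv8-a+crc` enables the intrinsic path at compile time; a production build would also gate it behind a runtime check.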
ARM64 provides CRC32 instructions that accelerate CRC32 calculation.
This patch is an optimization for Linux on aarch64.
The performance comparison is as follows:
old
crc32c : 5.670 micros/op; 688.9 MB/s (4K per op)
new
crc32c : 0.451 micros/op; 8663.4.7 MB/s (4K per op)
Change-Id: I51d25ca19688fc95a57b84ee0c1493c18a288087
The name I signed the CLA with is [email protected]
The issue number is #478
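For anyone checking the benchmark figures in the description, the relationship between micros/op and MB/s can be sketched as below. This assumes that "MB/s" means 2^20 bytes per second and that "4K per op" means 4096 bytes, which is consistent with the old result (5.670 micros/op gives about 688.9 MB/s). The helper names are hypothetical. Under the same assumption, the rounded new timing of 0.451 micros/op works out to roughly 8,660 MB/s, a speedup of about 12.6x.

```cpp
#include <cassert>

// Hypothetical helper: converts a per-operation timing into throughput.
// Assumes 1 MB = 1048576 bytes.
double ThroughputMBps(double bytes_per_op, double micros_per_op) {
  assert(micros_per_op > 0.0);
  const double seconds_per_op = micros_per_op * 1e-6;
  return bytes_per_op / 1048576.0 / seconds_per_op;
}

// Hypothetical helper: ratio of old to new time per operation.
double Speedup(double old_micros_per_op, double new_micros_per_op) {
  assert(new_micros_per_op > 0.0);
  return old_micros_per_op / new_micros_per_op;
}
```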