Intel/ARM native SHA256 #157
Comments
Interesting!
I think this is Linux code to do the ARM NEON SHA-256:
Currently I can't link, and I'm not sure why, because the files are all arm64:
@risner, I believe both your posts are off-topic here. This thread is about SHA-2 specific instructions, not about NEON or Aarch64 in general. As for your second post, feel free to open a new ticket; if you do, make sure you provide all the steps you followed and include or link to your
Apologies. I may be confused, but the SHA-2 specific instructions are available only on A13+ and M1 ARM chips. From what I can tell, no other ARM maker is using these built-in instructions. And they seem to be exposed as "NEON" calls? I'll make another post for the off-topic issue with compiling on M1.
@risner, I agree it can be confusing. My point is that NEON can be used to improve SHA performance even without the newer SHA instructions (as cpuminer already does on 32-bit ARM, and could in theory do on Aarch64). From a cursory look, I believe that this is what the link you posted is about.
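Since the discussion above turns on which chips actually have the SHA-2 instructions, a run-time check may help. The following is only a sketch, not cpuminer code: `has_native_sha256` is a hypothetical helper name, the x86 path assumes GCC or Clang (`<cpuid.h>`), and the Aarch64 path assumes Linux (`getauxval` with `HWCAP_SHA2`). Other platforms, including macOS on M1, are not covered and report "unknown".

```c
#include <stddef.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>              /* GCC/Clang helper for the CPUID instruction */
#elif defined(__aarch64__) && defined(__linux__)
#include <sys/auxv.h>           /* getauxval */
#include <asm/hwcap.h>          /* HWCAP_SHA2 */
#endif

/* Hypothetical helper: returns 1 if the CPU advertises SHA-256 instructions,
 * 0 if it does not, and -1 if this sketch cannot tell on the platform. */
int has_native_sha256(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;
    /* CPUID leaf 7, subleaf 0: EBX bit 29 is the Intel SHA extensions flag. */
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return 0;               /* leaf 7 not supported, so no SHA extensions */
    return (int)((ebx >> 29) & 1u);
#elif defined(__aarch64__) && defined(__linux__)
    /* The kernel exposes the SHA-2 crypto extension as a hwcap bit. */
    return (getauxval(AT_HWCAP) & HWCAP_SHA2) != 0;
#else
    return -1;                  /* not handled by this sketch */
#endif
}
```

A miner could call this once at startup and pick between a native-SHA code path and the existing NEON/SSE/AVX paths based on the result.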
Thanks. I also found this: It shows a 100x improvement (6 MB/s to 615 MB/s) going from unaccelerated code to the 64-bit ARM SHA instructions (sha256h, sha256h2, sha256su0 and sha256su1). Here is the 64-bit code they implemented:
My experience with SHA on Ryzen and Icelake is that 8-way parallel hashing is faster. I haven't tested vs 4-way because AVX2 I also found that SHA prevents some of the more innovative SW optimizations specific to sha256d. ARM SHA could indeed be faster than NEON 4-way.
So there is nothing to gain from that optimization?
New CPUs from Intel and ARMv8 cores support native SHA-256 hashing through dedicated instructions (the Intel SHA extensions and the ARMv8 Cryptography Extensions).
Implementing support for these instructions could significantly increase the SHA256d hashrate.
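For context, SHA256d is SHA-256 applied twice: the 32-byte digest of the input is hashed again. Below is a minimal portable sketch in plain C with no hardware acceleration, meant only to show which computation the native instructions would speed up. The function names `sha256`, `sha256d` and `to_hex` are illustrative and are not cpuminer's API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* SHA-256 round constants (FIPS 180-4). */
static const uint32_t K[64] = {
    0x428a2f98,0x71374491,0xb5c0fbcf,0xe9b5dba5,0x3956c25b,0x59f111f1,0x923f82a4,0xab1c5ed5,
    0xd807aa98,0x12835b01,0x243185be,0x550c7dc3,0x72be5d74,0x80deb1fe,0x9bdc06a7,0xc19bf174,
    0xe49b69c1,0xefbe4786,0x0fc19dc6,0x240ca1cc,0x2de92c6f,0x4a7484aa,0x5cb0a9dc,0x76f988da,
    0x983e5152,0xa831c66d,0xb00327c8,0xbf597fc7,0xc6e00bf3,0xd5a79147,0x06ca6351,0x14292967,
    0x27b70a85,0x2e1b2138,0x4d2c6dfc,0x53380d13,0x650a7354,0x766a0abb,0x81c2c92e,0x92722c85,
    0xa2bfe8a1,0xa81a664b,0xc24b8b70,0xc76c51a3,0xd192e819,0xd6990624,0xf40e3585,0x106aa070,
    0x19a4c116,0x1e376c08,0x2748774c,0x34b0bcb5,0x391c0cb3,0x4ed8aa4a,0x5b9cca4f,0x682e6ff3,
    0x748f82ee,0x78a5636f,0x84c87814,0x8cc70208,0x90befffa,0xa4506ceb,0xbef9a3f7,0xc67178f2
};

static uint32_t rotr(uint32_t x, int n) { return (x >> n) | (x << (32 - n)); }

/* Compress one 64-byte block into the state s. This is the part that
 * sha256h/sha256h2/sha256su0/sha256su1 or the Intel SHA extensions replace. */
static void sha256_block(uint32_t s[8], const uint8_t *p)
{
    uint32_t w[64], v[8];
    for (int i = 0; i < 16; i++)            /* load big-endian message words */
        w[i] = (uint32_t)p[4*i] << 24 | (uint32_t)p[4*i+1] << 16 |
               (uint32_t)p[4*i+2] << 8 | (uint32_t)p[4*i+3];
    for (int i = 16; i < 64; i++) {         /* message schedule */
        uint32_t s0 = rotr(w[i-15], 7) ^ rotr(w[i-15], 18) ^ (w[i-15] >> 3);
        uint32_t s1 = rotr(w[i-2], 17) ^ rotr(w[i-2], 19) ^ (w[i-2] >> 10);
        w[i] = w[i-16] + s0 + w[i-7] + s1;
    }
    memcpy(v, s, sizeof v);
    for (int i = 0; i < 64; i++) {          /* 64 rounds; v = {a,b,c,d,e,f,g,h} */
        uint32_t S1 = rotr(v[4], 6) ^ rotr(v[4], 11) ^ rotr(v[4], 25);
        uint32_t ch = (v[4] & v[5]) ^ (~v[4] & v[6]);
        uint32_t t1 = v[7] + S1 + ch + K[i] + w[i];
        uint32_t S0 = rotr(v[0], 2) ^ rotr(v[0], 13) ^ rotr(v[0], 22);
        uint32_t maj = (v[0] & v[1]) ^ (v[0] & v[2]) ^ (v[1] & v[2]);
        memmove(v + 1, v, 7 * sizeof v[0]); /* shift: b=a, c=b, ..., h=g */
        v[4] += t1;                         /* e = old d + t1 */
        v[0] = t1 + S0 + maj;               /* a = t1 + t2 */
    }
    for (int i = 0; i < 8; i++) s[i] += v[i];
}

/* One-shot SHA-256 of msg[0..len). */
void sha256(const uint8_t *msg, size_t len, uint8_t out[32])
{
    uint32_t s[8] = { 0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a,
                      0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19 };
    uint8_t tail[128] = {0};
    size_t full = len / 64, rem = len % 64;
    size_t tail_len = (rem < 56) ? 64 : 128; /* room for 0x80 + 8-byte length */
    uint64_t bits = (uint64_t)len * 8;

    for (size_t i = 0; i < full; i++) sha256_block(s, msg + 64 * i);
    if (rem) memcpy(tail, msg + 64 * full, rem);
    tail[rem] = 0x80;                        /* padding marker */
    for (int i = 0; i < 8; i++)              /* big-endian bit length at the end */
        tail[tail_len - 1 - i] = (uint8_t)(bits >> (8 * i));
    sha256_block(s, tail);
    if (tail_len == 128) sha256_block(s, tail + 64);
    for (int i = 0; i < 8; i++) {
        out[4*i]   = (uint8_t)(s[i] >> 24); out[4*i+1] = (uint8_t)(s[i] >> 16);
        out[4*i+2] = (uint8_t)(s[i] >> 8);  out[4*i+3] = (uint8_t)s[i];
    }
}

/* SHA256d: hash the input, then hash the resulting 32-byte digest. */
void sha256d(const uint8_t *msg, size_t len, uint8_t out[32])
{
    uint8_t first[32];
    assert(out != NULL);
    sha256(msg, len, first);
    sha256(first, sizeof first, out);
}

/* Format a digest as 64 lowercase hex characters plus a terminating NUL. */
void to_hex(const uint8_t d[32], char hex[65])
{
    for (int i = 0; i < 32; i++) {
        hex[2*i]   = "0123456789abcdef"[d[i] >> 4];
        hex[2*i+1] = "0123456789abcdef"[d[i] & 15];
    }
    hex[64] = '\0';
}
```

Calling `sha256d(data, len, digest)` and then `to_hex(digest, hex)` gives the double hash as a hex string. In a miner, the per-block compression in `sha256_block` is the hot loop, so that is where a native-instruction path would be swapped in.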