Intel/ARM native SHA256 #157

Open
ghost opened this issue Jan 5, 2018 · 9 comments

Comments

@ghost

ghost commented Jan 5, 2018

Newer Intel CPUs and ARMv8 cores support native SHA-256 hashing through dedicated instructions.
Implementing this could significantly increase the SHA256d hashrate.
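
Whether a given CPU exposes these instructions can be checked at run time. The following is a minimal sketch, not code from cpuminer: `cpu_has_sha256` is a hypothetical helper name, and it assumes GCC or Clang on x86 (CPUID leaf 7, EBX bit 29) or Linux on AArch64 (`HWCAP_SHA2` in the auxiliary vector). Any other platform is conservatively reported as unsupported.

```c
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#include <cpuid.h>
#define DETECT_X86 1
#elif defined(__aarch64__) && defined(__linux__)
#include <sys/auxv.h>
#include <asm/hwcap.h>
#define DETECT_ARM64_LINUX 1
#endif

/* Hypothetical helper: returns 1 if the CPU reports SHA-256
 * instructions, 0 if it does not or if we cannot tell. */
int cpu_has_sha256(void)
{
#if defined(DETECT_X86)
    unsigned int eax, ebx, ecx, edx;
    /* CPUID leaf 7, subleaf 0: EBX bit 29 is the SHA extensions flag. */
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return 0;
    return (int)((ebx >> 29) & 1u);
#elif defined(DETECT_ARM64_LINUX)
    /* The kernel advertises sha256h/sha256h2/sha256su0/sha256su1
     * support through the HWCAP_SHA2 bit. */
    return (getauxval(AT_HWCAP) & HWCAP_SHA2) != 0;
#else
    /* Assumption: unknown platform, report "not available". */
    return 0;
#endif
}
```

A miner could call such a check once at startup and pick a hashing routine accordingly, falling back to the existing code path when it returns 0.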

@yu-chenxi

interesting!

@risner

risner commented Apr 29, 2021

I think this is Linux code that does SHA-256 with ARM NEON:
https://patchwork.kernel.org/project/linux-arm-kernel/patch/[email protected]/

@risner

risner commented Apr 29, 2021

Currently I can't link:
clang -fno-strict-aliasing -Ofast -arch arm64 -mfpu=neon -pthread -o minerd minerd-cpu-miner.o minerd-util.o minerd-sha2.o minerd-scrypt.o /opt/homebrew/Cellar/curl/7.76.1/lib/libcurl.dylib compat/jansson/libjansson.a -lpthread
Undefined symbols for architecture arm64:
"_sha256_init_4way", referenced from:
_scanhash_scrypt in minerd-scrypt.o
"_sha256_transform_4way", referenced from:
_scanhash_scrypt in minerd-scrypt.o
"_sha256_use_4way", referenced from:
_scanhash_sha256d in minerd-sha2.o
_scanhash_scrypt in minerd-scrypt.o
"_sha256d_ms_4way", referenced from:
_scanhash_sha256d in minerd-sha2.o
ld: symbol(s) not found for architecture arm64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[2]: *** [minerd] Error 1
make[1]: *** [all-recursive] Error 1
make: *** [all] Error 2

I'm not sure why, because the files are all arm64:
risner@M1 cpuminer-master % file *.o
minerd-cpu-miner.o: Mach-O 64-bit object arm64
minerd-scrypt.o: Mach-O 64-bit object arm64
minerd-sha2.o: Mach-O 64-bit object arm64
minerd-util.o: Mach-O 64-bit object arm64

@pooler
Owner

pooler commented Apr 29, 2021

@risner, I believe both your posts are off-topic here. This thread is about SHA-2 specific instructions, not about NEON or AArch64 in general. As for your second post, feel free to open a new ticket; if you do, make sure you provide all the steps you followed and include or link to your config.log.

@risner

risner commented Apr 29, 2021

Apologies. I may be confused, but as far as I can tell the SHA-2 specific instructions are available only on A13+ and M1 ARM chips, and no other ARM maker is using these built-in instructions. They also seem to be exposed through "neon" calls?

I'll make another post for the off-topic issue with compiling on M1.

@pooler
Owner

pooler commented Apr 29, 2021

@risner, I agree it can be confusing. My point is that NEON can be used to improve SHA performance even without the newer SHA instructions (as cpuminer already does on 32-bit ARM, and could in theory do on AArch64). From a cursory look, I believe that this is what the link you posted is about.

@risner

risner commented Apr 30, 2021

Thanks. I also found this:
https://www.google.com/amp/s/blog.min.io/accelerating-sha256-by-100x-in-golang-on-arm/amp/

It shows a roughly 100x improvement (6 MB/s to 615 MB/s) going from unaccelerated code to the 64-bit ARM SHA instructions (sha256h, sha256h2, sha256su0 and sha256su1).

Here is the 64-bit code they implemented:
https://github.com/minio/sha256-simd/blob/6de4475307716de15b286880ff321c9547086fdd/sha256block_arm64.s
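
To give a feel for what two of those instructions do, here is a small sketch of the SHA-256 message-schedule step. It is not taken from the linked assembly or from cpuminer. `sha256_next4` is a hypothetical helper that takes the 16 schedule words W[0..15] and produces W[16..19]. When the compiler advertises the ARM SHA-2 extension (assumed here to be signalled by `__ARM_FEATURE_SHA2` or `__ARM_FEATURE_CRYPTO`), it uses the `vsha256su0q_u32` and `vsha256su1q_u32` intrinsics from `arm_neon.h`; otherwise it falls back to the scalar recurrence, so the same code builds on other CPUs.

```c
#include <stdint.h>

#if defined(__aarch64__) && \
    (defined(__ARM_FEATURE_SHA2) || defined(__ARM_FEATURE_CRYPTO))
#include <arm_neon.h>
#define HAVE_ARM_SHA2 1
#else
#define HAVE_ARM_SHA2 0
#endif

static inline uint32_t rotr32(uint32_t x, unsigned n)
{
    return (x >> n) | (x << (32 - n));
}

/* Small sigma functions from the SHA-256 message schedule. */
static inline uint32_t sigma0(uint32_t x)
{
    return rotr32(x, 7) ^ rotr32(x, 18) ^ (x >> 3);
}

static inline uint32_t sigma1(uint32_t x)
{
    return rotr32(x, 17) ^ rotr32(x, 19) ^ (x >> 10);
}

/* Hypothetical helper: given W[0..15], write W[16..19] to out. */
static inline void sha256_next4(const uint32_t w[16], uint32_t out[4])
{
#if HAVE_ARM_SHA2
    uint32x4_t w0 = vld1q_u32(w);       /* W[0..3]   */
    uint32x4_t w1 = vld1q_u32(w + 4);   /* W[4..7]   */
    uint32x4_t w2 = vld1q_u32(w + 8);   /* W[8..11]  */
    uint32x4_t w3 = vld1q_u32(w + 12);  /* W[12..15] */
    /* su0 adds the sigma0 terms, su1 adds W[t-7] and the sigma1 terms. */
    vst1q_u32(out, vsha256su1q_u32(vsha256su0q_u32(w0, w1), w2, w3));
#else
    uint32_t t[20];
    int i;
    for (i = 0; i < 16; i++)
        t[i] = w[i];
    /* W[t] = sigma1(W[t-2]) + W[t-7] + sigma0(W[t-15]) + W[t-16] */
    for (i = 16; i < 20; i++)
        t[i] = sigma1(t[i - 2]) + t[i - 7] + sigma0(t[i - 15]) + t[i - 16];
    for (i = 0; i < 4; i++)
        out[i] = t[16 + i];
#endif
}
```

Calling the helper twelve times, sliding the 16-word window forward by four words each time, would produce the remaining 48 schedule words; the round function itself is what sha256h and sha256h2 accelerate.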

@JayDDee

JayDDee commented Oct 19, 2021

My experience with SHA on Ryzen and Ice Lake is that 8-way parallel hashing is faster than the SHA instructions. I haven't tested against 4-way because AVX2 is available on most CPUs with SHA, but extrapolating the 8-way test results suggests 4-way is significantly slower than SHA.

I also found that the SHA instructions prevent some of the more innovative software optimizations specific to sha256d.

ARM SHA could indeed be faster than NEON 4-way.
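
For reference, sha256d simply means SHA-256 applied twice. The sketch below is a plain, unoptimized scalar version written for illustration only; it is not cpuminer's implementation and the function names are assumptions. It does show the structure that sha256d-specific optimizations can exploit: the second hash always runs over exactly 32 bytes, so its padding and most of its input block are fixed.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

static const uint32_t K[64] = {
    0x428a2f98,0x71374491,0xb5c0fbcf,0xe9b5dba5,0x3956c25b,0x59f111f1,0x923f82a4,0xab1c5ed5,
    0xd807aa98,0x12835b01,0x243185be,0x550c7dc3,0x72be5d74,0x80deb1fe,0x9bdc06a7,0xc19bf174,
    0xe49b69c1,0xefbe4786,0x0fc19dc6,0x240ca1cc,0x2de92c6f,0x4a7484aa,0x5cb0a9dc,0x76f988da,
    0x983e5152,0xa831c66d,0xb00327c8,0xbf597fc7,0xc6e00bf3,0xd5a79147,0x06ca6351,0x14292967,
    0x27b70a85,0x2e1b2138,0x4d2c6dfc,0x53380d13,0x650a7354,0x766a0abb,0x81c2c92e,0x92722c85,
    0xa2bfe8a1,0xa81a664b,0xc24b8b70,0xc76c51a3,0xd192e819,0xd6990624,0xf40e3585,0x106aa070,
    0x19a4c116,0x1e376c08,0x2748774c,0x34b0bcb5,0x391c0cb3,0x4ed8aa4a,0x5b9cca4f,0x682e6ff3,
    0x748f82ee,0x78a5636f,0x84c87814,0x8cc70208,0x90befffa,0xa4506ceb,0xbef9a3f7,0xc67178f2
};

static inline uint32_t rotr(uint32_t x, unsigned n)
{
    return (x >> n) | (x << (32 - n));
}

/* One compression step: fold a 64-byte block into the 8-word state. */
static void sha256_block(uint32_t s[8], const uint8_t p[64])
{
    uint32_t w[64], a[8];
    int i;
    for (i = 0; i < 16; i++)  /* load big-endian words */
        w[i] = (uint32_t)p[4*i] << 24 | (uint32_t)p[4*i+1] << 16 |
               (uint32_t)p[4*i+2] << 8 | (uint32_t)p[4*i+3];
    for (i = 16; i < 64; i++) {
        uint32_t s0 = rotr(w[i-15], 7) ^ rotr(w[i-15], 18) ^ (w[i-15] >> 3);
        uint32_t s1 = rotr(w[i-2], 17) ^ rotr(w[i-2], 19) ^ (w[i-2] >> 10);
        w[i] = w[i-16] + s0 + w[i-7] + s1;
    }
    memcpy(a, s, sizeof a);  /* a[0..7] hold the working variables a..h */
    for (i = 0; i < 64; i++) {
        uint32_t S1 = rotr(a[4], 6) ^ rotr(a[4], 11) ^ rotr(a[4], 25);
        uint32_t ch = (a[4] & a[5]) ^ (~a[4] & a[6]);
        uint32_t t1 = a[7] + S1 + ch + K[i] + w[i];
        uint32_t S0 = rotr(a[0], 2) ^ rotr(a[0], 13) ^ rotr(a[0], 22);
        uint32_t maj = (a[0] & a[1]) ^ (a[0] & a[2]) ^ (a[1] & a[2]);
        memmove(a + 1, a, 7 * sizeof a[0]);  /* h=g, g=f, ..., b=a */
        a[4] += t1;                          /* e = d + t1 */
        a[0] = t1 + S0 + maj;                /* a = t1 + t2 */
    }
    for (i = 0; i < 8; i++)
        s[i] += a[i];
}

/* Plain SHA-256 over an arbitrary-length message. */
void sha256(const uint8_t *msg, size_t len, uint8_t out[32])
{
    uint32_t s[8] = { 0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a,
                      0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19 };
    uint8_t tail[128] = {0};
    size_t full = len / 64, rem = len % 64, i;
    size_t tlen = (rem < 56) ? 64 : 128;  /* room for 0x80 and the length */
    uint64_t bits = (uint64_t)len * 8;

    for (i = 0; i < full; i++)
        sha256_block(s, msg + 64 * i);
    memcpy(tail, msg + 64 * full, rem);
    tail[rem] = 0x80;
    for (i = 0; i < 8; i++)               /* big-endian bit length */
        tail[tlen - 1 - i] = (uint8_t)(bits >> (8 * i));
    sha256_block(s, tail);
    if (tlen == 128)
        sha256_block(s, tail + 64);
    for (i = 0; i < 8; i++) {
        out[4*i]     = (uint8_t)(s[i] >> 24);
        out[4*i + 1] = (uint8_t)(s[i] >> 16);
        out[4*i + 2] = (uint8_t)(s[i] >> 8);
        out[4*i + 3] = (uint8_t)s[i];
    }
}

/* sha256d: the second pass always hashes exactly 32 bytes. */
void sha256d(const uint8_t *msg, size_t len, uint8_t out[32])
{
    uint8_t mid[32];
    sha256(msg, len, mid);
    sha256(mid, sizeof mid, out);
}
```

In mining, the input is an 80-byte block header in which only the nonce changes between attempts, so an optimized implementation can also reuse the state after the first 64-byte block. A hardware round instruction processes whole groups of rounds at once, which is one plausible reason such shortcuts are harder to combine with it.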

@risner

risner commented Nov 2, 2021

So there is nothing to gain from that optimization?
