Fix Makefile for Android NDK cross-compile #3901

pschneider1968
Contributor

For cross-compiling to Android, the Makefile needs some tweaks.

Tested with Android NDK 23.1.7779620 and 21.4.7075529, using
Windows 10 with clean MSYS2 environment (i.e. no MINGW/GCC/Clang
toolchain in PATH) and Fedora 35, with build target:
build ARCH=armv8 COMP=ndk

The resulting binary runs fine inside Droidfish on my Samsung
Galaxy Note20 Ultra and Samsung Galaxy Tab S7+

Other builds tested to exclude regressions: MINGW64/Clang64 build
on Windows; MINGW64 cross build, native Clang and GCC builds on Fedora.

No functional change

@pschneider1968
Contributor Author

pschneider1968 commented Jan 23, 2022

Improvements are:

  • when cross-compiling on Windows, we should neither misname the executable nor set target_windows=yes
  • make strip ARCH=armv8 COMP=ndk should pick the correct strip binary, regardless of NDK version
  • when doing several subsequent builds in the same directory, make objclean should delete all binaries and object files from any previous builds
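
To illustrate the second point, here is a minimal shell sketch of the lookup idea. It is not the Makefile change itself: the function name pick_strip is hypothetical, and the candidate tool names rest on my assumption that older NDKs provide aarch64-linux-android-strip while newer ones provide llvm-strip.

```shell
#!/bin/sh
# Hypothetical helper: print the first strip tool that exists on PATH.
# The candidate names are assumptions; adjust them to your NDK layout.
pick_strip() {
    for tool in aarch64-linux-android-strip llvm-strip; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            return 0
        fi
    done
    echo "no strip tool found on PATH" >&2
    return 1
}
```

With the NDK bin/ directory on PATH, something like STRIP=$(pick_strip) would then select whichever tool the installed NDK provides.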

@pschneider1968
Contributor Author

Hooray, this was my first self compiled Android app, BTW 😉😎

@ppigazzini
Contributor

The wiki is lacking a "Building SF with NDK" page :)
https://github.com/glinscott/fishtest/wiki

It seems easy to cross-build from any system to any system with clang (and with toolsets built on clang, such as the NDK and zig).
So, to account properly for cross builds, perhaps in the future we should introduce something like TARGET_OS and TARGET_ARCH. When they are not provided, the Makefile should set the target OS/ARCH to the host OS/ARCH.
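
A rough shell sketch of that defaulting rule, just to make the idea concrete. Nothing like this exists in the Makefile today; the function name resolve_target is hypothetical, and using uname for the host values is my assumption.

```shell
#!/bin/sh
# Hypothetical sketch: TARGET_OS / TARGET_ARCH fall back to the host values
# reported by uname when the caller does not set them.
resolve_target() {
    os="${TARGET_OS:-$(uname -s)}"
    arch="${TARGET_ARCH:-$(uname -m)}"
    printf '%s %s\n' "$os" "$arch"
}
```

Calling resolve_target with both variables unset prints the host pair; setting either variable overrides only that half.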

If I'm not wrong it seems possible to build from Windows to Windows using NDK (supposing that this makes sense), but this will require some Makefile editing after this PR:
https://developer.android.com/ndk/guides/abis

@pschneider1968
Contributor Author

pschneider1968 commented Jan 23, 2022

The wiki is lacking a "Building SF with NDK" page :) https://github.com/glinscott/fishtest/wiki

I'll try to write something up in the evening or tomorrow (but it was really trivial: use the MSYS2 env, just put the NDK compiler bin/ directory first in your path, and there you go!)
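
As a rough sketch of the PATH step (the helper names are hypothetical, and the toolchains/llvm/prebuilt/<host>/bin layout is my assumption about how the NDK is unpacked, so check it against your installation):

```shell
#!/bin/sh
# Hypothetical helpers. The directory layout below is an assumption about
# how the NDK is unpacked; verify it on your machine.
ndk_bin_dir() {
    # $1 = NDK root, $2 = host tag such as linux-x86_64 or windows-x86_64
    printf '%s/toolchains/llvm/prebuilt/%s/bin\n' "$1" "$2"
}

prepend_path() {
    # Print a PATH value with $1 placed in front of the current PATH.
    printf '%s:%s\n' "$1" "$PATH"
}
```

After PATH=$(prepend_path "$(ndk_bin_dir /path/to/ndk linux-x86_64)") and export PATH, the build is the usual make build ARCH=armv8 COMP=ndk.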

@pschneider1968
Contributor Author

I have added a Wiki page on cross-compiling for Android. Please review and let me know whether I should improve something, or whether it's good enough and easy to follow. Maybe too detailed!? I'm looking forward to any kind of feedback.

https://github.com/glinscott/fishtest/wiki/Cross-compiling-Stockfish-for-Android-on-Windows-and-Linux

@ppigazzini
Contributor

Great job! Fixed some typos.

@pschneider1968
Contributor Author

Thanks! I'm glad it's OK and usable!

@ppigazzini
Contributor

ppigazzini commented Jan 24, 2022

@pschneider1968 try zig, only 60 MB download, supports all targets :D
https://ziglang.org/download/
https://zig.news/kristoff/cross-compile-a-c-c-project-with-zig-3599

I cross compiled from MSYS and Ubuntu 20.04.
Tested the armv8 static build with qemu-aarch64 ./stockfish_armv8-musl bench
(to install qemu: sudo apt update && sudo apt install -y qemu-user):

export PATH=/c/zig/:$PATH
git clone https://github.com/official-stockfish/Stockfish.git
cd Stockfish/src

# dynamic build with gnu glibc
# (make clean between targets so no stale object files get linked)
make build ARCH=armv8 COMP=gcc CXX="zig c++ -target aarch64-linux-gnu"
mv stockfish stockfish_armv8
make clean
make build ARCH=x86-64-modern COMP=gcc CXX="zig c++ -target x86_64-linux-gnu"
mv stockfish stockfish_x86-64-modern
make clean

# static build with musl libc
make build ARCH=armv8 COMP=gcc CXX="zig c++ -target aarch64-linux-musl"
mv stockfish stockfish_armv8-musl
make clean
make build ARCH=x86-64-modern COMP=gcc CXX="zig c++ -target x86_64-linux-musl"
mv stockfish stockfish_x86-64-modern-musl
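
The same sequence can be generated in a loop. This is only a dry-run sketch: the helper names and the output file naming are hypothetical, and the ARCH to target mapping covers just the two architectures used above.

```shell
#!/bin/sh
# Hypothetical helper: map a Stockfish ARCH and a libc name to the zig
# target triple used in the commands above.
zig_target() {
    case "$1" in
        armv8)         cpu=aarch64 ;;
        x86-64-modern) cpu=x86_64 ;;
        *) echo "unsupported ARCH: $1" >&2; return 1 ;;
    esac
    printf '%s-linux-%s\n' "$cpu" "$2"
}

# Dry run: print one build/rename/clean sequence per ARCH and libc pair.
# Nothing is executed; pipe the output to sh to actually run it.
print_builds() {
    for arch in armv8 x86-64-modern; do
        for libc in gnu musl; do
            target=$(zig_target "$arch" "$libc") || return 1
            echo "make build ARCH=$arch COMP=gcc CXX=\"zig c++ -target $target\""
            echo "mv stockfish stockfish_${arch}-${libc}"
            echo "make clean"
        done
    done
}
```

Printing the commands first makes it easy to check the target triples before spending time on four full builds.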

@ppigazzini
Contributor

zig does not seem to be able to make a PGO build.
With the NDK on Ubuntu, perhaps you can make a PGO aarch64 build by running the binary through qemu-aarch64, the same way wine is used for a mingw-64 build.
To install qemu: sudo apt update && sudo apt install -y qemu-user

@ppigazzini
Contributor

ppigazzini commented Jan 24, 2022

Cross compile for aarch64 with NEON on a clean Ubuntu 20.04:

# one time configuration
sudo apt update && sudo apt install -y make git
sudo snap install zig --classic --edge
sudo apt install -y qemu-user

# static build with musl libc
git clone https://github.com/official-stockfish/Stockfish.git
cd Stockfish/src
make -j build ARCH=armv8 CXX="zig c++ -target aarch64-linux-musl"
# test with qemu-aarch64 (loaded automatically)
./stockfish bench

Build with COMP=ndk works after dropping -latomic in the Makefile:

make -j build ARCH=armv8 COMP=ndk CXX="zig c++ -target aarch64-linux-musl"
diff --git a/src/Makefile b/src/Makefile
index 5f3e739f..cc3c5650 100644
--- a/src/Makefile
+++ b/src/Makefile
@@ -457,7 +457,7 @@ ifeq ($(COMP),ndk)
                CXX=aarch64-linux-android21-clang++
                STRIP=aarch64-linux-android-strip
        endif
-       LDFLAGS += -static-libstdc++ -pie -lm -latomic
+       LDFLAGS += -static-libstdc++ -pie -lm
 endif
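
If you would rather not edit the Makefile, the same flag filtering can be sketched in shell. The helper name drop_flag is hypothetical, and this only shows the string handling, not a supported way to override LDFLAGS.

```shell
#!/bin/sh
# Hypothetical helper: print a flag list with every exact occurrence of
# one flag removed, keeping the order of the remaining flags.
drop_flag() {
    # $1 = space-separated flags, $2 = flag to remove
    out=""
    for flag in $1; do
        [ "$flag" = "$2" ] && continue
        out="${out:+$out }$flag"
    done
    printf '%s\n' "$out"
}
```

For example, drop_flag "-static-libstdc++ -pie -lm -latomic" -latomic leaves the three flags kept by the diff above.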

@pschneider1968
Contributor Author

I've never heard of zig before. That looks really interesting! I will have a look and play around with it.

@vondele vondele added the "to be merged" label (will be merged shortly) Jan 25, 2022
@vondele
Member

vondele commented Jan 25, 2022

zig + qemu etc looks nice, will give an easy way to test e.g. the ARM code on linux. I wonder if we can somehow fix the pthreads issue cleanly.

@vondele vondele closed this in bddd38c Jan 25, 2022
@pschneider1968 pschneider1968 deleted the android_nkd_build branch January 25, 2022 08:26
@ppigazzini
Contributor

I read about zig in this great post about cross-compiling with clang:
https://mcilloni.ovh/2021/02/09/cxx-cross-clang/

I learned about qemu-user in this post by zig's author:
https://andrewkelley.me/post/zig-cc-powerful-drop-in-replacement-gcc-clang.html

Here is an interesting zig feature about macOS:
https://zig.news/kristoff/cross-compile-a-c-c-project-with-zig-3599

When it comes to macOS, Zig even has a custom built linker able to cross-compile for both Intel and Apple Silicon M1, something that not even lld (LLVM's linker) can do. Thanks to that, at the moment of writing, Zig is the only C/C++ compiler able to cross-compile and cross-sign (i.e. perform codesigning from another platform) for Apple Silicon.

zig builds the needed standard libraries on the fly with the required flags. From the third compile of SF for a given ARCH onward, the build time is remarkably short thanks to .cache/zig

@ppigazzini
Contributor

ppigazzini commented Jan 25, 2022

The binaries from zig LTO builds (gnu and musl) are very fast on my PC (ARCH=x86-64-modern), on par with the gcc LTO+PGO ones:

  • gcc LTO vs zig LTO:
bash bench_parallel.sh ./stockfish_gcc_lto ./stockfish_zig_musl_lto 20 10
run   sf_base   sf_test      diff
  1   1368592   1480552   +111960
  2   1385075   1496686   +111611
  3   1375543   1481434   +105891
  4   1338214   1440200   +101986
  5   1360938   1477815   +116877
  6   1359010   1468638   +109628
  7   1380860   1485201   +104341
  8   1355007   1465369   +110362
  9   1355609   1466329   +110720
 10   1347558   1460780   +113222

sf_base =  1362640 +/- 9204
sf_test =  1472300 +/- 9697
diff    =   109659 +/- 2747
speedup = 0.080476
  • gcc LTO+PGO vs zig LTO:
bash bench_parallel.sh ./stockfish_gcc_lto_pgo ./stockfish_zig_musl_lto 20 10
run   sf_base   sf_test      diff
  1   1474279   1476353     +2074
  2   1483561   1486417     +2856
  3   1482545   1488394     +5849
  4   1464570   1466617     +2047
  5   1479280   1481368     +2088
  6   1452094   1453226     +1132
  7   1427921   1430994     +3073
  8   1480682   1486878     +6196
  9   1485431   1492794     +7363
 10   1498355   1496252     -2103

sf_base =  1472871 +/- 12465
sf_test =  1475929 +/- 12624
diff    =     3057 +/- 1722
speedup = 0.002076
  • zig gnu glibc vs musl libc:
bash bench_parallel.sh ./stockfish_zig_gnu_lto ./stockfish_zig_musl_lto 20 10
run   sf_base   sf_test      diff
  1   1463039   1454170     -8869
  2   1493092   1489516     -3576
  3   1471115   1470567      -548
  4   1472825   1466681     -6144
  5   1466745   1463294     -3451
  6   1481009   1477945     -3064
  7   1481042   1476224     -4818
  8   1474215   1464347     -9868
  9   1477133   1470084     -7049
 10   1497754   1484216    -13538

sf_base =  1477796 +/- 6775
sf_test =  1471704 +/- 6515
diff    =    -6092 +/- 2384
speedup = -0.004123

EDIT_000: drunk monkey test, our Makefile already detects that zig c++ is clang...

  • zig builds COMP=gcc vs COMP=clang (requires dropping -latomic):
bash bench_parallel.sh ./stockfish_zig_musl_lto ./stockfish_zig_musl_lto_clang 20 10
run   sf_base   sf_test      diff
  1   1470921   1468413     -2508
  2   1461829   1461543      -286
  3   1483036   1484774     +1738
  4   1473115   1468830     -4285
  5   1450025   1451530     +1505
  6   1467675   1464060     -3615
  7   1476516   1477263      +747
  8   1435582   1437303     +1721
  9   1460018   1463007     +2989
 10   1444133   1445127      +994

sf_base =  1462285 +/- 9359
sf_test =  1462185 +/- 8883
diff    =     -100 +/- 1551
speedup = -0.000068
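
For reference, the speedup lines above are consistent with mean(diff) / mean(sf_base). Here is a small sketch that recomputes it from the per-run columns. This is not bench_parallel.sh itself: the function name speedup is hypothetical, and the formula is my reading of the numbers rather than something taken from the script.

```shell
#!/bin/sh
# Hypothetical helper: read "base test" nps pairs from stdin (one run per
# line) and print mean(test - base) / mean(base) with six decimals.
speedup() {
    awk '
        NF >= 2 { base += $1; diff += $2 - $1; n++ }
        END {
            if (n == 0 || base == 0) exit 1
            printf "%.6f\n", (diff / n) / (base / n)
        }
    '
}
```

Feed it the sf_base and sf_test columns of any table above, one run per line, to cross-check the reported value.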

@pschneider1968
Contributor Author

That's really impressive!

@ppigazzini
Contributor

On Discord, Torom tested zig on a Core i7-6700K: about a 1.6% speedup vs g++ PGO, and on par with clang++ PGO
https://discord.com/channels/435943710472011776/813919248455827515/935540848370786304

I have never played with Android: does the aarch64-linux-musl build run on Android?
I tried zig c++ -target aarch64-linux-android, but it produces a flood of errors.

@pschneider1968
Contributor Author

I couldn't try it yet, but I hope to do so next weekend at the latest.

@ppigazzini
Contributor

ppigazzini commented Jan 25, 2022

This is my workstation: Dual Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz, 2501 MHz, 12 Core(s), 24 Logical Processor(s)

make -j ARCH=x86-64-bmi2 CXX="zig c++ -target x86_64-linux-musl"
  • gcc-11 PGO vs zig
bash ./bench_parallel.sh ./stockfish_gcc_pgo ./stockfish_zig 20 10
run   sf_base   sf_test      diff
  1   1282121   1290001     +7880
  2   1264778   1291421    +26643
  3   1279411   1286797     +7386
  4   1258110   1278103    +19993
  5   1287863   1295173     +7310
  6   1249906   1285790    +35884
  7   1280522   1290088     +9566
  8   1291334   1297833     +6499
  9   1274985   1296867    +21882
 10   1274872   1298859    +23987

sf_base =  1274390 +/- 8130
sf_test =  1291093 +/- 3992
diff    =    16703 +/- 6419
speedup = 0.013107
  • clang-12 PGO vs zig
bash ./bench_parallel.sh ./stockfish_clang_pgo ./stockfish_zig 20 10
run   sf_base   sf_test      diff
  1   1290725   1286912     -3813
  2   1281692   1289509     +7817
  3   1278813   1289076    +10263
  4   1278131   1291189    +13058
  5   1273179   1285273    +12094
  6   1261255   1272756    +11501
  7   1288267   1296721     +8454
  8   1261587   1275579    +13992
  9   1293250   1299651     +6401
 10   1291798   1297598     +5800

sf_base =  1279869 +/- 7283
sf_test =  1288426 +/- 5508
diff    =     8556 +/- 3198
speedup = 0.006686

@pschneider1968
Contributor Author

pschneider1968 commented Jan 25, 2022

I never played with Android, does the aarch64-linux-musl build run on Android? I tried zig c++ -target aarch64-linux-android but there is a flood of errors.

So I was too curious, I had to try this out NOW ;-) Although I'd better go to sleep...

I built with:
make build ARCH=armv8 COMP=ndk CXX="zig c++ -target aarch64-linux-musl"
on my Windows machine, and YES! the binary runs fine on my Galaxy Tab S7+.

It seems to be as fast as the binary built with the NDK, but this was just a short test: running in Droidfish analysis mode, I played 3 plies into a King's Gambit and let SF think about a move for Black. With 4 threads, it starts at 1600k nps and is at about 1200k nps around depth 30.
I didn't root my device and have no shell/command line on my tablet, so I don't know how I could run SF bench to get more objective numbers.

Unfortunately, zig could not build a Windows executable, because the pthread.h header and the libpthread.a library are missing?! It seems to still be a work in progress...

EDIT: Ah, I just saw that @vondele already mentioned that issue.

Will try out the Linux version next.

@ppigazzini
Contributor

I suspect that the speed of zig on x86_64 comes from building glibc/libc on the fly with the optimal CPU flags. The speedup seems to grow from older to newer CPUs. With aarch64, both zig and the NDK use the optimal NEON CPU flags, so they should be on par.
The NDK's 700 MB download size was a psychological show stopper for me, but it's still useful because it can build all 3 ARM archs in the SF Makefile and because it's already installed in the Ubuntu 20.04 virtual environment of GitHub Actions.

@ppigazzini
Contributor

ppigazzini commented Feb 2, 2022

zig + qemu etc looks nice, will give an easy way to test e.g. the ARM code on linux. I wonder if we can somehow fix the pthreads issue cleanly.

Adding some info about the libc shipped with Zig:

Filed a bug report on Zig:
ziglang/zig#10989

The Zig instructions to update mingw-64 seem to be naive and broken: they configure/install mingw-64-headers, but stop with a config error for mingw-64-crt and skip the subsequent mingw-64-winpthreads (which ships pthread.h).

This bash script configures and builds mingw-64 properly (mingw-w64-build x86_64 --keep-artifacts); it takes about 15 minutes (with the usual gcov errors for PGO, though...):
https://github.com/Zeranoe/mingw-w64-build

Update Zig mingw-w64 with:

zig_path=${HOME}/zig-linux-x86_64-0.9.0/lib/libc
mingw_path=${HOME}/.zeranoe/mingw-w64

mv ${zig_path}/mingw ${zig_path}/mingw-orig
cp -r ${mingw_path}/src/mingw-w64/mingw-w64-crt ${zig_path}/mingw

mv ${zig_path}/include/any-windows-any ${zig_path}/include/any-windows-any-orig
cp -r ${mingw_path}/x86_64/x86_64-w64-mingw32/include ${zig_path}/include/any-windows-any

cp ${mingw_path}/bld/x86_64/mingw-w64-crt/config.h  ${zig_path}/include/any-windows-any/config.h

When building Stockfish with make -j build ARCH=x86-64-modern COMP=mingw CXX="zig c++ -target x86_64-windows-gnu", the Zig mingw compilation stops on conflicts with some llvm files (updating the llvm files doesn't help; same result with Zig master):

Compile C Objects [12/61] memmove_s.c... /home/usr00/zig-linux-x86_64-0.9.0/lib/libc/mingw/secapi/vsprintf_s.c:39:10: warning: implicit declaration of function '__ms_vsnprintf' is invalid in C99 [-Wimplicit-function-declaration]
  return __ms_vsnprintf (_DstBuf, _Size, _Format, _ArgList);
         ^
1 warning generated.
Compile C Objects [94/408] vsnprintf.c... /home/usr00/zig-linux-x86_64-0.9.0/lib/libc/mingw/stdio/vwscanf.c:15:10: warning: implicit declaration of function '__ms_vfwscanf' is invalid in C99 [-Wimplicit-function-declaration]
  return __ms_vfwscanf(stdin, format, arg);
         ^
1 warning generated.
Compile C Objects [13/42] shared_mutex.cpp... /home/usr00/zig-linux-x86_64-0.9.0/lib/libcxx/src/support/win32/thread_win32.cpp:22:1: error: static_assert failed due to requirement 'sizeof(long long) == sizeof(_RTL_CRITICAL_SECTION)' ""
static_assert(sizeof(__libcpp_recursive_mutex_t) == sizeof(CRITICAL_SECTION),
^             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/usr00/zig-linux-x86_64-0.9.0/lib/libcxx/src/support/win32/thread_win32.cpp:30:1: error: static_assert failed due to requirement 'sizeof(long) == sizeof(_RTL_RUN_ONCE)' ""
static_assert(sizeof(__libcpp_exec_once_flag) == sizeof(INIT_ONCE), "");
^             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/usr00/zig-linux-x86_64-0.9.0/lib/libcxx/src/support/win32/thread_win32.cpp:31:1: error: static_assert failed due to requirement 'alignof(long) == alignof(_RTL_RUN_ONCE)' ""
static_assert(alignof(__libcpp_exec_once_flag) == alignof(INIT_ONCE), "");

...

/home/usr00/zig-linux-x86_64-0.9.0/lib/libcxx/include/__threading_support:450:5: note: previous definition is here
int __libcpp_condvar_destroy(__libcpp_condvar_t *__cv)
    ^
fatal error: too many errors emitted, stopping now [-ferror-limit=]

dav1312 pushed a commit to dav1312/Stockfish that referenced this pull request Oct 21, 2022
For cross-compiling to Android on Windows, the Makefile needs some tweaks.

Tested with Android NDK 23.1.7779620 and 21.4.7075529, using
Windows 10 with clean MSYS2 environment (i.e. no MINGW/GCC/Clang
toolchain in PATH) and Fedora 35, with build target:
build ARCH=armv8 COMP=ndk

The resulting binary runs fine inside Droidfish on my Samsung
Galaxy Note20 Ultra and Samsung Galaxy Tab S7+

Other builds tested to exclude regressions: MINGW64/Clang64 build
on Windows; MINGW64 cross build, native Clang and GCC builds on Fedora.

wiki docs https://github.com/glinscott/fishtest/wiki/Cross-compiling-Stockfish-for-Android-on-Windows-and-Linux

closes official-stockfish#3901

No functional change
Joachim26 pushed a commit to Joachim26/StockfishNPS that referenced this pull request Nov 22, 2023