Error while using AMD AOCC/AOCL to build ABACUS #5982
Comments
A GCC-AOCL build of ABACUS can be done via the toolchain, but I still need to add the LAPACK link step to the toolchain to make the installation automatic.
The GCC-AOCL version of ABACUS compiles, provided OpenMPI is built with AOCC-AOCL and ELPA is still built with GCC-OpenBLAS. The SCF efficiency of this GCC-AOCL version is slightly lower than that of the GCC-OpenBLAS version of ABACUS. The build script looks like:
but if change the It seems that while using |
This is largely due to differences between clang and gcc in language extensions that are not part of ISO C++. Generally, it requires only minor modifications to the code.
It would be better to submit an issue in the ELPA repo if they do not mention support for AMD compilers.
With LCAO or PW basis?
LCAO basis, Fe80C36 surface system, run with MPI16-OMP1 on an AMD EPYC 7b12.
Or just try ignoring those arguments if you believe it is OK to do so. Please see clangd/clangd#662.
I would suggest checking the underlying math libraries used by ELPA to see whether functions from AOCL are really in use. Also, it is common practice to compile against a standard BLAS implementation and switch to a vendor library at the linking stage or at runtime.
This is done and will be uploaded in the existing toolchain PR.
@caic99 Thanks for your suggestion. After fully supporting AOCL in the toolchain, I compiled another ABACUS build and found that this GCC-AOCL version of ABACUS has better efficiency in LCAO-genelpa SCF calculations. Here is a ranking list after check-out, showing the time consumed in each SCF step (excluding the first and the last SCF step of each ion step). Task: MPI16-OMP1 LCAO-genelpa for Fe5C2(510) [Fe80C36], ABACUS commit 1fa5e3a. Hardware: AMD EPYC 7b12.
And here are the libraries linked by the GCC-AOCL build of ABACUS (ldd output):
linux-vdso.so.1 (0x00007ffd813fe000)
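For anyone who wants to reproduce this kind of comparison, the averaging rule described above (drop the first and the last SCF step of every ion step, then average the rest) can be sketched as follows. This is only an illustration: the function name `mean_inner_scf_time` is made up here, and the timings are hypothetical placeholder values, not measurements from this thread.

```python
from statistics import mean


def mean_inner_scf_time(ion_steps):
    """Average SCF-step time, dropping the first and last SCF step of each ion step.

    `ion_steps` is a list of lists: one inner list of per-SCF-step wall times
    (in seconds) for every ion step.
    """
    inner = [t for steps in ion_steps for t in steps[1:-1]]
    if not inner:
        raise ValueError("no SCF steps left after trimming first and last")
    return mean(inner)


# Hypothetical timings in seconds, NOT measurements from this thread.
timings = [
    [30.0, 10.0, 12.0, 11.0, 25.0],  # ion step 1
    [28.0, 9.0, 11.0, 24.0],         # ion step 2
]

avg = mean_inner_scf_time(timings)
print(f"{avg:.2f} s per SCF step")
```

Trimming the first and last step of each ion step keeps one-off setup and finalization costs out of the per-step figure, so builds are compared on the steady-state SCF iterations only.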
/data/libraries/fakeintel/libfakeintel.so (0x000079512e365000)
libelpa_openmp.so.19 => /data/libraries/elpa/2025.01.001-amd/cpu/lib/libelpa_openmp.so.19 (0x000079512d400000)
libfftw3.so.3 => /data/softwares/AMD/5.0.0-aocl-aocc/aocl/lib/libfftw3.so.3 (0x000079512cc00000)
libscalapack.so => /data/softwares/AMD/5.0.0-aocl-aocc/aocl/lib/libscalapack.so (0x000079512c400000)
libfftw3_omp.so.3 => /data/softwares/AMD/5.0.0-aocl-aocc/aocl/lib/libfftw3_omp.so.3 (0x000079512e358000)
libxc.so.15 => /data/libraries/libxc/7.0.0-gcc/lib/libxc.so.15 (0x000079512b600000)
libmpi.so.40 => /data/softwares/openmpi/5.0.6-amd5/lib/libmpi.so.40 (0x000079512b200000)
libomp.so => /data/softwares/AMD/5.0.0-aocl-aocc/aocc/lib/libomp.so (0x000079512ac00000)
libblis.so.5 => /data/softwares/AMD/5.0.0-aocl-aocc/aocl/lib/libblis.so.5 (0x000079512a000000)
libflame.so => /data/softwares/AMD/5.0.0-aocl-aocc/aocl/lib/libflame.so (0x0000795129000000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x0000795128c00000)
libm.so.6 => /usr/lib/x86_64-linux-gnu/libm.so.6 (0x000079512e26f000)
libgcc_s.so.1 => /usr/lib/x86_64-linux-gnu/libgcc_s.so.1 (0x000079512e24f000)
libc.so.6 => /usr/lib/x86_64-linux-gnu/libc.so.6 (0x0000795128800000)
libmpi_mpifh.so.40 => /data/softwares/openmpi/5.0.6-amd5/lib/libmpi_mpifh.so.40 (0x000079512d78a000)
libgfortran.so.5 => /usr/lib/x86_64-linux-gnu/libgfortran.so.5 (0x0000795128400000)
libaoclutils.so => /data/softwares/AMD/5.0.0-aocl-aocc/aocl/lib/libaoclutils.so (0x000079512d762000)
libblis-mt.so.5 => /data/softwares/AMD/5.0.0-aocl-aocc/aocl/lib/libblis-mt.so.5 (0x0000795127800000)
libmpi_usempif08.so.40 => /data/softwares/openmpi/5.0.6-amd5/lib/libmpi_usempif08.so.40 (0x000079512d3c7000)
libmpi_usempi_ignore_tkr.so.40 => /data/softwares/openmpi/5.0.6-amd5/lib/libmpi_usempi_ignore_tkr.so.40 (0x000079512d755000)
libflang.so => /data/softwares/AMD/5.0.0-aocl-aocc/aocc/lib/libflang.so (0x0000795127200000)
libflangrti.so => /data/softwares/AMD/5.0.0-aocl-aocc/aocc/lib/libflangrti.so (0x000079512d748000)
libpgmath.so => /data/softwares/AMD/5.0.0-aocl-aocc/aocc/lib/libpgmath.so (0x0000795126e00000)
libquadmath.so.0 => /usr/lib/x86_64-linux-gnu/libquadmath.so.0 (0x000079512cbb8000)
librt.so.1 => /usr/lib/x86_64-linux-gnu/librt.so.1 (0x000079512e246000)
libpthread.so.0 => /usr/lib/x86_64-linux-gnu/libpthread.so.0 (0x000079512e241000)
libopen-pal.so.80 => /data/softwares/openmpi/5.0.6-amd5/lib/libopen-pal.so.80 (0x000079512caf7000)
libpmix.so.2 => /data/softwares/openmpi/5.0.6-amd5/lib/libpmix.so.2 (0x0000795126a00000)
libmunge.so.2 => /usr/lib/x86_64-linux-gnu/libmunge.so.2 (0x000079512d73d000)
libevent_core-2.1.so.7 => /data/softwares/openmpi/5.0.6-amd5/lib/libevent_core-2.1.so.7 (0x000079512d392000)
libevent_pthreads-2.1.so.7 => /data/softwares/openmpi/5.0.6-amd5/lib/libevent_pthreads-2.1.so.7 (0x000079512d738000)
libhwloc.so.15 => /usr/lib/x86_64-linux-gnu/libhwloc.so.15 (0x000079512b5a4000)
/lib64/ld-linux-x86-64.so.2 (0x000079512e36c000)
libdl.so.2 => /usr/lib/x86_64-linux-gnu/libdl.so.2 (0x000079512caf2000)
libudev.so.1 => /usr/lib/x86_64-linux-gnu/libudev.so.1 (0x000079512cac8000)
Looking forward to this issue being solved and to running the AOCC-AOCL version of ABACUS on AMD hardware.
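The ldd listing above can also be checked programmatically, which is one way to follow the earlier suggestion of verifying which math libraries a binary (or `libelpa`) actually resolves to. Below is a minimal sketch; the `PROVIDERS` table and the `math_libs` helper are illustrative names introduced here, and the prefix list is an assumption that is far from exhaustive. The sample lines are copied from the listing above.

```python
import re

# Soname prefixes used to guess the math-library provider.
# Illustrative assumption only; extend it for your own environment.
PROVIDERS = {
    "AOCL": ("libblis", "libflame", "libaoclutils"),
    "OpenBLAS": ("libopenblas",),
    "MKL": ("libmkl",),
}

# Matches lines of the form "name => /path/to/name (0xaddress)".
LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(\S+)\s+\(0x[0-9a-f]+\)")


def math_libs(ldd_text):
    """Map provider name -> sorted list of sonames found in `ldd` output."""
    found = {}
    for line in ldd_text.splitlines():
        m = LDD_LINE.match(line)
        if not m:
            continue  # skips vdso, the loader, and entries without "=>"
        soname = m.group(1)
        for provider, prefixes in PROVIDERS.items():
            if soname.startswith(prefixes):
                found.setdefault(provider, []).append(soname)
    return {k: sorted(v) for k, v in found.items()}


# A few lines taken from the ldd listing in this thread.
SAMPLE = """\
libelpa_openmp.so.19 => /data/libraries/elpa/2025.01.001-amd/cpu/lib/libelpa_openmp.so.19 (0x000079512d400000)
libblis.so.5 => /data/softwares/AMD/5.0.0-aocl-aocc/aocl/lib/libblis.so.5 (0x000079512a000000)
libflame.so => /data/softwares/AMD/5.0.0-aocl-aocc/aocl/lib/libflame.so (0x0000795129000000)
libblis-mt.so.5 => /data/softwares/AMD/5.0.0-aocl-aocc/aocl/lib/libblis-mt.so.5 (0x0000795127800000)
"""

result = math_libs(SAMPLE)
print(result)
```

To run it on a real file, the text could come from `subprocess.run(["ldd", path], capture_output=True, text=True).stdout`. Note that the listing above resolves both `libblis.so.5` and `libblis-mt.so.5`; having two BLIS variants loaded in one process may be worth a second look.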
I've done this by adding a command in the toolchain that runs in the ELPA build directory:
Then the error is suppressed lol
After this operation ABACUS GCC-AOCL can be compiled, but it has much lower efficiency, at 52 s per SCF step. It seems there are some problems with flang-compiled ELPA. ldd information:
While using AMD AOCC/AOCL (version 5.0.0.1, accessed today) to build ABACUS, I encountered an error in the last linking step:
So:
Other compilation settings:
gcc/g++/gfortran 11.4
clang/clang++:
LAPACK: OpenBLAS (by toolchain) or AOCL
ScaLAPACK, FFTW: from AOCL
ELPA: compiled by toolchain
Hardware: AMD EPYC 7b12
OS: Ubuntu 22.04
Note: flang cannot be used, as it leads to an error when building ELPA.