{numlib,chem,toolchain}[NVHPC/23.7-CUDA-12.1.1] nvompi-2023a + QuantumESPRESSO-7.3.1 (GPU enabled) by Crivella · Pull Request #20364 · easybuilders/easybuild-easyconfigs

Crivella · 2024-04-15T16:32:52Z

Added easyconfig files for nvofbf toolchain + QE 7.3.1

local compilers:

GCC/12.3.0
CUDA/12.1.1

Added toolchain/numlib

nvofbf-2023a
- nvompi-2023a
  - NVHPC-23.7-CUDA-12.1.1
  - OpenMPI-4.1.5
- FlexiBLAS-3.3.1
  - OpenBLAS-0.3.24
- FFTW-3.3.10
- FFTW.MPI-3.3.10
- ScaLAPACK-2.2.0-fb

Added easyconfigs

HDF5-1.14.0-nvompi-2023a-CUDA-12.1.1.eb
libxc-6.2.2-NVHPC-23.7-CUDA-12.1.1.eb
QuantumESPRESSO-7.3.1-nvompi-2023a-CUDA-12.1.1.eb

NOTES:

The QuantumESPRESSO easyconfig also requires changes from this commit. I will incorporate them once easybuild-easyblocks#3306 ("Archive autotools-based QuantumESPRESSO easyblock and switch default to CMakeMake-based easyblock") is merged.
ELPA: compiles with the CUDA compilers, which require a specified compute capability (CC), while QE uses the HPC SDK compilers, which compile for all supported CCs when none is specified.

Solved issues:

High number of failures in the OpenBLAS LAPACK test suite (see OpenMathLib/OpenBLAS#4625, "LAPACK test failures with NVHPC 23.7")
- Use less optimization for v0.3.24
- Use patch for v0.3.27

Open issues:

Occasional segfault in the QE test-suite, caused by FlexiBLAS, when calling the ZHEEV LAPACK routine
- Bug does not manifest when running the code with cuda-gdb
- Tested starting from nvompi linking directly to OpenBLAS and the error was not present
Segfault in 3 test cases with RMM-DIS diagonalization with k points other than GAMMA, most likely a QE bug (https://gitlab.com/QEF/q-e/-/issues/675)
Full CUDA libxc: https://gitlab.com/libxc/libxc/-/issues/135
- Tested patch from commit e648f37b
  - Compile time goes from ~5min to ~3.5h
  - Tests are unable to run
  - I would argue that, for now, it is OK not to run the libxc routines on GPU: this is not officially supported with CMake, it is only experimental with autotools, and it is not a widely used feature of QE

Crivella · 2024-04-18T08:58:21Z

A comparison of code efficiency when linked to the EB numlibs (no suffix) versus the NVHPC math_libs (-test suffix) shows no significant difference when running on one node with an A100 GPU.

[ RUN      ] MINE_QESPRESSO %ecut=250 %nbnd=400 %module_name=QuantumESPRESSO/7.3.1-nvompi-2023a-test %threads=1 /bf4db141 @vega-gpu:default+default
[ RUN      ] MINE_QESPRESSO %ecut=250 %nbnd=400 %module_name=QuantumESPRESSO/7.3.1-nvompi-2023a-CUDA-12.1.1 %threads=1 /e4ce2bb2 @vega-gpu:default+default
[       OK ] (1/2) MINE_QESPRESSO %ecut=250 %nbnd=400 %module_name=QuantumESPRESSO/7.3.1-nvompi-2023a-test %threads=1 /bf4db141 @vega-gpu:default+default
P: extract_report_time: 0 s (r:0, l:None, u:None)
P: PWSCF_cpu: 231.98 s (r:0, l:None, u:None)
P: PWSCF_wall: 241.82 s (r:0, l:None, u:None)
P: electrons_cpu: 213.8 s (r:0, l:None, u:None)
P: electrons_wall: 216.04 s (r:0, l:None, u:None)
P: c_bands_cpu: 181.97 s (r:0, l:None, u:None)
P: c_bands_wall: 183.78 s (r:0, l:None, u:None)
P: cegterg_cpu: 142.72 s (r:0, l:None, u:None)
P: cegterg_wall: 144.01 s (r:0, l:None, u:None)
P: calbec_cpu: 0.12 s (r:0, l:None, u:None)
P: calbec_wall: 0.55 s (r:0, l:None, u:None)
P: fft_cpu: 0.12 s (r:0, l:None, u:None)
P: fft_wall: 0.14 s (r:0, l:None, u:None)
P: ffts_cpu: 0.0 s (r:0, l:None, u:None)
P: ffts_wall: 0.0 s (r:0, l:None, u:None)
P: fftw_cpu: 1.26 s (r:0, l:None, u:None)
P: fftw_wall: 77.36 s (r:0, l:None, u:None)
[       OK ] (2/2) MINE_QESPRESSO %ecut=250 %nbnd=400 %module_name=QuantumESPRESSO/7.3.1-nvompi-2023a-CUDA-12.1.1 %threads=1 /e4ce2bb2 @vega-gpu:default+default
P: extract_report_time: 0 s (r:0, l:None, u:None)
P: PWSCF_cpu: 232.44 s (r:0, l:None, u:None)
P: PWSCF_wall: 241.74 s (r:0, l:None, u:None)
P: electrons_cpu: 214.16 s (r:0, l:None, u:None)
P: electrons_wall: 216.18 s (r:0, l:None, u:None)
P: c_bands_cpu: 182.3 s (r:0, l:None, u:None)
P: c_bands_wall: 183.9 s (r:0, l:None, u:None)
P: cegterg_cpu: 143.11 s (r:0, l:None, u:None)
P: cegterg_wall: 144.18 s (r:0, l:None, u:None)
P: calbec_cpu: 0.12 s (r:0, l:None, u:None)
P: calbec_wall: 0.56 s (r:0, l:None, u:None)
P: fft_cpu: 0.0 s (r:0, l:None, u:None)
P: fft_wall: 0.01 s (r:0, l:None, u:None)
P: ffts_cpu: 0.0 s (r:0, l:None, u:None)
P: ffts_wall: 0.0 s (r:0, l:None, u:None)
P: fftw_cpu: 1.24 s (r:0, l:None, u:None)
P: fftw_wall: 77.23 s (r:0, l:None, u:None)
[----------] all spawned checks have finished
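
As a quick sanity check on the report above, the relative difference between the two builds can be computed from the wall times. This is a minimal sketch: the dictionary and function names are illustrative, and the values are copied from the ReFrame output above.

```python
# Wall times in seconds, copied from the ReFrame report above.
# "-test" module: QE linked to the NVHPC math_libs.
nvhpc_math_libs = {
    "PWSCF_wall": 241.82,
    "electrons_wall": 216.04,
    "c_bands_wall": 183.78,
    "cegterg_wall": 144.01,
    "fftw_wall": 77.36,
}
# "-CUDA-12.1.1" module: QE linked to the EasyBuild numlibs.
eb_numlibs = {
    "PWSCF_wall": 241.74,
    "electrons_wall": 216.18,
    "c_bands_wall": 183.90,
    "cegterg_wall": 144.18,
    "fftw_wall": 77.23,
}


def relative_diff(a, b):
    """Absolute difference of two timings, relative to the larger one."""
    return abs(a - b) / max(a, b)


rel = {key: relative_diff(nvhpc_math_libs[key], eb_numlibs[key])
       for key in nvhpc_math_libs}

for key, value in rel.items():
    print(f"{key}: {value:.3%}")
```

Every timer in this single pair of runs differs by less than 0.2%, which is consistent with the "no significant difference" reading, although one run per build does not say anything about run-to-run noise.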

cgross95 · 2024-05-21T22:06:57Z

Thanks for putting all of this together! Our site is interested in a GPU enabled QuantumESPRESSO build, so we've been testing this.

Were you able to get around the "other error"s that occur in LAPACK testing when building OpenBLAS? Using the OpenBLAS_0.3.24-NVHPC-23.7-CUDA-12.1.1.eb EasyConfig as provided gives us 55 other errors:

                        -->   LAPACK TESTING SUMMARY  <--
SUMMARY                 nb test run     numerical error         other error  
================        ===========     =================       ================  
REAL                    1328283         0       (0.000%)        0       (0.000%)        
DOUBLE PRECISION        1328013         10      (0.001%)        0       (0.000%)        
COMPLEX                 769507          159     (0.021%)        55      (0.007%)        
COMPLEX16               780654          116     (0.015%)        0       (0.000%)        

--> ALL PRECISIONS      4206457         285     (0.007%)        55      (0.001%)

I saw that you had done some work on OpenBLAS issue #4652 to get some of the numerical failures down, but I was wondering if you were ever able to get rid of the other errors that stop EasyBuild from finishing.

Crivella · 2024-05-22T08:33:17Z

@cgross95
In my case, compiling on a machine with an A100 GPU, I ended up with only 148 numerical errors.

			-->   LAPACK TESTING SUMMARY  <--
SUMMARY             	nb test run 	numerical error   	other error  
================   	===========	=================	================  
REAL             	1326099		12	(0.001%)	0	(0.000%)	
DOUBLE PRECISION	1326921		36	(0.003%)	0	(0.000%)	
COMPLEX          	762663		42	(0.006%)	0	(0.000%)	
COMPLEX16         	771518		58	(0.008%)	0	(0.000%)	

--> ALL PRECISIONS	4187201		148	(0.004%)	0	(0.000%)

What hardware are you trying this on?
It would be interesting to find out whether this is strictly OpenBLAS/LAPACK related or whether it is about setting more compiler flags for different architectures.

I think I was still getting some other errors as well with 0.3.27, but I didn't investigate further as I was aiming at 0.3.24 for this release (in that case I was getting 14 errors related to the ZHSEQR and ZGEEV routines failing to find all eigenvalues).

The logs should give you further details on which LAPACK routine failed and with what error code (each function should document the meaning of its error codes as comments in the source/documentation).
If you think those errors are not a problem, you could also raise the threshold of allowed LAPACK test errors by changing the value assigned to max_failing_lapack_tests_num_errors and also adding max_failing_lapack_tests_other_errors to allow the other errors.
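
For illustration, the relevant lines of the OpenBLAS easyconfig might look like the fragment below. This is a sketch only: the parameter names are the ones mentioned above, 150 is the value used in this PR's OpenBLAS easyconfig, and the value for the "other errors" threshold is an assumption picked to cover the 55 errors reported earlier in this thread.

```python
# Fragment of an OpenBLAS easyconfig (easyconfigs use Python syntax).

# Run the LAPACK test suite as part of the build.
run_lapack_tests = True

# Maximum number of numerical errors tolerated in the LAPACK tests
# (150 is the value in this PR's easyconfig).
max_failing_lapack_tests_num_errors = 150

# Maximum number of "other" errors tolerated.
# Assumption: 55 is an illustrative value matching the count reported
# above; only raise this after checking the logs and judging the
# failures harmless for your use case.
max_failing_lapack_tests_other_errors = 55
```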

cgross95 · 2024-05-22T12:15:34Z

I'm compiling on a v100s with an Intel Xeon Skylake on Ubuntu 22.04. We also have some a100 cards, but we're in the midst of transferring everything in our cluster to Ubuntu, so they're not easily accessible at the moment. I'll dig into the LAPACK testing logs and see if I can produce some more useful debugging information.

cgross95 · 2024-06-07T19:22:32Z

I finally got access to our A100 cards, and can report that there were no "other error"s in the LAPACK tests. I ended up with 152 numerical errors, so I increased the max_failing_lapack_tests_num_errors EasyConfig parameter, and was able to successfully install OpenBLAS. I'm continuing on with the rest of the build now.

beeebiii · 2024-09-05T12:46:02Z

Hi, I'm compiling on a V100S with an Intel Xeon Skylake on Ubuntu 22.04. What other changes do you think I should make to be able to use QuantumESPRESSO (GPU enabled)? When I use this PR with eb --from-pr 20364 -r, I get a checksum error in libxc, which I fixed. After that I get an error in OpenBLAS/0.3.24-NVHPC-23.7-CUDA-12.1.1. The error is:

Error limit reached. Use -fmax-errors=N to change the limit, N=0 for unlimited.
100 errors detected in the compilation of "../kernel/x86_64/sgemv_t_4.c".
Compilation terminated.
make[1]: *** [Makefile.L2:268: sgemv_t.o] Error 2
make[1]: Leaving directory '/test/parta/EASYBUILD/build/OpenBLAS/0.3.24/NVHPC-23.7-CUDA-12.1.1/OpenBLAS-0.3.24/kernel'
make: *** [Makefile:184: libs] Error 1
(at easybuild/tools/run.py:682 in parse_cmd_output)

Should I increase -fmax-errors, given that a total of 241 errors are detected during the compilation of ../kernel/x86_64/...? Or how else do you think I can solve this problem?

Crivella · 2024-09-05T13:29:46Z

@beeebiii
Regarding the checksum for libxc it is related to this issue and also this PR.
Since the checksum for the source is not stable, it would be best to use --ignore-checksums (for security, only after you have checked that the libxc checksum is the only one failing and that the downloaded files are what you expect).
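
One way to do that check by hand is to compute the SHA-256 of the downloaded tarball and compare it with the values accepted by the libxc easyconfig in this PR. A minimal sketch, assuming the tarball is available locally; the function names and the example file name are hypothetical, and the two checksums are copied from the easyconfig diff in this PR.

```python
import hashlib

# Checksums accepted by libxc-6.2.2-NVHPC-23.7-CUDA-12.1.1.eb in this PR.
ACCEPTED_SHA256 = {
    "a0f6f1bba7ba5c0c85b2bfe65aca1591025f509a7f11471b4cd651a79491b045",
    "3b0523924579cf494cafc6fea92945257f35692b004217d3dfd3ea7ca780e8dc",
}


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_accepted(path, accepted=ACCEPTED_SHA256):
    """True if the file's digest is one of the accepted checksums."""
    return sha256_of(path) in accepted
```

For example, `sha256_of("libxc-6.2.2.tar.gz")` (hypothetical local path) gives the digest to compare against the easyconfig or against the values discussed in the linked issue. A mismatch is expected when the archive has been regenerated upstream, so the digest alone does not prove the content is wrong.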

The other error you are reporting seems related to OpenBLAS. In my tests on an A100 with an AMD zen2 CPU I did not encounter failures in the compilation (only some failures in the test suite).
It seems @cgross95 was on a system similar to yours but also did not encounter it.

One odd thing: I am not sure -fmax-errors is supported by NVCC; I believe it is a GCC-only flag. The full debug log would probably be needed to understand what is happening here (e.g. is gcc being used instead of nvcc?).

beeebiii · 2024-09-05T13:36:59Z

Yeah, you are right, I think gcc is being used instead of nvcc.

Remark: individual warnings can be suppressed with "--diag_suppress <warning-name>"
"../kernel/x86_64/sgemv_t_4.c", line 31: warning: unrecognized GCC pragma [unrecognized_gcc_pragma]
#pragma GCC optimize("no-tree-vectorize")

Very strange. I did compile libxc/6.2.2-NVHPC-23.7-CUDA-12.1.1 myself, as I had to change the checksums. Other than that I run eb --from-pr 20364 -r --accept-eula-for=CUDA --cuda-compute-capabilities=9.0 --accept-eula-for=NVHPC

Crivella · 2024-09-05T13:44:13Z

If you look at the OpenBLAS easyconfig, only NVHPC should be used, and EasyBuild should not be aware of other compiler toolchains in that instance.
I would suggest running the build with -d and checking the logs to see if something odd is happening with environment variables or with some other module you already have loaded in your environment.

Crivella · 2024-09-05T14:04:36Z

That is basically easybuild reporting the easyconfig file being used, the debug logging adds much more.
Look for "DEBUG Not skipping configure step" in the logs and start scrolling:
- up to see what env variables were set by easybuild and what modules were found loaded
- down to see the output of the configure command

I would venture to guess the problem is in either of them.
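
If the debug log is too long to scroll comfortably, the same search can be scripted. This is a minimal sketch, not an EasyBuild tool: the function names are hypothetical, and the only thing taken from the thread is the marker string.

```python
# Marker text mentioned above; it appears in EasyBuild debug logs.
MARKER = "Not skipping configure step"


def find_marker(lines, marker=MARKER):
    """Return the index of the first line containing the marker, or None."""
    for index, line in enumerate(lines):
        if marker in line:
            return index
    return None


def context(lines, index, before=40, after=40):
    """Return (lines before the marker, lines after the marker)."""
    start = max(0, index - before)
    return lines[start:index], lines[index + 1:index + 1 + after]
```

With the log read into a list (for example via `open(path).read().splitlines()`, where `path` is wherever your EasyBuild log was written), the lines before the marker show the environment and loaded modules, and the lines after it show the configure output.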
Also, is OpenBLAS the first thing that gets installed with -r (e.g. did something else that installs with NVHPC only complete successfully)?
BTW this conversation might be more suited for Slack. If you are on the EB Slack you can @ me as Davide Grassano.

Crivella · 2024-09-09T08:21:43Z

To summarize the problem @beeebiii was having: the -tp=px flag was causing compilation trouble, which was resolved by commenting out the 'optarch': 'GENERIC', line in the OpenBLAS easyconfig (so that the flag becomes -tp=host).
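
For reference, the workaround corresponds to the following change in the toolchainopts of the OpenBLAS easyconfig from this PR. This is a sketch of a local workaround on the affected system, not a change to the easyconfig in the PR itself.

```python
# toolchainopts of OpenBLAS-0.3.24-NVHPC-23.7-CUDA-12.1.1.eb,
# with the 'optarch' entry commented out as a local workaround.
toolchainopts = {
    # https://github.com/OpenMathLib/OpenBLAS/issues/4625
    'precise': True,
    # With 'GENERIC' the build used -tp=px, which failed to compile on the
    # affected system; without it the build uses -tp=host.
    # 'optarch': 'GENERIC',
}
```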

This reverts commit 706e9d1.

yqshao · 2024-10-25T16:41:41Z

I was trying this and I think the 55 "other error" that @cgross95 observed was discussed and resolved in #19021

github-actions · 2024-12-05T10:32:22Z

Updated software `FFTW.MPI-3.3.10-nvompi-2023a.eb`

Diff against FFTW.MPI-3.3.10-gompi-2025a.eb

easybuild/easyconfigs/f/FFTW.MPI/FFTW.MPI-3.3.10-gompi-2025a.eb

diff --git a/easybuild/easyconfigs/f/FFTW.MPI/FFTW.MPI-3.3.10-gompi-2025a.eb b/easybuild/easyconfigs/f/FFTW.MPI/FFTW.MPI-3.3.10-nvompi-2023a.eb
index 2e37427de1..6a87b1494a 100644
--- a/easybuild/easyconfigs/f/FFTW.MPI/FFTW.MPI-3.3.10-gompi-2025a.eb
+++ b/easybuild/easyconfigs/f/FFTW.MPI/FFTW.MPI-3.3.10-nvompi-2023a.eb
@@ -5,14 +5,17 @@ homepage = 'https://www.fftw.org'
 description = """FFTW is a C subroutine library for computing the discrete Fourier transform (DFT)
 in one or more dimensions, of arbitrary input size, and of both real and complex data."""
 
-toolchain = {'name': 'gompi', 'version': '2025a'}
+toolchain = {'name': 'nvompi', 'version': '2023a'}
 toolchainopts = {'pic': True}
 
 source_urls = [homepage]
 sources = ['fftw-%(version)s.tar.gz']
 checksums = ['56c932549852cddcfafdab3820b0200c7742675be92179e59e6215b340e26467']
 
-dependencies = [('FFTW', '3.3.10')]
+local_cuda = '12.1.1'
+local_compiler = ('NVHPC', '23.7-CUDA-%s' % local_cuda)
+
+dependencies = [('FFTW', '3.3.10', '', local_compiler)]
 
 runtest = 'check'

Diff against FFTW.MPI-3.3.10-iimpi-2023a.eb

easybuild/easyconfigs/f/FFTW.MPI/FFTW.MPI-3.3.10-iimpi-2023a.eb

diff --git a/easybuild/easyconfigs/f/FFTW.MPI/FFTW.MPI-3.3.10-iimpi-2023a.eb b/easybuild/easyconfigs/f/FFTW.MPI/FFTW.MPI-3.3.10-nvompi-2023a.eb
index edd9aa0119..6a87b1494a 100644
--- a/easybuild/easyconfigs/f/FFTW.MPI/FFTW.MPI-3.3.10-iimpi-2023a.eb
+++ b/easybuild/easyconfigs/f/FFTW.MPI/FFTW.MPI-3.3.10-nvompi-2023a.eb
@@ -5,14 +5,17 @@ homepage = 'https://www.fftw.org'
 description = """FFTW is a C subroutine library for computing the discrete Fourier transform (DFT)
 in one or more dimensions, of arbitrary input size, and of both real and complex data."""
 
-toolchain = {'name': 'iimpi', 'version': '2023a'}
+toolchain = {'name': 'nvompi', 'version': '2023a'}
 toolchainopts = {'pic': True}
 
 source_urls = [homepage]
 sources = ['fftw-%(version)s.tar.gz']
 checksums = ['56c932549852cddcfafdab3820b0200c7742675be92179e59e6215b340e26467']
 
-dependencies = [('FFTW', '3.3.10')]
+local_cuda = '12.1.1'
+local_compiler = ('NVHPC', '23.7-CUDA-%s' % local_cuda)
+
+dependencies = [('FFTW', '3.3.10', '', local_compiler)]
 
 runtest = 'check'

Updated software `FFTW-3.3.10-NVHPC-23.7-CUDA-12.1.1.eb`

Diff against FFTW-3.3.10-intel-compilers-2023.1.0.eb

easybuild/easyconfigs/f/FFTW/FFTW-3.3.10-intel-compilers-2023.1.0.eb

diff --git a/easybuild/easyconfigs/f/FFTW/FFTW-3.3.10-intel-compilers-2023.1.0.eb b/easybuild/easyconfigs/f/FFTW/FFTW-3.3.10-NVHPC-23.7-CUDA-12.1.1.eb
index 79fb090b48..670602aadc 100644
--- a/easybuild/easyconfigs/f/FFTW/FFTW-3.3.10-intel-compilers-2023.1.0.eb
+++ b/easybuild/easyconfigs/f/FFTW/FFTW-3.3.10-NVHPC-23.7-CUDA-12.1.1.eb
@@ -5,22 +5,16 @@ homepage = 'https://www.fftw.org'
 description = """FFTW is a C subroutine library for computing the discrete Fourier transform (DFT)
 in one or more dimensions, of arbitrary input size, and of both real and complex data."""
 
-toolchain = {'name': 'intel-compilers', 'version': '2023.1.0'}
+toolchain = {'name': 'NVHPC', 'version': '23.7-CUDA-12.1.1'}
 toolchainopts = {'pic': True}
 
 source_urls = [homepage]
 sources = [SOURCELOWER_TAR_GZ]
 checksums = ['56c932549852cddcfafdab3820b0200c7742675be92179e59e6215b340e26467']
 
-# no quad precision, requires GCC v4.6 or higher
-# see also
-# https://www.fftw.org/doc/Extended-and-quadruple-precision-in-Fortran.html
+# Does not work with nvc
 with_quad_prec = False
 
-# compilation fails on AMD systems when configuring with --enable-avx-128-fma,
-# because Intel compilers do not support FMA4 instructions
-use_fma4 = False
-
 runtest = 'check'
 
 moduleclass = 'numlib'

Diff against FFTW-3.3.10-GCC-14.2.0.eb

easybuild/easyconfigs/f/FFTW/FFTW-3.3.10-GCC-14.2.0.eb

diff --git a/easybuild/easyconfigs/f/FFTW/FFTW-3.3.10-GCC-14.2.0.eb b/easybuild/easyconfigs/f/FFTW/FFTW-3.3.10-NVHPC-23.7-CUDA-12.1.1.eb
index 6ea1d560d5..670602aadc 100644
--- a/easybuild/easyconfigs/f/FFTW/FFTW-3.3.10-GCC-14.2.0.eb
+++ b/easybuild/easyconfigs/f/FFTW/FFTW-3.3.10-NVHPC-23.7-CUDA-12.1.1.eb
@@ -5,13 +5,16 @@ homepage = 'https://www.fftw.org'
 description = """FFTW is a C subroutine library for computing the discrete Fourier transform (DFT)
 in one or more dimensions, of arbitrary input size, and of both real and complex data."""
 
-toolchain = {'name': 'GCC', 'version': '14.2.0'}
+toolchain = {'name': 'NVHPC', 'version': '23.7-CUDA-12.1.1'}
 toolchainopts = {'pic': True}
 
 source_urls = [homepage]
 sources = [SOURCELOWER_TAR_GZ]
 checksums = ['56c932549852cddcfafdab3820b0200c7742675be92179e59e6215b340e26467']
 
+# Does not work with nvc
+with_quad_prec = False
+
 runtest = 'check'
 
 moduleclass = 'numlib'

Updated software `HDF5-1.14.0-nvompi-2023a-CUDA-12.1.1.eb`

Diff against HDF5-1.14.0-iompi-2023a.eb

easybuild/easyconfigs/h/HDF5/HDF5-1.14.0-iompi-2023a.eb

diff --git a/easybuild/easyconfigs/h/HDF5/HDF5-1.14.0-iompi-2023a.eb b/easybuild/easyconfigs/h/HDF5/HDF5-1.14.0-nvompi-2023a-CUDA-12.1.1.eb
index d5a5f720c7..74256e2cb4 100644
--- a/easybuild/easyconfigs/h/HDF5/HDF5-1.14.0-iompi-2023a.eb
+++ b/easybuild/easyconfigs/h/HDF5/HDF5-1.14.0-nvompi-2023a-CUDA-12.1.1.eb
@@ -1,26 +1,27 @@
 name = 'HDF5'
 # Note: Odd minor releases are only RCs and should not be used.
 version = '1.14.0'
+versionsuffix = '-CUDA-%(cudaver)s'
 
 homepage = 'https://portal.hdfgroup.org/display/support'
 description = """HDF5 is a data model, library, and file format for storing and managing data.
  It supports an unlimited variety of datatypes, and is designed for flexible
  and efficient I/O and for high volume and complex data."""
 
-toolchain = {'name': 'iompi', 'version': '2023a'}
+toolchain = {'name': 'nvompi', 'version': '2023a'}
 toolchainopts = {'pic': True, 'usempi': True}
 
 source_urls = ['https://support.hdfgroup.org/ftp/HDF5/releases/hdf5-%(version_major_minor)s/hdf5-%(version)s/src']
 sources = [SOURCELOWER_TAR_GZ]
 checksums = ['a571cc83efda62e1a51a0a912dd916d01895801c5025af91669484a1575a6ef4']
 
-# replace src include path with installation dir for $H5BLD_CPPFLAGS
-_regex = 's, -I[^[:space:]]+H5FDsubfiling , -I%(installdir)s/include ,g'
-postinstallcmds = ['sed -i -r "%s" %%(installdir)s/bin/%s' % (_regex, x) for x in ['h5c++', 'h5pcc']]
+local_gcc_compiler = ('GCCcore', '12.3.0')
+# local_compiler = ('NVHPC', '23.7-CUDA-12.1.1')
 
 dependencies = [
-    ('zlib', '1.2.13'),
-    ('Szip', '2.1.1'),
+    ('CUDA', '12.1.1', '', SYSTEM),
+    ('zlib', '1.2.13', '', local_gcc_compiler),
+    ('Szip', '2.1.1', '', local_gcc_compiler),
 ]
 
 moduleclass = 'data'

Diff against HDF5-1.14.5-gompi-2024a.eb

easybuild/easyconfigs/h/HDF5/HDF5-1.14.5-gompi-2024a.eb

diff --git a/easybuild/easyconfigs/h/HDF5/HDF5-1.14.5-gompi-2024a.eb b/easybuild/easyconfigs/h/HDF5/HDF5-1.14.0-nvompi-2023a-CUDA-12.1.1.eb
index 5917196292..74256e2cb4 100644
--- a/easybuild/easyconfigs/h/HDF5/HDF5-1.14.5-gompi-2024a.eb
+++ b/easybuild/easyconfigs/h/HDF5/HDF5-1.14.0-nvompi-2023a-CUDA-12.1.1.eb
@@ -1,27 +1,27 @@
 name = 'HDF5'
 # Note: Odd minor releases are only RCs and should not be used.
-version = '1.14.5'
+version = '1.14.0'
+versionsuffix = '-CUDA-%(cudaver)s'
 
 homepage = 'https://portal.hdfgroup.org/display/support'
 description = """HDF5 is a data model, library, and file format for storing and managing data.
  It supports an unlimited variety of datatypes, and is designed for flexible
  and efficient I/O and for high volume and complex data."""
 
-toolchain = {'name': 'gompi', 'version': '2024a'}
+toolchain = {'name': 'nvompi', 'version': '2023a'}
 toolchainopts = {'pic': True, 'usempi': True}
 
-source_urls = ['https://github.com/HDFGroup/hdf5/archive']
-sources = ['hdf5_%(version)s.tar.gz']
-checksums = ['c83996dc79080a34e7b5244a1d5ea076abfd642ec12d7c25388e2fdd81d26350']
+source_urls = ['https://support.hdfgroup.org/ftp/HDF5/releases/hdf5-%(version_major_minor)s/hdf5-%(version)s/src']
+sources = [SOURCELOWER_TAR_GZ]
+checksums = ['a571cc83efda62e1a51a0a912dd916d01895801c5025af91669484a1575a6ef4']
 
-dependencies = [
-    ('zlib', '1.3.1'),
-    ('Szip', '2.1.1'),
-]
+local_gcc_compiler = ('GCCcore', '12.3.0')
+# local_compiler = ('NVHPC', '23.7-CUDA-12.1.1')
 
-postinstallcmds = [
-    'sed -i -r "s, -I[^[:space:]]+H5FDsubfiling , -I%(installdir)s/include ,g" %(installdir)s/bin/h5c++',
-    'sed -i -r "s, -I[^[:space:]]+H5FDsubfiling , -I%(installdir)s/include ,g" %(installdir)s/bin/h5pcc',
+dependencies = [
+    ('CUDA', '12.1.1', '', SYSTEM),
+    ('zlib', '1.2.13', '', local_gcc_compiler),
+    ('Szip', '2.1.1', '', local_gcc_compiler),
 ]
 
 moduleclass = 'data'

Updated software `libxc-6.2.2-NVHPC-23.7-CUDA-12.1.1.eb`

Diff against libxc-6.2.2-GCC-13.2.0-nofhc.eb

easybuild/easyconfigs/l/libxc/libxc-6.2.2-GCC-13.2.0-nofhc.eb

diff --git a/easybuild/easyconfigs/l/libxc/libxc-6.2.2-GCC-13.2.0-nofhc.eb b/easybuild/easyconfigs/l/libxc/libxc-6.2.2-NVHPC-23.7-CUDA-12.1.1.eb
index fdd9339800..2cacc1fb9d 100644
--- a/easybuild/easyconfigs/l/libxc/libxc-6.2.2-GCC-13.2.0-nofhc.eb
+++ b/easybuild/easyconfigs/l/libxc/libxc-6.2.2-NVHPC-23.7-CUDA-12.1.1.eb
@@ -2,25 +2,22 @@ easyblock = 'CMakeMake'
 
 name = 'libxc'
 version = '6.2.2'
-versionsuffix = '-nofhc'
 
 homepage = 'https://libxc.gitlab.io'
 description = """Libxc is a library of exchange-correlation functionals for density-functional theory.
  The aim is to provide a portable, well tested and reliable set of exchange and correlation functionals."""
 
-toolchain = {'name': 'GCC', 'version': '13.2.0'}
+toolchain = {'name': 'NVHPC', 'version': '23.7-CUDA-12.1.1'}
 
 source_urls = ['https://gitlab.com/libxc/libxc/-/archive/%(version)s/']
 sources = [SOURCE_TAR_GZ]
-checksums = [
-    ('a0f6f1bba7ba5c0c85b2bfe65aca1591025f509a7f11471b4cd651a79491b045',
-     '3b0523924579cf494cafc6fea92945257f35692b004217d3dfd3ea7ca780e8dc',
-     'd1b65ef74615a1e539d87a0e6662f04baf3a2316706b4e2e686da3193b26b20f'),
-]
+checksums = [('a0f6f1bba7ba5c0c85b2bfe65aca1591025f509a7f11471b4cd651a79491b045',
+              '3b0523924579cf494cafc6fea92945257f35692b004217d3dfd3ea7ca780e8dc'),
+             ]
 
 builddependencies = [
-    ('CMake', '3.27.6'),
-    ('Perl', '5.38.0'),
+    ('CMake', '3.26.3'),
+    ('Perl', '5.36.1'),
 ]
 
 local_common_configopts = "-DENABLE_FORTRAN=ON -DENABLE_XHOST=OFF "
@@ -30,9 +27,6 @@ local_common_configopts = "-DENABLE_FORTRAN=ON -DENABLE_XHOST=OFF "
 # see also https://github.com/pyscf/pyscf/issues/1103
 local_common_configopts += "-DDISABLE_KXC=OFF -DDISABLE_LXC=OFF"
 
-# Disable fhc, this needs to support codes (like VASP) relying on the projector augmented wave (PAW) approach
-local_common_configopts += ' -DDISABLE_FHC=ON'
-
 # perform iterative build to get both static and shared libraries
 configopts = [
     local_common_configopts + ' -DBUILD_SHARED_LIBS=OFF',

Diff against libxc-6.2.2-GCC-13.3.0.eb

easybuild/easyconfigs/l/libxc/libxc-6.2.2-GCC-13.3.0.eb

diff --git a/easybuild/easyconfigs/l/libxc/libxc-6.2.2-GCC-13.3.0.eb b/easybuild/easyconfigs/l/libxc/libxc-6.2.2-NVHPC-23.7-CUDA-12.1.1.eb
index 9e4de5e808..2cacc1fb9d 100644
--- a/easybuild/easyconfigs/l/libxc/libxc-6.2.2-GCC-13.3.0.eb
+++ b/easybuild/easyconfigs/l/libxc/libxc-6.2.2-NVHPC-23.7-CUDA-12.1.1.eb
@@ -7,19 +7,17 @@ homepage = 'https://libxc.gitlab.io'
 description = """Libxc is a library of exchange-correlation functionals for density-functional theory.
  The aim is to provide a portable, well tested and reliable set of exchange and correlation functionals."""
 
-toolchain = {'name': 'GCC', 'version': '13.3.0'}
+toolchain = {'name': 'NVHPC', 'version': '23.7-CUDA-12.1.1'}
 
-source_urls = ['https://gitlab.com/%(name)s/%(name)s/-/archive/%(version)s/']
+source_urls = ['https://gitlab.com/libxc/libxc/-/archive/%(version)s/']
 sources = [SOURCE_TAR_GZ]
-checksums = [
-    ('a0f6f1bba7ba5c0c85b2bfe65aca1591025f509a7f11471b4cd651a79491b045',
-     '3b0523924579cf494cafc6fea92945257f35692b004217d3dfd3ea7ca780e8dc',
-     'd1b65ef74615a1e539d87a0e6662f04baf3a2316706b4e2e686da3193b26b20f'),
-]
+checksums = [('a0f6f1bba7ba5c0c85b2bfe65aca1591025f509a7f11471b4cd651a79491b045',
+              '3b0523924579cf494cafc6fea92945257f35692b004217d3dfd3ea7ca780e8dc'),
+             ]
 
 builddependencies = [
-    ('CMake', '3.29.3'),
-    ('Perl', '5.38.2'),
+    ('CMake', '3.26.3'),
+    ('Perl', '5.36.1'),
 ]
 
 local_common_configopts = "-DENABLE_FORTRAN=ON -DENABLE_XHOST=OFF "

Updated software `nvompi-2023a.eb`

Diff against nvompi-2022.07.eb

easybuild/easyconfigs/n/nvompi/nvompi-2022.07.eb

diff --git a/easybuild/easyconfigs/n/nvompi/nvompi-2022.07.eb b/easybuild/easyconfigs/n/nvompi/nvompi-2023a.eb
index 1a1647cbfa..31a17edb3f 100644
--- a/easybuild/easyconfigs/n/nvompi/nvompi-2022.07.eb
+++ b/easybuild/easyconfigs/n/nvompi/nvompi-2023a.eb
@@ -1,19 +1,20 @@
 easyblock = 'Toolchain'
 
 name = 'nvompi'
-version = '2022.07'
+version = '2023a'
 
 homepage = '(none)'
 description = 'NVHPC based compiler toolchain, including OpenMPI for MPI support.'
 
 toolchain = SYSTEM
 
-local_compiler = ('NVHPC', '22.7-CUDA-11.7.0')
+local_cuda = '12.1.1'
+local_compiler = ('NVHPC', '23.7-CUDA-%s' % local_cuda)
 
 dependencies = [
     local_compiler,
-    ('OpenMPI', '4.1.4', '', local_compiler),
-    ('CUDA', '11.7.0', '', SYSTEM),
+    ('CUDA', local_cuda, '', SYSTEM),
+    ('OpenMPI', '4.1.5', '', local_compiler),
 ]
 
 moduleclass = 'toolchain'

Updated software `OpenBLAS-0.3.24-NVHPC-23.7-CUDA-12.1.1.eb`

Diff against OpenBLAS-0.3.29-GCC-14.2.0.eb

easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.29-GCC-14.2.0.eb

diff --git a/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.29-GCC-14.2.0.eb b/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.24-NVHPC-23.7-CUDA-12.1.1.eb
index f30506242a..db86d55f15 100644
--- a/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.29-GCC-14.2.0.eb
+++ b/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.24-NVHPC-23.7-CUDA-12.1.1.eb
@@ -1,10 +1,15 @@
 name = 'OpenBLAS'
-version = '0.3.29'
+version = '0.3.24'
 
-homepage = 'https://www.openblas.net/'
+homepage = 'http://www.openblas.net/'
 description = "OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version."
 
-toolchain = {'name': 'GCC', 'version': '14.2.0'}
+toolchain = {'name': 'NVHPC', 'version': '23.7-CUDA-12.1.1'}
+toolchainopts = {
+    # https://github.com/OpenMathLib/OpenBLAS/issues/4625
+    'precise': True,
+    'optarch': 'GENERIC',
+}
 
 source_urls = [
     # order matters, trying to download the large.tgz/timing.tgz LAPACK tarballs from GitHub causes trouble
@@ -16,22 +21,30 @@ patches = [
     ('large.tgz', '.'),
     ('timing.tgz', '.'),
     'OpenBLAS-0.3.15_workaround-gcc-miscompilation.patch',
+    'OpenBLAS-0.3.20_use-xASUM-microkernels-on-new-intel-cpus.patch',
     'OpenBLAS-0.3.21_fix-order-vectorization.patch',
+    'OpenBLAS-0.3.23_disable-xDRGES-LAPACK-test.patch',
+    'OpenBLAS-0.3.23_lapack_test_nomain.patch'  # https://github.com/OpenMathLib/OpenBLAS/issues/4625
 ]
 checksums = [
-    {'v0.3.29.tar.gz': '38240eee1b29e2bde47ebb5d61160207dc68668a54cac62c076bb5032013b1eb'},
+    {'v0.3.24.tar.gz': 'ceadc5065da97bd92404cac7254da66cc6eb192679cf1002098688978d4d5132'},
     {'large.tgz': 'f328d88b7fa97722f271d7d0cfea1c220e0f8e5ed5ff01d8ef1eb51d6f4243a1'},
     {'timing.tgz': '999c65f8ea8bd4eac7f1c7f3463d4946917afd20a997807300fe35d70122f3af'},
     {'OpenBLAS-0.3.15_workaround-gcc-miscompilation.patch':
      'e6b326fb8c4a8a6fd07741d9983c37a72c55c9ff9a4f74a80e1352ce5f975971'},
+    {'OpenBLAS-0.3.20_use-xASUM-microkernels-on-new-intel-cpus.patch':
+     '1dbd0f9473963dbdd9131611b455d8a801f1e995eae82896186d3d3ffe6d5f03'},
     {'OpenBLAS-0.3.21_fix-order-vectorization.patch':
      '08af834e5d60441fd35c128758ed9c092ba6887c829e0471ecd489079539047d'},
+    {'OpenBLAS-0.3.23_disable-xDRGES-LAPACK-test.patch':
+     'ab7e0af05f9b2a2ced32f3875e1e3767d9c3531a455421a38f7324350178a0ff'},
+    {'OpenBLAS-0.3.23_lapack_test_nomain.patch': '63e14e2cb67dd81ecc13fa0e07685f853abe0669d4cadd520247b88789346948'},
 ]
 
 builddependencies = [
     ('make', '4.4.1'),
     # required by LAPACK test suite
-    ('Python', '3.13.1'),
+    ('Python', '3.11.3'),
 ]
 
 run_lapack_tests = True

Diff against OpenBLAS-0.3.27-GCC-13.3.0-seq-iface64.eb

easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.27-GCC-13.3.0-seq-iface64.eb

diff --git a/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.27-GCC-13.3.0-seq-iface64.eb b/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.24-NVHPC-23.7-CUDA-12.1.1.eb
index 5527c667f6..db86d55f15 100644
--- a/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.27-GCC-13.3.0-seq-iface64.eb
+++ b/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.24-NVHPC-23.7-CUDA-12.1.1.eb
@@ -1,11 +1,15 @@
 name = 'OpenBLAS'
-version = '0.3.27'
-versionsuffix = '-seq-iface64'
+version = '0.3.24'
 
-homepage = 'https://www.openblas.net/'
+homepage = 'http://www.openblas.net/'
 description = "OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version."
 
-toolchain = {'name': 'GCC', 'version': '13.3.0'}
+toolchain = {'name': 'NVHPC', 'version': '23.7-CUDA-12.1.1'}
+toolchainopts = {
+    # https://github.com/OpenMathLib/OpenBLAS/issues/4625
+    'precise': True,
+    'optarch': 'GENERIC',
+}
 
 source_urls = [
     # order matters, trying to download the large.tgz/timing.tgz LAPACK tarballs from GitHub causes trouble
@@ -17,40 +21,32 @@ patches = [
     ('large.tgz', '.'),
     ('timing.tgz', '.'),
     'OpenBLAS-0.3.15_workaround-gcc-miscompilation.patch',
+    'OpenBLAS-0.3.20_use-xASUM-microkernels-on-new-intel-cpus.patch',
     'OpenBLAS-0.3.21_fix-order-vectorization.patch',
-    'OpenBLAS-0.3.26_lapack_qr_noninittest.patch',
-    'OpenBLAS-0.3.27_fix_zscal.patch',
-    'OpenBLAS-0.3.27_riscv-drop-static-fortran-flag.patch',
+    'OpenBLAS-0.3.23_disable-xDRGES-LAPACK-test.patch',
+    'OpenBLAS-0.3.23_lapack_test_nomain.patch'  # https://github.com/OpenMathLib/OpenBLAS/issues/4625
 ]
 checksums = [
-    {'v0.3.27.tar.gz': 'aa2d68b1564fe2b13bc292672608e9cdeeeb6dc34995512e65c3b10f4599e897'},
+    {'v0.3.24.tar.gz': 'ceadc5065da97bd92404cac7254da66cc6eb192679cf1002098688978d4d5132'},
     {'large.tgz': 'f328d88b7fa97722f271d7d0cfea1c220e0f8e5ed5ff01d8ef1eb51d6f4243a1'},
     {'timing.tgz': '999c65f8ea8bd4eac7f1c7f3463d4946917afd20a997807300fe35d70122f3af'},
     {'OpenBLAS-0.3.15_workaround-gcc-miscompilation.patch':
      'e6b326fb8c4a8a6fd07741d9983c37a72c55c9ff9a4f74a80e1352ce5f975971'},
+    {'OpenBLAS-0.3.20_use-xASUM-microkernels-on-new-intel-cpus.patch':
+     '1dbd0f9473963dbdd9131611b455d8a801f1e995eae82896186d3d3ffe6d5f03'},
     {'OpenBLAS-0.3.21_fix-order-vectorization.patch':
      '08af834e5d60441fd35c128758ed9c092ba6887c829e0471ecd489079539047d'},
-    {'OpenBLAS-0.3.26_lapack_qr_noninittest.patch': '4781bf1d7b239374fd8069e15b4e2c0ef0e8efaa1a7d4c33557bd5b27e5de77c'},
-    {'OpenBLAS-0.3.27_fix_zscal.patch': '9210d7b66538dabaddbe1bfceb16f8225708856f60876ca5561b19d3599f9fd1'},
-    {'OpenBLAS-0.3.27_riscv-drop-static-fortran-flag.patch':
-     'f374e41efffd592ab1c9034df9e7abf1045ed151f4fc0fd0da618ce9826f2d4b'},
+    {'OpenBLAS-0.3.23_disable-xDRGES-LAPACK-test.patch':
+     'ab7e0af05f9b2a2ced32f3875e1e3767d9c3531a455421a38f7324350178a0ff'},
+    {'OpenBLAS-0.3.23_lapack_test_nomain.patch': '63e14e2cb67dd81ecc13fa0e07685f853abe0669d4cadd520247b88789346948'},
 ]
 
 builddependencies = [
     ('make', '4.4.1'),
     # required by LAPACK test suite
-    ('Python', '3.12.3'),
+    ('Python', '3.11.3'),
 ]
 
-# INTERFACE64=1 needs if you link OpenBLAS for fortran code compied with 64 bit integers (-i8)
-# This would be in intel library naming convention ilp64
-# The USE_OPENMP=0 and USE_THREAD=0 needs for the single threaded version
-# The USE_LOCKING=1 needs for thread safe version (if threaded software calls OpenBLAS, without it
-# OpenBLAS is not thread safe (so only single threaded software would be able to use it)
-buildopts = "INTERFACE64=1 USE_OPENMP=0 USE_THREAD=0 USE_LOCKING=1 "
-testopts = buildopts
-installopts = buildopts
-
 run_lapack_tests = True
 max_failing_lapack_tests_num_errors = 150

Updated software `OpenMPI-4.1.5-NVHPC-23.7-CUDA-12.1.1.eb`

Diff against OpenMPI-5.0.7-GCC-14.2.0.eb

easybuild/easyconfigs/o/OpenMPI/OpenMPI-5.0.7-GCC-14.2.0.eb

diff --git a/easybuild/easyconfigs/o/OpenMPI/OpenMPI-5.0.7-GCC-14.2.0.eb b/easybuild/easyconfigs/o/OpenMPI/OpenMPI-4.1.5-NVHPC-23.7-CUDA-12.1.1.eb
index e00af97c34..f858d99eb9 100644
--- a/easybuild/easyconfigs/o/OpenMPI/OpenMPI-5.0.7-GCC-14.2.0.eb
+++ b/easybuild/easyconfigs/o/OpenMPI/OpenMPI-4.1.5-NVHPC-23.7-CUDA-12.1.1.eb
@@ -1,45 +1,72 @@
 name = 'OpenMPI'
-version = '5.0.7'
+version = '4.1.5'
 
 homepage = 'https://www.open-mpi.org/'
 description = """The Open MPI Project is an open source MPI-3 implementation."""
 
-toolchain = {'name': 'GCC', 'version': '14.2.0'}
+toolchain = {'name': 'NVHPC', 'version': '23.7-CUDA-12.1.1'}
+toolchainopts = {
+    'extra_fcflags': '-Mstandard',  # https://forums.developer.nvidia.com/t/howto-build-openmpi-with-nvhpc-24-1/283219
+}
 
 source_urls = ['https://www.open-mpi.org/software/ompi/v%(version_major_minor)s/downloads']
 sources = [SOURCELOWER_TAR_BZ2]
 patches = [
-    ('OpenMPI-5.0.6_build-with-internal-cuda-header.patch', 1),
-    ('OpenMPI-5.0.7_fix-sshmem-build-failure.patch'),
+    'OpenMPI-4.1.1_build-with-internal-cuda-header.patch',
+    'OpenMPI-4.1.1_opal-datatype-cuda-performance.patch',
+    'OpenMPI-4.1.5_fix-pmix3x.patch',
+    'OpenMPI-4.1.x_add_atomic_wmb.patch',
 ]
 checksums = [
-    {'openmpi-5.0.7.tar.bz2': '119f2009936a403334d0df3c0d74d5595a32d99497f9b1d41e90019fee2fc2dd'},
-    {'OpenMPI-5.0.6_build-with-internal-cuda-header.patch':
-     '4821f0740ae4b97f3ff5259f7bac67a11d8cdeede3b1425825c241cf6a2864bb'},
-    {'OpenMPI-5.0.7_fix-sshmem-build-failure.patch':
-     '7382a5bbe44c6eff9ab05c8f315a8911d529749655126d4375e44e809bfedec7'},
+    {'openmpi-4.1.5.tar.bz2': 'a640986bc257389dd379886fdae6264c8cfa56bc98b71ce3ae3dfbd8ce61dbe3'},
+    {'OpenMPI-4.1.1_build-with-internal-cuda-header.patch':
+     '63eac52736bdf7644c480362440a7f1f0ae7c7cae47b7565f5635c41793f8c83'},
+    {'OpenMPI-4.1.1_opal-datatype-cuda-performance.patch':
+     'b767c7166cf0b32906132d58de5439c735193c9fd09ec3c5c11db8d5fa68750e'},
+    {'OpenMPI-4.1.5_fix-pmix3x.patch': '46edac3dbf32f2a611d45e8a3c8edd3ae2f430eec16a1373b510315272115c40'},
+    {'OpenMPI-4.1.x_add_atomic_wmb.patch': '9494bbc546d661ba5189e44b4c84a7f8df30a87cdb9d96ce2e73a7c8fecba172'},
 ]
 
 builddependencies = [
-    ('pkgconf', '2.3.0'),
-    ('Autotools', '20240712'),
+    ('pkgconf', '1.9.5'),
+    ('Perl', '5.36.1'),
+    ('Autotools', '20220317'),
 ]
 
 dependencies = [
-    ('zlib', '1.3.1'),
-    ('hwloc', '2.11.2'),
+    ('zlib', '1.2.13'),
+    ('hwloc', '2.9.1'),
     ('libevent', '2.1.12'),
-    ('UCX', '1.18.0'),
-    ('libfabric', '2.0.0'),
-    ('PMIx', '5.0.6'),
-    ('PRRTE', '3.0.8'),
-    ('UCC', '1.3.0'),
+    ('UCX', '1.14.1'),
+    ('UCX-CUDA', '1.14.1', '-CUDA-%(cudaver)s'),
+    ('libfabric', '1.18.0'),
+    ('PMIx', '4.2.4'),
+    ('UCC', '1.2.0'),
+    ('UCC-CUDA', '1.2.0', '-CUDA-%(cudaver)s'),
 ]
 
+# Update configure to include changes from the "internal-cuda" patch
+# by running a subset of autogen.pl sufficient to achieve this
+# without doing the full, long-running regeneration.
+preconfigopts = ' && '.join([
+    'cd config',
+    'autom4te --language=m4sh opal_get_version.m4sh -o opal_get_version.sh',
+    'cd ..',
+    'autoconf',
+    'autoheader',
+    'aclocal',
+    'automake',
+    ''
+])
+
 # CUDA related patches and custom configure option can be removed if CUDA support isn't wanted.
-preconfigopts = 'gcc -Iopal/mca/cuda/include -shared opal/mca/cuda/lib/cuda.c -o opal/mca/cuda/lib/libcuda.so && '
-configopts = '--with-cuda=%(start_dir)s/opal/mca/cuda --with-show-load-errors=no'
-# Do not pick up the system library automatically
-configopts += ' --without-xpmem'
+configopts = '--with-cuda=internal '
+configopts += ' CC=pgcc CXX=pgc++ FC=pgfortran'
+
+# disable MPI1 compatibility for now, see what breaks...
+# configopts += '--enable-mpi1-compatibility '
+
+# to enable SLURM integration (site-specific)
+# configopts += '--with-slurm --with-pmi=/usr/include/slurm --with-pmi-libdir=/usr'
 
 moduleclass = 'mpi'
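
The `preconfigopts` value in this diff is a shell prefix built with `' && '.join(...)`, and the empty last element is what leaves a trailing ` && ` for the configure command to follow. A small sketch of how the string comes out (`./configure` is only a placeholder for whatever command EasyBuild actually appends):

```python
# Same list as in the easyconfig; the trailing '' yields the final ' && '.
steps = [
    'cd config',
    'autom4te --language=m4sh opal_get_version.m4sh -o opal_get_version.sh',
    'cd ..',
    'autoconf',
    'autoheader',
    'aclocal',
    'automake',
    '',
]
preconfigopts = ' && '.join(steps)

# The prefix ends with ' && ', so the next command can be appended directly.
full_command = preconfigopts + './configure'
print(full_command)
```

Because every step is chained with `&&`, a failure in any regeneration step stops the chain before the configure command runs.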

Diff against OpenMPI-5.0.3-NVHPC-24.9-CUDA-12.6.0.eb

easybuild/easyconfigs/o/OpenMPI/OpenMPI-5.0.3-NVHPC-24.9-CUDA-12.6.0.eb

diff --git a/easybuild/easyconfigs/o/OpenMPI/OpenMPI-5.0.3-NVHPC-24.9-CUDA-12.6.0.eb b/easybuild/easyconfigs/o/OpenMPI/OpenMPI-4.1.5-NVHPC-23.7-CUDA-12.1.1.eb
index bde23ecd78..f858d99eb9 100644
--- a/easybuild/easyconfigs/o/OpenMPI/OpenMPI-5.0.3-NVHPC-24.9-CUDA-12.6.0.eb
+++ b/easybuild/easyconfigs/o/OpenMPI/OpenMPI-4.1.5-NVHPC-23.7-CUDA-12.1.1.eb
@@ -1,65 +1,72 @@
 name = 'OpenMPI'
-version = '5.0.3'
+version = '4.1.5'
 
 homepage = 'https://www.open-mpi.org/'
 description = """The Open MPI Project is an open source MPI-3 implementation."""
 
-toolchain = {'name': 'NVHPC', 'version': '24.9-CUDA-12.6.0'}
+toolchain = {'name': 'NVHPC', 'version': '23.7-CUDA-12.1.1'}
+toolchainopts = {
+    'extra_fcflags': '-Mstandard',  # https://forums.developer.nvidia.com/t/howto-build-openmpi-with-nvhpc-24-1/283219
+}
 
 source_urls = ['https://www.open-mpi.org/software/ompi/v%(version_major_minor)s/downloads']
 sources = [SOURCELOWER_TAR_BZ2]
 patches = [
-    'OpenMPI-5.0.3_fix_hle_make_errors.patch',
-    'OpenMPI-5.0.3_disable_opal_path_nfs_test.patch',
-    ('OpenMPI-5.0.2_build-with-internal-cuda-header.patch', 1)
+    'OpenMPI-4.1.1_build-with-internal-cuda-header.patch',
+    'OpenMPI-4.1.1_opal-datatype-cuda-performance.patch',
+    'OpenMPI-4.1.5_fix-pmix3x.patch',
+    'OpenMPI-4.1.x_add_atomic_wmb.patch',
 ]
 checksums = [
-    {'openmpi-5.0.3.tar.bz2':
-     '990582f206b3ab32e938aa31bbf07c639368e4405dca196fabe7f0f76eeda90b'},
-    {'OpenMPI-5.0.3_fix_hle_make_errors.patch':
-     '881c907a9f5901d5d6af41cd33dffdcecba4a67a9e5123e602542aea57a80895'},
-    {'OpenMPI-5.0.3_disable_opal_path_nfs_test.patch':
-     '75d4417e35252ea3a19b2792f1b06e9aeb408c253aa4921d77226d57b71dee45'},
-    {'OpenMPI-5.0.2_build-with-internal-cuda-header.patch':
-     'f52dc470543f35efef10d651dd159c771ae25f8f76a420d20d87abf4dc769ed7'},
+    {'openmpi-4.1.5.tar.bz2': 'a640986bc257389dd379886fdae6264c8cfa56bc98b71ce3ae3dfbd8ce61dbe3'},
+    {'OpenMPI-4.1.1_build-with-internal-cuda-header.patch':
+     '63eac52736bdf7644c480362440a7f1f0ae7c7cae47b7565f5635c41793f8c83'},
+    {'OpenMPI-4.1.1_opal-datatype-cuda-performance.patch':
+     'b767c7166cf0b32906132d58de5439c735193c9fd09ec3c5c11db8d5fa68750e'},
+    {'OpenMPI-4.1.5_fix-pmix3x.patch': '46edac3dbf32f2a611d45e8a3c8edd3ae2f430eec16a1373b510315272115c40'},
+    {'OpenMPI-4.1.x_add_atomic_wmb.patch': '9494bbc546d661ba5189e44b4c84a7f8df30a87cdb9d96ce2e73a7c8fecba172'},
 ]
 
 builddependencies = [
-    ('pkgconf', '2.2.0'),
-    ('Perl', '5.38.2'),
-    ('Autotools', '20231222'),
+    ('pkgconf', '1.9.5'),
+    ('Perl', '5.36.1'),
+    ('Autotools', '20220317'),
 ]
 
 dependencies = [
-    ('zlib', '1.3.1'),
-    ('hwloc', '2.10.0'),
+    ('zlib', '1.2.13'),
+    ('hwloc', '2.9.1'),
     ('libevent', '2.1.12'),
-    ('UCX', '1.16.0'),
-    ('UCX-CUDA', '1.16.0', '-CUDA-%(cudaver)s'),
-    ('libfabric', '1.21.0'),
-    ('PMIx', '5.0.2'),
-    ('PRRTE', '3.0.5'),
-    ('UCC', '1.3.0'),
-    ('UCC-CUDA', '1.3.0', '-CUDA-%(cudaver)s'),
+    ('UCX', '1.14.1'),
+    ('UCX-CUDA', '1.14.1', '-CUDA-%(cudaver)s'),
+    ('libfabric', '1.18.0'),
+    ('PMIx', '4.2.4'),
+    ('UCC', '1.2.0'),
+    ('UCC-CUDA', '1.2.0', '-CUDA-%(cudaver)s'),
 ]
 
-# CUDA related patches and custom configure option can be removed if CUDA support isn't wanted.
-preconfigopts = 'nvc -Iopal/mca/cuda/include -shared opal/mca/cuda/lib/cuda.c -o opal/mca/cuda/lib/libcuda.so && '
-# Update configure to include changes from the "disable_opal_path_nfs_test" patch
-preconfigopts += './autogen.pl --force && '
+# Update configure to include changes from the "internal-cuda" patch
+# by running a subset of autogen.pl sufficient to achieve this
+# without doing the full, long-running regeneration.
+preconfigopts = ' && '.join([
+    'cd config',
+    'autom4te --language=m4sh opal_get_version.m4sh -o opal_get_version.sh',
+    'cd ..',
+    'autoconf',
+    'autoheader',
+    'aclocal',
+    'automake',
+    ''
+])
 
-configopts = '--with-cuda=%(start_dir)s/opal/mca/cuda '
-# Required to prevent internal compiler error in opal.
-configopts += ' --enable-alt-short-float=no'
-# Do not pick up the system library automatically
-configopts += ' --without-xpmem'
-# Set PGI compilers manually, as NVHPC compilers are not correctly detected
+# CUDA related patches and custom configure option can be removed if CUDA support isn't wanted.
+configopts = '--with-cuda=internal '
 configopts += ' CC=pgcc CXX=pgc++ FC=pgfortran'
 
-# site specific options
-# configopts += ' --without-psm2 '
-# configopts += ' --disable-oshmem '
-# configopts += ' --with-gpfs '
-# configopts += ' --with-slurm '
+# disable MPI1 compatibility for now, see what breaks...
+# configopts += '--enable-mpi1-compatibility '
+
+# to enable SLURM integration (site-specific)
+# configopts += '--with-slurm --with-pmi=/usr/include/slurm --with-pmi-libdir=/usr'
 
 moduleclass = 'mpi'
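
`configopts` is assembled by plain string concatenation, so every fragment has to carry its own separating space. The sketch below (the `join_opts` helper is hypothetical, not part of EasyBuild) suggests that uncommenting the MPI-1 or SLURM lines exactly as written would glue them onto `FC=pgfortran`:

```python
configopts = '--with-cuda=internal '
configopts += ' CC=pgcc CXX=pgc++ FC=pgfortran'

# Appending the commented-out fragment as written leaves no space before it.
naive = configopts + '--enable-mpi1-compatibility '


def join_opts(*fragments):
    """Hypothetical helper: split fragments on whitespace and rejoin with single spaces."""
    return ' '.join(part for frag in fragments for part in frag.split())


safe = join_opts(configopts, '--enable-mpi1-compatibility')
print(naive)
print(safe)
```

Adding a leading space to each optional fragment would have the same effect as the helper.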

Updated software `QuantumESPRESSO-7.3.1-nvompi-2023a-CUDA-12.1.1.eb`

Diff against QuantumESPRESSO-7.4-foss-2024a-minimal.eb

easybuild/easyconfigs/q/QuantumESPRESSO/QuantumESPRESSO-7.4-foss-2024a-minimal.eb

diff --git a/easybuild/easyconfigs/q/QuantumESPRESSO/QuantumESPRESSO-7.4-foss-2024a-minimal.eb b/easybuild/easyconfigs/q/QuantumESPRESSO/QuantumESPRESSO-7.3.1-nvompi-2023a-CUDA-12.1.1.eb
index 378170ec51..609a3b47f0 100644
--- a/easybuild/easyconfigs/q/QuantumESPRESSO/QuantumESPRESSO-7.4-foss-2024a-minimal.eb
+++ b/easybuild/easyconfigs/q/QuantumESPRESSO/QuantumESPRESSO-7.3.1-nvompi-2023a-CUDA-12.1.1.eb
@@ -1,21 +1,15 @@
-# This compiles QuantumESPRESSO with minimal dependencies, the module
-# is intended to be used with the koopmans module, where a forked
-# version of qe-utils is used and doesn't support all the QE
-# dependencies when configuring with CMake.
+name = "QuantumESPRESSO"
+version = "7.3.1"
+versionsuffix = '-CUDA-%(cudaver)s'
 
-name = 'QuantumESPRESSO'
-version = '7.4'
-versionsuffix = '-minimal'
-
-homepage = 'https://www.quantum-espresso.org'
+homepage = "https://www.quantum-espresso.org"
 description = """Quantum ESPRESSO  is an integrated suite of computer codes
 for electronic-structure calculations and materials modeling at the nanoscale.
 It is based on density-functional theory, plane waves, and pseudopotentials
 (both norm-conserving and ultrasoft).
 """
 
-toolchain = {'name': 'foss', 'version': '2024a'}
-
+toolchain = {"name": "nvompi", "version": "2023a"}
 toolchainopts = {
     "usempi": True,
     "openmp": True,
@@ -23,13 +17,13 @@ toolchainopts = {
 
 # Check hashes inside external/submodule_commit_hash_records when making file for new version
 local_lapack_hash = "12d825396fcef1e0a1b27be9f119f9e554621e55"
-local_mbd_hash = "89a3cc199c0a200c9f0f688c3229ef6b9a8d63bd"
+local_mbd_hash = "82005cbb65bdf5d32ca021848eec8f19da956a77"
 local_devxlib_hash = "a6b89ef77b1ceda48e967921f1f5488d2df9226d"
 local_fox_hash = "3453648e6837658b747b895bb7bef4b1ed2eac40"
-# Different from the one at tag qe-7.4, see https://github.com/anharmonic/d3q/issues/22
-local_d3q_hash = "808acbaf012468f42147d8d6af452ec64b9e5ab0"
-# Different from the one at tag qe-7.4
-local_qe_gipaw_hash = "9b2ae1a46cae045cc04ef02c1072f2e1e74873b2"
+# Different from the one at tag qe-7.3.1 because of:
+# https://gitlab.com/QEF/q-e/-/issues/666
+local_d3q_hash = "de4718351e7bbb9d1d12aad2b7ca232d06775b83"
+local_qe_gipaw_hash = "75b01b694c9ba4df55d294cacc27cf28591b2161"
 local_qmcpack_hash = "f72ab25fa4ea755c1b4b230ae8074b47d5509c70"
 local_w90_hash = "1d6b187374a2d50b509e5e79e2cab01a79ff7ce1"
 
@@ -41,7 +35,7 @@ sources = [
         "source_urls": ["https://gitlab.com/QEF/q-e/-/archive/qe-%(version)s"],
     },
     {
-        "filename": f"lapack-{local_lapack_hash[:8]}.tar.xz",
+        "filename": "lapack-%s.tar.gz" % local_lapack_hash,
         "git_config": {
             "url": "https://github.com/Reference-LAPACK",
             "repo_name": "lapack",
@@ -49,7 +43,7 @@ sources = [
         },
     },
     {
-        "filename": f"mbd-{local_mbd_hash[:8]}.tar.xz",
+        "filename": "mbd-%s.tar.gz" % local_mbd_hash,
         "git_config": {
             "url": "https://github.com/libmbd",
             "repo_name": "libmbd",
@@ -58,7 +52,7 @@ sources = [
         },
     },
     {
-        "filename": f"devxlib-{local_devxlib_hash[:8]}.tar.xz",
+        "filename": "devxlib-%s.tar.gz" % local_devxlib_hash,
         "git_config": {
             "url": "https://gitlab.com/max-centre/components",
             "repo_name": "devicexlib",
@@ -67,7 +61,7 @@ sources = [
         },
     },
     {
-        "filename": f"d3q-{local_d3q_hash[:8]}.tar.xz",
+        "filename": "d3q-%s.tar.gz" % local_d3q_hash,
         "git_config": {
             "url": "https://github.com/anharmonic",
             "repo_name": "d3q",
@@ -75,7 +69,7 @@ sources = [
         },
     },
     {
-        "filename": f"fox-{local_fox_hash[:8]}.tar.xz",
+        "filename": "fox-%s.tar.gz" % local_fox_hash,
         "git_config": {
             "url": "https://github.com/pietrodelugas",
             "repo_name": "fox",
@@ -83,7 +77,7 @@ sources = [
         },
     },
     {
-        "filename": f"qe-gipaw-{local_qe_gipaw_hash[:8]}.tar.xz",
+        "filename": "qe-gipaw-%s.tar.gz" % local_qe_gipaw_hash,
         "git_config": {
             "url": "https://github.com/dceresoli",
             "repo_name": "qe-gipaw",
@@ -91,7 +85,7 @@ sources = [
         },
     },
     {
-        "filename": f"pw2qmcpack-{local_qmcpack_hash[:8]}.tar.xz",
+        "filename": "pw2qmcpack-%s.tar.gz" % local_qmcpack_hash,
         "git_config": {
             "url": "https://github.com/QMCPACK",
             "repo_name": "pw2qmcpack",
@@ -99,7 +93,7 @@ sources = [
         },
     },
     {
-        "filename": f"wannier90-{local_w90_hash[:8]}.tar.xz",
+        "filename": "wannier90-%s.tar.gz" % local_w90_hash,
         "git_config": {
             "url": "https://github.com/wannier-developers",
             "repo_name": "wannier90",
@@ -107,63 +101,76 @@ sources = [
         },
     },
 ]
-patches = [
-    # sourcepath needed for patches applied outside the first `finalpath` directory
-    {'name': 'QuantumESPRESSO-7.4-d3q.patch', 'sourcepath': '../'},
-    {'name': 'QuantumESPRESSO-7.4-parallel-symmetrization.patch'},
-]
+# Holding off checksum checks until 5.0.x
+# https://github.com/easybuilders/easybuild-framework/pull/4248
+# checksums = [
+#     {'q-e-qe-7.3.1.tar.gz': '2c58b8fadfe4177de5a8b69eba447db5e623420b070dea6fd26c1533b081d844'},
+#     {'lapack-%s.tar.gz' % local_lapack_hash: 'c05532ae0e5fe35f473206dda12970da5f2e2214620487d71837ddcf0ea6b21d'},
+#     {'mbd-%s.tar.gz' % local_mbd_hash: 'a180682c00bb890c9b1e26a98addbd68e32f970c06439acf7582415f4c589800'},
+#     {'devxlib-%s.tar.gz' % local_devxlib_hash: '76da8fe5a2050f58efdc92fa8831efec25c19190df7f4e5e39c173a5fbae83b4'},
+#     {'d3q-%s.tar.gz' % local_d3q_hash: '43e50753a56af05d181b859d3e29d842fb3fc4352f00cb7fe229a435a1f20c31'},
+#     {'fox-%s.tar.gz' % local_fox_hash: '99b6a899a3f947d7763aa318e86f9f08db684568bfdcd293f3318bee9d7f1948'},
+#     {'qe-gipaw-%s.tar.gz' % local_qe_gipaw_hash: '9ac8314363d29cc2f1ce85abd8f26c1a3ae311d54f6e6034d656442dd101c928'},
+#     {'pw2qmcpack-%s.tar.gz' % local_qmcpack_hash: 'a8136da8429fc49ab560ef7356cd6f0a2714dfbb137baff7961f46dfe32061eb'},
+#     {'wannier90-%s.tar.gz' % local_w90_hash: 'f989497790ec9777bdc159945bbf42156edb7268011f972874dec67dd4f58658'},
+# ]
 checksums = [
-    {'q-e-qe-7.4.tar.gz':
-     'b15dcfe25f4fbf15ccd34c1194021e90996393478226e601d876f7dea481d104'},
-    {'lapack-12d82539.tar.xz':
-     '88aea5bca5e730e99fda0a5b9d677d6036c7dd82874e0deaed5cccef1f880111'},
-    {'mbd-89a3cc19.tar.xz':
-     'd026bf0e9334874670a23cd854f445baac003d4f099afa46bab667bc67abb450'},
-    {'devxlib-a6b89ef7.tar.xz':
-     '0a9b7e5350f44017a2390c85176d1683c6ecec0e4b716a59d727f7650f16e807'},
-    {'d3q-808acbaf.tar.xz':
-     '8e42c946c33b90094ad16c3fd545f00a6801958880dfc5e5274759126a4b193c'},
-    {'fox-3453648e.tar.xz':
-     'c8c55cdf9eb2709aebac86a58f936480ee66438dffd3d65c6a35ca7771c031b3'},
-    {'qe-gipaw-9b2ae1a4.tar.xz':
-     '29e6edfda8ee71c12683b1dfce4a29c5fff8aa9046b0a8085441dce01d084475'},
-    {'pw2qmcpack-f72ab25f.tar.xz':
-     'bc9513c4901ec2469d56b8a6b66f56878cb13e3bc7fbcdc5dba0ca6dad880ab9'},
-    {'wannier90-1d6b1873.tar.xz':
-     '351531aaf3434a9aac92d39ee40df5eb949aa27d14fcb93518bf08444478cd2a'},
-    {'QuantumESPRESSO-7.4-d3q.patch':
-     '1f1686365fbf0cc56f634e072a92b3d336fe454348e514d0b4136d447f0d4923'},
-    {'QuantumESPRESSO-7.4-parallel-symmetrization.patch':
-     'e11ac954fa2289a3b453e86871a819a78972e94681f08425ec35dc51a908f7d2'},
+    '2c58b8fadfe4177de5a8b69eba447db5e623420b070dea6fd26c1533b081d844',
+    None, None, None, None, None, None, None, None
 ]
 
+local_gcc_compiler = ('GCCcore', '12.3.0')
+local_compiler = ('NVHPC', '23.7-CUDA-12.1.1')
+
 builddependencies = [
-    ('M4', '1.4.19'),
-    ('CMake', '3.29.3'),
-    ('pkgconf', '2.2.0'),
+    ("M4", "1.4.19", '', local_gcc_compiler),
+    ("CMake", "3.26.3", '', local_gcc_compiler),
+    ("cURL", "8.0.1", '', local_gcc_compiler),
 ]
 dependencies = [
-    # ('HDF5', '1.14.5'),
-    # ('ELPA', '2024.05.001'),
-    # ('libxc', '6.2.2'),
+    ('CUDA', '12.1.1', '', SYSTEM),
+    ('OpenBLAS', '0.3.24', '', local_compiler),
+    ('FFTW', '3.3.10', '', local_compiler),
+    ('FFTW.MPI', '3.3.10'),
+    ('ScaLAPACK', '2.2.0'),
+    ("HDF5", "1.14.0", '-CUDA-%(cudaver)s'),
+    ("libxc", "6.2.2", '', local_compiler),
+    # ("ELPA", "2023.05.001"),
 ]
 
 # Disabled because of
 # https://gitlab.com/QEF/q-e/-/issues/667
 # https://github.com/anharmonic/d3q/issues/15
 build_shared_libs = False
+with_cuda = True
+
 with_scalapack = True
-with_fox = False
-with_gipaw = False
-with_d3q = False
-with_qmcpack = False
+with_gipaw = False  # https://github.com/dceresoli/qe-gipaw/issues/19
+with_d3q = False  # Gives `conflict with access` errors
+with_qmcpack = True
 
 moduleclass = "chem"
 
+test_suite_nprocs = 1  # Unless multiple GPUs are available or GPU should be oversubscribed
 test_suite_threshold = (
-    0.98
+    0.4  # Low threshold because of https://gitlab.com/QEF/q-e/-/issues/665
 )
 test_suite_max_failed = (
     5  # Allow for some flaky tests (failed due to strict thresholds)
 )
-test_suite_allow_failures = []
+test_suite_allow_failures = [
+    "test_qe_xclib_",  # 7.3.1:  https://gitlab.com/QEF/q-e/-/issues/640
+
+    # 7.3.1: Broken testsuite (https://gitlab.com/QEF/q-e/-/issues/665)
+    "--hp_",
+    "--ph_",
+    "--epw_",
+    "--tddfpt_",
+
+    # 7.3.1: https://gitlab.com/QEF/q-e/-/issues/675
+    "system--pw_scf--scf-rmm-k",
+    "system--pw_scf--scf-rmm-paro-k",
+    "system--pw_scf-correctness",
+    "system--pw_noncolin--noncolin-rmm",
+    "system--pw_noncolin-correctness",
+]
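
The diff switches the submodule tarball names from a shortened hash with `.tar.xz` to the full commit hash with `.tar.gz`, and replaces the per-file checksum dicts with a positional list. A short sketch of both, using the `lapack` hash from the easyconfig (the reading of `None` as "no checksum recorded for that source" is an assumption based on the "Holding off checksum checks" comment; `n_sources` is counted by hand from the `sources` list):

```python
local_lapack_hash = "12d825396fcef1e0a1b27be9f119f9e554621e55"

# Naming in the 7.4 easyconfigs: first 8 characters of the commit hash, .tar.xz
short_name = f"lapack-{local_lapack_hash[:8]}.tar.xz"

# Naming in this 7.3.1 easyconfig: full commit hash, .tar.gz
full_name = "lapack-%s.tar.gz" % local_lapack_hash

print(short_name)  # lapack-12d82539.tar.xz
print(full_name)   # lapack-12d825396fcef1e0a1b27be9f119f9e554621e55.tar.gz

# Positional checksums: one entry per source, in order.
# 1 QE tarball + 8 git submodules = 9 sources.
n_sources = 9
checksums = [
    '2c58b8fadfe4177de5a8b69eba447db5e623420b070dea6fd26c1533b081d844',
    None, None, None, None, None, None, None, None,
]
assert len(checksums) == n_sources
```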

Diff against QuantumESPRESSO-7.4-foss-2024a.eb

easybuild/easyconfigs/q/QuantumESPRESSO/QuantumESPRESSO-7.4-foss-2024a.eb

diff --git a/easybuild/easyconfigs/q/QuantumESPRESSO/QuantumESPRESSO-7.4-foss-2024a.eb b/easybuild/easyconfigs/q/QuantumESPRESSO/QuantumESPRESSO-7.3.1-nvompi-2023a-CUDA-12.1.1.eb
index 030f24ab07..609a3b47f0 100644
--- a/easybuild/easyconfigs/q/QuantumESPRESSO/QuantumESPRESSO-7.4-foss-2024a.eb
+++ b/easybuild/easyconfigs/q/QuantumESPRESSO/QuantumESPRESSO-7.3.1-nvompi-2023a-CUDA-12.1.1.eb
@@ -1,15 +1,15 @@
-name = 'QuantumESPRESSO'
-version = '7.4'
+name = "QuantumESPRESSO"
+version = "7.3.1"
+versionsuffix = '-CUDA-%(cudaver)s'
 
-homepage = 'https://www.quantum-espresso.org'
+homepage = "https://www.quantum-espresso.org"
 description = """Quantum ESPRESSO  is an integrated suite of computer codes
 for electronic-structure calculations and materials modeling at the nanoscale.
 It is based on density-functional theory, plane waves, and pseudopotentials
 (both norm-conserving and ultrasoft).
 """
 
-toolchain = {'name': 'foss', 'version': '2024a'}
-
+toolchain = {"name": "nvompi", "version": "2023a"}
 toolchainopts = {
     "usempi": True,
     "openmp": True,
@@ -17,13 +17,13 @@ toolchainopts = {
 
 # Check hashes inside external/submodule_commit_hash_records when making file for new version
 local_lapack_hash = "12d825396fcef1e0a1b27be9f119f9e554621e55"
-local_mbd_hash = "89a3cc199c0a200c9f0f688c3229ef6b9a8d63bd"
+local_mbd_hash = "82005cbb65bdf5d32ca021848eec8f19da956a77"
 local_devxlib_hash = "a6b89ef77b1ceda48e967921f1f5488d2df9226d"
 local_fox_hash = "3453648e6837658b747b895bb7bef4b1ed2eac40"
-# Different from the one at tag qe-7.4, see https://github.com/anharmonic/d3q/issues/22
-local_d3q_hash = "808acbaf012468f42147d8d6af452ec64b9e5ab0"
-# Different from the one at tag qe-7.4
-local_qe_gipaw_hash = "9b2ae1a46cae045cc04ef02c1072f2e1e74873b2"
+# Different from the one at tag qe-7.3.1 because of:
+# https://gitlab.com/QEF/q-e/-/issues/666
+local_d3q_hash = "de4718351e7bbb9d1d12aad2b7ca232d06775b83"
+local_qe_gipaw_hash = "75b01b694c9ba4df55d294cacc27cf28591b2161"
 local_qmcpack_hash = "f72ab25fa4ea755c1b4b230ae8074b47d5509c70"
 local_w90_hash = "1d6b187374a2d50b509e5e79e2cab01a79ff7ce1"
 
@@ -35,7 +35,7 @@ sources = [
         "source_urls": ["https://gitlab.com/QEF/q-e/-/archive/qe-%(version)s"],
     },
     {
-        "filename": f"lapack-{local_lapack_hash[:8]}.tar.xz",
+        "filename": "lapack-%s.tar.gz" % local_lapack_hash,
         "git_config": {
             "url": "https://github.com/Reference-LAPACK",
             "repo_name": "lapack",
@@ -43,7 +43,7 @@ sources = [
         },
     },
     {
-        "filename": f"mbd-{local_mbd_hash[:8]}.tar.xz",
+        "filename": "mbd-%s.tar.gz" % local_mbd_hash,
         "git_config": {
             "url": "https://github.com/libmbd",
             "repo_name": "libmbd",
@@ -52,7 +52,7 @@ sources = [
         },
     },
     {
-        "filename": f"devxlib-{local_devxlib_hash[:8]}.tar.xz",
+        "filename": "devxlib-%s.tar.gz" % local_devxlib_hash,
         "git_config": {
             "url": "https://gitlab.com/max-centre/components",
             "repo_name": "devicexlib",
@@ -61,7 +61,7 @@ sources = [
         },
     },
     {
-        "filename": f"d3q-{local_d3q_hash[:8]}.tar.xz",
+        "filename": "d3q-%s.tar.gz" % local_d3q_hash,
         "git_config": {
             "url": "https://github.com/anharmonic",
             "repo_name": "d3q",
@@ -69,7 +69,7 @@ sources = [
         },
     },
     {
-        "filename": f"fox-{local_fox_hash[:8]}.tar.xz",
+        "filename": "fox-%s.tar.gz" % local_fox_hash,
         "git_config": {
             "url": "https://github.com/pietrodelugas",
             "repo_name": "fox",
@@ -77,7 +77,7 @@ sources = [
         },
     },
     {
-        "filename": f"qe-gipaw-{local_qe_gipaw_hash[:8]}.tar.xz",
+        "filename": "qe-gipaw-%s.tar.gz" % local_qe_gipaw_hash,
         "git_config": {
             "url": "https://github.com/dceresoli",
             "repo_name": "qe-gipaw",
@@ -85,7 +85,7 @@ sources = [
         },
     },
     {
-        "filename": f"pw2qmcpack-{local_qmcpack_hash[:8]}.tar.xz",
+        "filename": "pw2qmcpack-%s.tar.gz" % local_qmcpack_hash,
         "git_config": {
             "url": "https://github.com/QMCPACK",
             "repo_name": "pw2qmcpack",
@@ -93,7 +93,7 @@ sources = [
         },
     },
     {
-        "filename": f"wannier90-{local_w90_hash[:8]}.tar.xz",
+        "filename": "wannier90-%s.tar.gz" % local_w90_hash,
         "git_config": {
             "url": "https://github.com/wannier-developers",
             "repo_name": "wannier90",
@@ -101,63 +101,76 @@ sources = [
         },
     },
 ]
-patches = [
-    # sourcepath needed for patches applied outside the first `finalpath` directory
-    {'name': 'QuantumESPRESSO-7.4-d3q.patch', 'sourcepath': '../'},
-    {'name': 'QuantumESPRESSO-7.4-parallel-symmetrization.patch'},
-]
+# Holding off checksum checks until 5.0.x
+# https://github.com/easybuilders/easybuild-framework/pull/4248
+# checksums = [
+#     {'q-e-qe-7.3.1.tar.gz': '2c58b8fadfe4177de5a8b69eba447db5e623420b070dea6fd26c1533b081d844'},
+#     {'lapack-%s.tar.gz' % local_lapack_hash: 'c05532ae0e5fe35f473206dda12970da5f2e2214620487d71837ddcf0ea6b21d'},
+#     {'mbd-%s.tar.gz' % local_mbd_hash: 'a180682c00bb890c9b1e26a98addbd68e32f970c06439acf7582415f4c589800'},
+#     {'devxlib-%s.tar.gz' % local_devxlib_hash: '76da8fe5a2050f58efdc92fa8831efec25c19190df7f4e5e39c173a5fbae83b4'},
+#     {'d3q-%s.tar.gz' % local_d3q_hash: '43e50753a56af05d181b859d3e29d842fb3fc4352f00cb7fe229a435a1f20c31'},
+#     {'fox-%s.tar.gz' % local_fox_hash: '99b6a899a3f947d7763aa318e86f9f08db684568bfdcd293f3318bee9d7f1948'},
+#     {'qe-gipaw-%s.tar.gz' % local_qe_gipaw_hash: '9ac8314363d29cc2f1ce85abd8f26c1a3ae311d54f6e6034d656442dd101c928'},
+#     {'pw2qmcpack-%s.tar.gz' % local_qmcpack_hash: 'a8136da8429fc49ab560ef7356cd6f0a2714dfbb137baff7961f46dfe32061eb'},
+#     {'wannier90-%s.tar.gz' % local_w90_hash: 'f989497790ec9777bdc159945bbf42156edb7268011f972874dec67dd4f58658'},
+# ]
 checksums = [
-    {'q-e-qe-7.4.tar.gz':
-     'b15dcfe25f4fbf15ccd34c1194021e90996393478226e601d876f7dea481d104'},
-    {'lapack-12d82539.tar.xz':
-     '88aea5bca5e730e99fda0a5b9d677d6036c7dd82874e0deaed5cccef1f880111'},
-    {'mbd-89a3cc19.tar.xz':
-     'd026bf0e9334874670a23cd854f445baac003d4f099afa46bab667bc67abb450'},
-    {'devxlib-a6b89ef7.tar.xz':
-     '0a9b7e5350f44017a2390c85176d1683c6ecec0e4b716a59d727f7650f16e807'},
-    {'d3q-808acbaf.tar.xz':
-     '8e42c946c33b90094ad16c3fd545f00a6801958880dfc5e5274759126a4b193c'},
-    {'fox-3453648e.tar.xz':
-     'c8c55cdf9eb2709aebac86a58f936480ee66438dffd3d65c6a35ca7771c031b3'},
-    {'qe-gipaw-9b2ae1a4.tar.xz':
-     '29e6edfda8ee71c12683b1dfce4a29c5fff8aa9046b0a8085441dce01d084475'},
-    {'pw2qmcpack-f72ab25f.tar.xz':
-     'bc9513c4901ec2469d56b8a6b66f56878cb13e3bc7fbcdc5dba0ca6dad880ab9'},
-    {'wannier90-1d6b1873.tar.xz':
-     '351531aaf3434a9aac92d39ee40df5eb949aa27d14fcb93518bf08444478cd2a'},
-    {'QuantumESPRESSO-7.4-d3q.patch':
-     '1f1686365fbf0cc56f634e072a92b3d336fe454348e514d0b4136d447f0d4923'},
-    {'QuantumESPRESSO-7.4-parallel-symmetrization.patch':
-     'e11ac954fa2289a3b453e86871a819a78972e94681f08425ec35dc51a908f7d2'},
+    '2c58b8fadfe4177de5a8b69eba447db5e623420b070dea6fd26c1533b081d844',
+    None, None, None, None, None, None, None, None
 ]
 
+local_gcc_compiler = ('GCCcore', '12.3.0')
+local_compiler = ('NVHPC', '23.7-CUDA-12.1.1')
+
 builddependencies = [
-    ('M4', '1.4.19'),
-    ('CMake', '3.29.3'),
-    ('pkgconf', '2.2.0'),
+    ("M4", "1.4.19", '', local_gcc_compiler),
+    ("CMake", "3.26.3", '', local_gcc_compiler),
+    ("cURL", "8.0.1", '', local_gcc_compiler),
 ]
 dependencies = [
-    ('HDF5', '1.14.5'),
-    ('ELPA', '2024.05.001'),
-    ('libxc', '6.2.2'),
+    ('CUDA', '12.1.1', '', SYSTEM),
+    ('OpenBLAS', '0.3.24', '', local_compiler),
+    ('FFTW', '3.3.10', '', local_compiler),
+    ('FFTW.MPI', '3.3.10'),
+    ('ScaLAPACK', '2.2.0'),
+    ("HDF5", "1.14.0", '-CUDA-%(cudaver)s'),
+    ("libxc", "6.2.2", '', local_compiler),
+    # ("ELPA", "2023.05.001"),
 ]
 
 # Disabled because of
 # https://gitlab.com/QEF/q-e/-/issues/667
 # https://github.com/anharmonic/d3q/issues/15
 build_shared_libs = False
+with_cuda = True
+
 with_scalapack = True
-with_fox = True
-with_gipaw = True
-with_d3q = True
+with_gipaw = False  # https://github.com/dceresoli/qe-gipaw/issues/19
+with_d3q = False  # Gives `conflict with access` errors
 with_qmcpack = True
 
 moduleclass = "chem"
 
+test_suite_nprocs = 1  # Unless multiple GPUs are available or GPU should be oversubscribed
 test_suite_threshold = (
-    0.98
+    0.4  # Low threshold because of https://gitlab.com/QEF/q-e/-/issues/665
 )
 test_suite_max_failed = (
     5  # Allow for some flaky tests (failed due to strict thresholds)
 )
-test_suite_allow_failures = []
+test_suite_allow_failures = [
+    "test_qe_xclib_",  # 7.3.1:  https://gitlab.com/QEF/q-e/-/issues/640
+
+    # 7.3.1: Broken testsuite (https://gitlab.com/QEF/q-e/-/issues/665)
+    "--hp_",
+    "--ph_",
+    "--epw_",
+    "--tddfpt_",
+
+    # 7.3.1: https://gitlab.com/QEF/q-e/-/issues/675
+    "system--pw_scf--scf-rmm-k",
+    "system--pw_scf--scf-rmm-paro-k",
+    "system--pw_scf-correctness",
+    "system--pw_noncolin--noncolin-rmm",
+    "system--pw_noncolin-correctness",
+]
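
The three `test_suite_*` settings interact, so a rough sketch may help. It assumes that allow-list entries are matched as substrings of test names, that `test_suite_threshold` is the minimum fraction of passing tests, and that `test_suite_max_failed` caps failures not covered by the allow-list. These semantics are inferred from the option names and comments, not checked against the easyblock, and the `evaluate` function and most test names below are hypothetical:

```python
def evaluate(results, threshold, max_failed, allow_failures):
    """results maps test name -> True (passed) / False (failed)."""
    failed = [name for name, ok in results.items() if not ok]
    # A failure is tolerated if any allow-list entry occurs in the test name.
    unexpected = [n for n in failed if not any(pat in n for pat in allow_failures)]
    pass_rate = (len(results) - len(failed)) / len(results)
    return pass_rate >= threshold and len(unexpected) <= max_failed


allow = ["--ph_", "system--pw_scf--scf-rmm-k"]

# Hypothetical results: two failures, both covered by the allow-list.
results = {
    "system--pw_scf--scf-rmm-k": False,
    "system--ph_base--ph": False,
    "system--pw_scf--scf": True,
    "system--pw_relax--relax": True,
    "system--pw_md--md": True,
}

ok_low = evaluate(results, threshold=0.4, max_failed=5, allow_failures=allow)
ok_high = evaluate(results, threshold=0.98, max_failed=5, allow_failures=allow)
```

Under these assumptions the run is accepted with the 0.4 threshold used here and rejected with the 0.98 threshold of the 7.4 easyconfig, since only 3 of 5 tests pass.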

Updated software `ScaLAPACK-2.2.0-nvompi-2023a.eb`

Diff against ScaLAPACK-2.2.2-gompi-2025a-fb.eb

easybuild/easyconfigs/s/ScaLAPACK/ScaLAPACK-2.2.2-gompi-2025a-fb.eb

diff --git a/easybuild/easyconfigs/s/ScaLAPACK/ScaLAPACK-2.2.2-gompi-2025a-fb.eb b/easybuild/easyconfigs/s/ScaLAPACK/ScaLAPACK-2.2.0-nvompi-2023a.eb
index ede0be1c68..8e4f83e46c 100644
--- a/easybuild/easyconfigs/s/ScaLAPACK/ScaLAPACK-2.2.2-gompi-2025a-fb.eb
+++ b/easybuild/easyconfigs/s/ScaLAPACK/ScaLAPACK-2.2.0-nvompi-2023a.eb
@@ -1,34 +1,35 @@
 name = 'ScaLAPACK'
-version = '2.2.2'
-versionsuffix = '-fb'
+version = '2.2.0'
 
 homepage = 'https://www.netlib.org/scalapack/'
 description = """The ScaLAPACK (or Scalable LAPACK) library includes a subset of LAPACK routines
  redesigned for distributed memory MIMD parallel computers."""
 
-toolchain = {'name': 'gompi', 'version': '2025a'}
+toolchain = {'name': 'nvompi', 'version': '2023a'}
 toolchainopts = {'extra_fflags': '-lpthread', 'openmp': True, 'pic': True, 'usempi': True}
 
-source_urls = ['https://github.com/Reference-ScaLAPACK/scalapack/archive/refs/tags/']
-sources = ['v%(version)s.tar.gz']
-checksums = ['a2f0c9180a210bf7ffe126c9cb81099cf337da1a7120ddb4cbe4894eb7b7d022']
+source_urls = [homepage]
+sources = [SOURCELOWER_TGZ]
+patches = ['ScaLAPACK-%(version)s_fix-GCC-10.patch']
+checksums = [
+    '40b9406c20735a9a3009d863318cb8d3e496fb073d201c5463df810e01ab2a57',  # scalapack-2.2.0.tgz
+    'f6bc3c6dee012ba4a696548a2e12b6aae932ce4fd5a142153b338839f52b5906',  # ScaLAPACK-2.2.0_fix-GCC-10.patch
+]
 
 builddependencies = [
-    ('CMake', '3.31.3'),
+    ('CMake', '3.26.3'),
 ]
 
 dependencies = [
-    ('FlexiBLAS', '3.4.5'),
+    ('OpenBLAS', '0.3.24'),
 ]
 
-preconfigopts = 'export CFLAGS="$CFLAGS -std=gnu89" &&'  # https://bugzilla.redhat.com/show_bug.cgi?id=2178710
-
 # Config Opts based on AOCL User Guide:
 # https://developer.amd.com/wp-content/resources/AOCL_User%20Guide_2.2.pdf
 
 configopts = '-DBUILD_SHARED_LIBS=ON '
-configopts += '-DBLAS_LIBRARIES="$EBROOTFLEXIBLAS/lib/libflexiblas.%s" ' % SHLIB_EXT
-configopts += '-DLAPACK_LIBRARIES="$EBROOTFLEXIBLAS/lib/libflexiblas.%s" ' % SHLIB_EXT
+# configopts += '-DBLAS_LIBRARIES="$EBROOTFLEXIBLAS/lib/libflexiblas.%s" ' % SHLIB_EXT
+# configopts += '-DLAPACK_LIBRARIES="$EBROOTFLEXIBLAS/lib/libflexiblas.%s" ' % SHLIB_EXT
 
 sanity_check_paths = {
     'files': ['lib/libscalapack.%s' % SHLIB_EXT, 'lib64/libscalapack.%s' % SHLIB_EXT],

Diff against ScaLAPACK-2.2.0-gompi-2024a-fb.eb

easybuild/easyconfigs/s/ScaLAPACK/ScaLAPACK-2.2.0-gompi-2024a-fb.eb

diff --git a/easybuild/easyconfigs/s/ScaLAPACK/ScaLAPACK-2.2.0-gompi-2024a-fb.eb b/easybuild/easyconfigs/s/ScaLAPACK/ScaLAPACK-2.2.0-nvompi-2023a.eb
index 1bccc16f38..8e4f83e46c 100644
--- a/easybuild/easyconfigs/s/ScaLAPACK/ScaLAPACK-2.2.0-gompi-2024a-fb.eb
+++ b/easybuild/easyconfigs/s/ScaLAPACK/ScaLAPACK-2.2.0-nvompi-2023a.eb
@@ -1,12 +1,11 @@
 name = 'ScaLAPACK'
 version = '2.2.0'
-versionsuffix = '-fb'
 
 homepage = 'https://www.netlib.org/scalapack/'
 description = """The ScaLAPACK (or Scalable LAPACK) library includes a subset of LAPACK routines
  redesigned for distributed memory MIMD parallel computers."""
 
-toolchain = {'name': 'gompi', 'version': '2024a'}
+toolchain = {'name': 'nvompi', 'version': '2023a'}
 toolchainopts = {'extra_fflags': '-lpthread', 'openmp': True, 'pic': True, 'usempi': True}
 
 source_urls = [homepage]
@@ -18,19 +17,19 @@ checksums = [
 ]
 
 builddependencies = [
-    ('CMake', '3.29.3'),
+    ('CMake', '3.26.3'),
 ]
 
 dependencies = [
-    ('FlexiBLAS', '3.4.4'),
+    ('OpenBLAS', '0.3.24'),
 ]
 
 # Config Opts based on AOCL User Guide:
 # https://developer.amd.com/wp-content/resources/AOCL_User%20Guide_2.2.pdf
 
 configopts = '-DBUILD_SHARED_LIBS=ON '
-configopts += '-DBLAS_LIBRARIES="$EBROOTFLEXIBLAS/lib/libflexiblas.%s" ' % SHLIB_EXT
-configopts += '-DLAPACK_LIBRARIES="$EBROOTFLEXIBLAS/lib/libflexiblas.%s" ' % SHLIB_EXT
+# configopts += '-DBLAS_LIBRARIES="$EBROOTFLEXIBLAS/lib/libflexiblas.%s" ' % SHLIB_EXT
+# configopts += '-DLAPACK_LIBRARIES="$EBROOTFLEXIBLAS/lib/libflexiblas.%s" ' % SHLIB_EXT
 
 sanity_check_paths = {
     'files': ['lib/libscalapack.%s' % SHLIB_EXT, 'lib64/libscalapack.%s' % SHLIB_EXT],
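
With the two FlexiBLAS `configopts` lines commented out, the easyconfig no longer passes explicit BLAS/LAPACK paths, so ScaLAPACK's CMake configuration is presumably left to locate OpenBLAS on its own. As a rough sketch of what explicit paths for the OpenBLAS dependency might look like: the helper `blas_configopts` is hypothetical, `SHLIB_EXT` is defined locally only so the snippet runs outside EasyBuild, and the `$EBROOTOPENBLAS/lib/libopenblas.*` path is an assumption about the install layout, not something this PR sets.

```python
# Sketch only: SHLIB_EXT is normally provided by EasyBuild as an easyconfig
# constant; it is defined here so the snippet runs on its own.
SHLIB_EXT = 'so'  # assumption: Linux shared-library extension


def blas_configopts(root_var, libname):
    """Hypothetical helper: build CMake options pinning BLAS and LAPACK to one shared library."""
    lib = '$%s/lib/lib%s.%s' % (root_var, libname, SHLIB_EXT)
    opts = '-DBUILD_SHARED_LIBS=ON '
    opts += '-DBLAS_LIBRARIES="%s" ' % lib
    opts += '-DLAPACK_LIBRARIES="%s" ' % lib
    return opts


# The form removed by the diff (FlexiBLAS):
flexiblas_opts = blas_configopts('EBROOTFLEXIBLAS', 'flexiblas')

# An assumed equivalent for the OpenBLAS dependency (not part of the PR):
openblas_opts = blas_configopts('EBROOTOPENBLAS', 'openblas')

print(openblas_opts)
```

The environment variable is left unexpanded in the string on purpose: as in the original lines, it is meant to be resolved by the shell at configure time, once the dependency module is loaded.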

This reverts commit 706e9d1.

Crivella · 2025-12-12T10:42:52Z

Closing this in favor of the upcoming dedicated NVHPC toolchains:

- make NVHPC a full toolchain with nvidia-compilers, NVHPCX, NVBLAS, and NVScaLAPACK easybuild-framework#4927
- refactor NVHPC easyblock into generic NvidiaBase easyblock, and custom easyblocks for nvidia-compilers + NVHPC easybuild-easyblocks#3788
- {compiler,toolchain}[system/system] nvidia-compilers v25.9, NVHPC v25.9 w/ CUDA 12.9.1 #23989

migueldiascosta added the update label Apr 16, 2024

migueldiascosta added this to the 4.x milestone Apr 16, 2024

Crivella changed the title ~~{numlib,chem,tollchain}[NVHPC/23.7-CUDA-12.1.1] nvofbf-2023a + QuantumESPRESSO-7.3.1 (GPU enabled)~~ {numlib,chem,tollchain}[NVHPC/23.7-CUDA-12.1.1] nvompi-2023a + QuantumESPRESSO-7.3.1 (GPU enabled) Apr 22, 2024

bedroge mentioned this pull request Jun 5, 2024

add patches for failing LAPACK tests and RISC-V test segfaults to OpenBLAS 0.3.27 #20745

Merged

This comment was marked as off-topic.


Crivella added a commit to Crivella/easybuild-easyconfigs that referenced this pull request Oct 3, 2024

Readded files removed from PR easybuilders#20364

0c2a597

This reverts commit 706e9d1.

This was referenced Oct 3, 2024

{toolchain}[NVHPC/23.7-CUDA-12.1.1] nvofbf-2023a #21530

Closed

{chem}[nvofbf/2023a-CUDA-12.1.1] MetalWalls 21.06.1 #21533

Closed

Crivella added a commit to Crivella/easybuild-easyconfigs that referenced this pull request Jun 16, 2025

Readded files removed from PR easybuilders#20364

72cbbb3

This reverts commit 706e9d1.

Crivella added 8 commits June 16, 2025 11:55

Added cuda recipe for QE

7999bf8

Added EC files for nvofbf toolchain + QE

211566f

Added HDF5

c20ded4

Added libxc

e7a2043

Changed QE to compile from nvompi

40e4e9a

Removed flexiblas and child packages

c968e93

Cleanup and better docs

c385c48

Revert "Removed flexiblas and child packages"

6d53987

This reverts commit 706e9d1.

Crivella added 4 commits June 16, 2025 11:56

Removed flexiblas and child packages

39e7e3c

Removed unsed file

9bf1115

Updated with new extras

368b1f7

Fix for new easyblock

32ab56b

Crivella force-pushed the feature-QE_cuda branch from 5869132 to 32ab56b Compare June 16, 2025 09:56

Crivella closed this Dec 12, 2025

Crivella deleted the feature-QE_cuda branch December 12, 2025 16:29

This was referenced Jan 13, 2026

{chem}[foss/2025a] QuantumESPRESSO 7.5 #23798

Merged

{chem}[NVHPC/25.3] QuantumESPRESSO v7.5 #24960

Closed

Conversation

Crivella commented Apr 15, 2024

Crivella commented Apr 18, 2024

cgross95 commented May 21, 2024

Crivella commented May 22, 2024

cgross95 commented May 22, 2024

cgross95 commented Jun 7, 2024

beeebiii commented Sep 5, 2024

Crivella commented Sep 5, 2024

beeebiii commented Sep 5, 2024

Crivella commented Sep 5, 2024

This comment was marked as off-topic.

Crivella commented Sep 5, 2024

Crivella commented Sep 9, 2024

yqshao commented Oct 25, 2024

github-actions bot commented Dec 5, 2024

Updated software FFTW.MPI-3.3.10-nvompi-2023a.eb

Updated software FFTW-3.3.10-NVHPC-23.7-CUDA-12.1.1.eb

Updated software HDF5-1.14.0-nvompi-2023a-CUDA-12.1.1.eb

Updated software libxc-6.2.2-NVHPC-23.7-CUDA-12.1.1.eb

Updated software nvompi-2023a.eb

Updated software OpenBLAS-0.3.24-NVHPC-23.7-CUDA-12.1.1.eb

Updated software OpenMPI-4.1.5-NVHPC-23.7-CUDA-12.1.1.eb

Updated software QuantumESPRESSO-7.3.1-nvompi-2023a-CUDA-12.1.1.eb

Updated software ScaLAPACK-2.2.0-nvompi-2023a.eb

Crivella commented Dec 12, 2025

5 participants
