
{lib}[gompi/2025a] CUDA-wheel-bundle v12.8.0 w/ CUDA 12.8.0#25266

Open
lexming wants to merge 3 commits into easybuilders:develop from lexming:20260212230310_new_pr_CUDA-wheel-bundle1280

Conversation

@lexming
Contributor

@lexming lexming commented Feb 12, 2026

(created using eb --new-pr)

Depends on:

This is a bundle of Python packages that repackage the CUDA libraries. However, instead of the real wheels, this bundle installs dummy versions of those packages and relies on regular EB dependencies for CUDA. This lets EB seamlessly support any wheel that requires these packages.

The current list of packages covers all requirements defined by the PyTorch 2.9.1 wheels (to be added in another PR). However, the list can easily be expanded with extra NVIDIA packages or with multiple versions of any package already in it. There are no restrictions.
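
To illustrate the idea of a dummy package, here is a minimal sketch, assuming a dummy package amounts to a metadata-only distribution (a `.dist-info` directory with no code or libraries). The helper names `write_dummy_dist` and `installed_versions` are hypothetical and only for illustration; the actual mechanism is implemented in easybuilders/easybuild-easyblocks#4063 and may differ. The two package versions are copied from the easyconfig under review.

```python
import tempfile
from importlib.metadata import distributions
from pathlib import Path

# Illustrative subset of the bundle; names and versions come from the
# easyconfig snippet reviewed in this PR.
DUMMY_PACKAGES = {
    "nvidia-cuda-runtime-cu12": "12.8.90",
    "nvidia-cufft-cu12": "11.3.3.83",
}


def write_dummy_dist(site_dir: Path, name: str, version: str) -> Path:
    """Create a metadata-only distribution (hypothetical helper).

    Only a .dist-info/METADATA file is written, so Python tooling sees the
    package as installed even though no CUDA library is shipped with it.
    """
    dist_info = site_dir / f"{name.replace('-', '_')}-{version}.dist-info"
    dist_info.mkdir(parents=True)
    (dist_info / "METADATA").write_text(
        f"Metadata-Version: 2.1\nName: {name}\nVersion: {version}\n"
    )
    return dist_info


def installed_versions(site_dir: Path) -> dict:
    """Return {name: version} for every distribution found in site_dir."""
    return {
        dist.metadata["Name"]: dist.version
        for dist in distributions(path=[str(site_dir)])
    }


with tempfile.TemporaryDirectory() as tmp:
    site_dir = Path(tmp)
    for pkg_name, pkg_version in DUMMY_PACKAGES.items():
        write_dummy_dist(site_dir, pkg_name, pkg_version)
    found = installed_versions(site_dir)
    print(found)
```

Under this assumption, a wheel that declares a requirement on one of these packages would find it satisfied by the metadata alone, while the libraries themselves come from the CUDA module loaded as a regular EB dependency.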

@github-actions

Updated software NVSHMEM-3.3.20-gompi-2025a-CUDA-12.8.0.eb

Diff against NVSHMEM-3.3.20-gompi-2025b-CUDA-12.9.1.eb

easybuild/easyconfigs/n/NVSHMEM/NVSHMEM-3.3.20-gompi-2025b-CUDA-12.9.1.eb

diff --git a/easybuild/easyconfigs/n/NVSHMEM/NVSHMEM-3.3.20-gompi-2025b-CUDA-12.9.1.eb b/easybuild/easyconfigs/n/NVSHMEM/NVSHMEM-3.3.20-gompi-2025a-CUDA-12.8.0.eb
index dab1b06a44..51eb1af0a7 100644
--- a/easybuild/easyconfigs/n/NVSHMEM/NVSHMEM-3.3.20-gompi-2025b-CUDA-12.9.1.eb
+++ b/easybuild/easyconfigs/n/NVSHMEM/NVSHMEM-3.3.20-gompi-2025a-CUDA-12.8.0.eb
@@ -12,7 +12,7 @@ accessed with fine-grained GPU-initiated operations, CPU-initiated operations,
 and operations on CUDA streams.
 """
 
-toolchain = {'name': 'gompi', 'version': '2025b'}
+toolchain = {'name': 'gompi', 'version': '2025a'}
 
 source_urls = [
     'https://github.com/NVIDIA/nvshmem/releases/download/v%(version)s-0/',
@@ -29,15 +29,15 @@ checksums = [
 ]
 
 builddependencies = [
-    ('Autotools', '20250527'),
-    ('pkgconf', '2.4.3'),
-    ('CMake', '4.0.3'),
+    ('Autotools', '20240712'),
+    ('pkgconf', '2.3.0'),
+    ('CMake', '3.31.3'),
 ]
 
 dependencies = [
-    ('CUDA', '12.9.1', '', SYSTEM),
+    ('CUDA', '12.8.0', '', SYSTEM),
     ('NCCL', '2.27.7', versionsuffix),
-    ('UCX-CUDA', '1.19.0', versionsuffix),
+    ('UCX-CUDA', '1.18.0', versionsuffix),
 ]
 
 configopts = '-DNVSHMEM_USE_GDRCOPY=1 '
Diff against NVSHMEM-2.8.0-gompi-2022a-CUDA-11.7.0.eb

easybuild/easyconfigs/n/NVSHMEM/NVSHMEM-2.8.0-gompi-2022a-CUDA-11.7.0.eb

diff --git a/easybuild/easyconfigs/n/NVSHMEM/NVSHMEM-2.8.0-gompi-2022a-CUDA-11.7.0.eb b/easybuild/easyconfigs/n/NVSHMEM/NVSHMEM-3.3.20-gompi-2025a-CUDA-12.8.0.eb
index d7251a8f5b..51eb1af0a7 100644
--- a/easybuild/easyconfigs/n/NVSHMEM/NVSHMEM-2.8.0-gompi-2022a-CUDA-11.7.0.eb
+++ b/easybuild/easyconfigs/n/NVSHMEM/NVSHMEM-3.3.20-gompi-2025a-CUDA-12.8.0.eb
@@ -1,7 +1,7 @@
-easyblock = 'ConfigureMake'
+easyblock = 'CMakeMake'
 
 name = 'NVSHMEM'
-version = '2.8.0'
+version = '3.3.20'
 versionsuffix = '-CUDA-%(cudaver)s'
 
 homepage = 'https://developer.nvidia.com/nvshmem'
@@ -12,53 +12,46 @@ accessed with fine-grained GPU-initiated operations, CPU-initiated operations,
 and operations on CUDA streams.
 """
 
-toolchain = {'name': 'gompi', 'version': '2022a'}
+toolchain = {'name': 'gompi', 'version': '2025a'}
 
-download_instructions = """The sources of NVSHMEM can be downloaded at NVIDIA's webpage when you have signed up for
-their (free) developer program:
-https://developer.nvidia.com/nvshmem-downloads"""
-
-sources = ['%(namelower)s_src_%(version)s-3.txz']
-checksums = ['7d4ef226630a94b587d18e02c27decc8b41d6f4ee52a26e25644b23cd18da81f']
+source_urls = [
+    'https://github.com/NVIDIA/nvshmem/releases/download/v%(version)s-0/',
+    'https://developer.download.nvidia.com/compute/redist/nvshmem/%(version)s/source/',
+]
+sources = ['%(namelower)s_src_cuda12-all-all-%(version)s.tar.gz']
+patches = ['NVSHMEM-3.3.20_update_cxx_standard.patch']
+
+checksums = [
+    # nvshmem_src_cuda12-all-all-3.3.20.tar.gz
+    '96ec9620e82ec90de92c7d61a7ba03c0eba05075bf10e1fc4a066d45e7f7d21f',
+    # NVSHMEM-3.3.20_update_cxx_standard.patch
+    '560eda0fb6e44c8f7666fb18a87d5b6505f0fb77316908718df6e835db52b49f'
+]
 
 builddependencies = [
-    ('Autotools', '20220317'),
-    ('pkgconf', '1.8.0'),
+    ('Autotools', '20240712'),
+    ('pkgconf', '2.3.0'),
+    ('CMake', '3.31.3'),
 ]
 
 dependencies = [
-    ('CUDA', '11.7.0', '', SYSTEM),
-    ('UCX-CUDA', '1.12.1', versionsuffix),
-    ('NCCL', '2.12.12', versionsuffix),
+    ('CUDA', '12.8.0', '', SYSTEM),
+    ('NCCL', '2.27.7', versionsuffix),
+    ('UCX-CUDA', '1.18.0', versionsuffix),
 ]
 
-skipsteps = ['configure']
-
-prebuildopts = 'export %s &&' % ' '.join([
-    'NVSHMEM_USE_GDRCOPY=1',
-    'GDRCOPY_HOME=${EBROOTGDRCOPY}',
-
-    'MPI_HOME=${EBROOTOPENMPI}',
-    'NVSHMEM_MPI_SUPPORT=1',
-    'NVSHMEMTEST_USE_MPI_LAUNCHER=1',
-
-    'NCCL_HOME=${EBROOTNCCL}',
-    'NVSHMEM_USE_NCCL=1',
-
-    'NVSHMEM_BUILDDIR=%(builddir)s',
-    'NVSHMEM_EXAMPLES_BUILDDIR=${NVSHMEM_BUILDDIR}/examples/obj',
-    'NVSHMEM_OTHERTEST_BUILDDIR=${NVSHMEM_BUILDDIR}/othertest/obj',
-    'NVSHMEM_TEST_BUILDDIR=${NVSHMEM_BUILDDIR}/test/obj',
-    'NVSHMEM_PERFTEST_BUILDDIR=${NVSHMEM_BUILDDIR}/perftest/obj',
-
-    'NVSHMEM_PREFIX=%(installdir)s',
-    'NVSHMEM_EXAMPLES_INSTALL=${NVSHMEM_PREFIX}/examples',
-    'NVSHMEM_OTHERTEST_INSTALL=${NVSHMEM_PREFIX}/othertest',
-    'NVSHMEM_PERFTEST_INSTALL=${NVSHMEM_PREFIX}/perftest',
-    'NVSHMEM_TEST_INSTALL=${NVSHMEM_PREFIX}/test',
-])
-
-preinstallopts = prebuildopts
+configopts = '-DNVSHMEM_USE_GDRCOPY=1 '
+configopts += '-DGDRCOPY_HOME=${EBROOTGDRCOPY} '
+configopts += '-DMPI_HOME=${EBROOTOPENMPI} '
+configopts += '-DNVSHMEM_MPI_SUPPORT=1 '
+configopts += '-DNVSHMEMTEST_USE_MPI_LAUNCHER=1 '
+configopts += '-DNCCL_HOME=${EBROOTNCCL} '
+configopts += '-DNVSHMEM_USE_NCCL=1 '
+# configopts += '-DNVSHMEM_IBGDA_SUPPORT=1 '
+configopts += '-DNVSHMEM_PREFIX=%(installdir)s '
+# NVSHMEM builds a wheel package if this option is enabled.
+# Properly installing the Python bindings is better handled in a separate EC.
+configopts += '-DNVSHMEM_BUILD_PYTHON_LIB=OFF '
 
 sanity_check_paths = {
     'files': ['lib/libnvshmem.a', 'lib/nvshmem_bootstrap_mpi.%s' % SHLIB_EXT],
Diff against NVSHMEM-2.7.0-gompi-2022a-CUDA-11.7.0.eb

easybuild/easyconfigs/n/NVSHMEM/NVSHMEM-2.7.0-gompi-2022a-CUDA-11.7.0.eb

diff --git a/easybuild/easyconfigs/n/NVSHMEM/NVSHMEM-2.7.0-gompi-2022a-CUDA-11.7.0.eb b/easybuild/easyconfigs/n/NVSHMEM/NVSHMEM-3.3.20-gompi-2025a-CUDA-12.8.0.eb
index 3a5d8323c9..51eb1af0a7 100644
--- a/easybuild/easyconfigs/n/NVSHMEM/NVSHMEM-2.7.0-gompi-2022a-CUDA-11.7.0.eb
+++ b/easybuild/easyconfigs/n/NVSHMEM/NVSHMEM-3.3.20-gompi-2025a-CUDA-12.8.0.eb
@@ -1,7 +1,7 @@
-easyblock = 'ConfigureMake'
+easyblock = 'CMakeMake'
 
 name = 'NVSHMEM'
-version = '2.7.0'
+version = '3.3.20'
 versionsuffix = '-CUDA-%(cudaver)s'
 
 homepage = 'https://developer.nvidia.com/nvshmem'
@@ -12,53 +12,46 @@ accessed with fine-grained GPU-initiated operations, CPU-initiated operations,
 and operations on CUDA streams.
 """
 
-toolchain = {'name': 'gompi', 'version': '2022a'}
+toolchain = {'name': 'gompi', 'version': '2025a'}
 
-download_instructions = """The sources of NVSHMEM can be downloaded at NVIDIA's webpage when you have signed up for
-their (free) developer program:
-https://developer.nvidia.com/nvshmem-downloads"""
-
-sources = ['%(namelower)s_src_%(version)s-6.txz']
-checksums = ['23ed9b0187104dc87d5d2bc1394b6f5ff29e8c19138dc019d940b109ede699df']
+source_urls = [
+    'https://github.com/NVIDIA/nvshmem/releases/download/v%(version)s-0/',
+    'https://developer.download.nvidia.com/compute/redist/nvshmem/%(version)s/source/',
+]
+sources = ['%(namelower)s_src_cuda12-all-all-%(version)s.tar.gz']
+patches = ['NVSHMEM-3.3.20_update_cxx_standard.patch']
+
+checksums = [
+    # nvshmem_src_cuda12-all-all-3.3.20.tar.gz
+    '96ec9620e82ec90de92c7d61a7ba03c0eba05075bf10e1fc4a066d45e7f7d21f',
+    # NVSHMEM-3.3.20_update_cxx_standard.patch
+    '560eda0fb6e44c8f7666fb18a87d5b6505f0fb77316908718df6e835db52b49f'
+]
 
 builddependencies = [
-    ('Autotools', '20220317'),
-    ('pkgconf', '1.8.0'),
+    ('Autotools', '20240712'),
+    ('pkgconf', '2.3.0'),
+    ('CMake', '3.31.3'),
 ]
 
 dependencies = [
-    ('CUDA', '11.7.0', '', SYSTEM),
-    ('UCX-CUDA', '1.12.1', versionsuffix),
-    ('NCCL', '2.12.12', versionsuffix),
+    ('CUDA', '12.8.0', '', SYSTEM),
+    ('NCCL', '2.27.7', versionsuffix),
+    ('UCX-CUDA', '1.18.0', versionsuffix),
 ]
 
-skipsteps = ['configure']
-
-prebuildopts = 'export %s &&' % ' '.join([
-    'NVSHMEM_USE_GDRCOPY=1',
-    'GDRCOPY_HOME=${EBROOTGDRCOPY}',
-
-    'MPI_HOME=${EBROOTOPENMPI}',
-    'NVSHMEM_MPI_SUPPORT=1',
-    'NVSHMEMTEST_USE_MPI_LAUNCHER=1',
-
-    'NCCL_HOME=${EBROOTNCCL}',
-    'NVSHMEM_USE_NCCL=1',
-
-    'NVSHMEM_BUILDDIR=%(builddir)s',
-    'NVSHMEM_EXAMPLES_BUILDDIR=${NVSHMEM_BUILDDIR}/examples/obj',
-    'NVSHMEM_OTHERTEST_BUILDDIR=${NVSHMEM_BUILDDIR}/othertest/obj',
-    'NVSHMEM_TEST_BUILDDIR=${NVSHMEM_BUILDDIR}/test/obj',
-    'NVSHMEM_PERFTEST_BUILDDIR=${NVSHMEM_BUILDDIR}/perftest/obj',
-
-    'NVSHMEM_PREFIX=%(installdir)s',
-    'NVSHMEM_EXAMPLES_INSTALL=${NVSHMEM_PREFIX}/examples',
-    'NVSHMEM_OTHERTEST_INSTALL=${NVSHMEM_PREFIX}/othertest',
-    'NVSHMEM_PERFTEST_INSTALL=${NVSHMEM_PREFIX}/perftest',
-    'NVSHMEM_TEST_INSTALL=${NVSHMEM_PREFIX}/test',
-])
-
-preinstallopts = prebuildopts
+configopts = '-DNVSHMEM_USE_GDRCOPY=1 '
+configopts += '-DGDRCOPY_HOME=${EBROOTGDRCOPY} '
+configopts += '-DMPI_HOME=${EBROOTOPENMPI} '
+configopts += '-DNVSHMEM_MPI_SUPPORT=1 '
+configopts += '-DNVSHMEMTEST_USE_MPI_LAUNCHER=1 '
+configopts += '-DNCCL_HOME=${EBROOTNCCL} '
+configopts += '-DNVSHMEM_USE_NCCL=1 '
+# configopts += '-DNVSHMEM_IBGDA_SUPPORT=1 '
+configopts += '-DNVSHMEM_PREFIX=%(installdir)s '
+# NVSHMEM builds a wheel package if this option is enabled.
+# Properly installing the Python bindings is better handled in a separate EC.
+configopts += '-DNVSHMEM_BUILD_PYTHON_LIB=OFF '
 
 sanity_check_paths = {
     'files': ['lib/libnvshmem.a', 'lib/nvshmem_bootstrap_mpi.%s' % SHLIB_EXT],

@lexming
Contributor Author

lexming commented Feb 13, 2026

Test report by @lexming
Using easyblocks from PR(s) easybuilders/easybuild-easyblocks#4063
SUCCESS
Build succeeded for 2 out of 2 (total: 18 mins 18 secs) (2 easyconfigs in total)
node700.hydra.os - Linux Rocky Linux 9.7 (Blue Onyx), x86_64, AMD EPYC 9535 64-Core Processor, Python 3.9.25
See https://gist.github.com/lexming/c1e661e1e7f2a8d0907dc0a597cf654b for a full test report.

Co-authored-by: Pavel Tomanek <99190809+pavelToman@users.noreply.github.com>
@boegel boegel added this to the next release (5.2.2?) milestone Mar 25, 2026
@boegel
Member

boegel commented Mar 25, 2026

@boegelbot please test @ jsc-zen3
EB_ARGS="--include-easyblocks-from-pr 4063"

@boegelbot
Collaborator

@boegel: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de

PR test command 'if [[ develop != 'develop' ]]; then EB_BRANCH=develop ./easybuild_develop.sh 2> /dev/null 1>&2; EB_PREFIX=/home/boegelbot/easybuild/develop source init_env_easybuild_develop.sh; fi; EB_PR=25266 EB_ARGS="--include-easyblocks-from-pr 4063" EB_CONTAINER= EB_REPO=easybuild-easyconfigs EB_BRANCH=develop /opt/software/slurm/bin/sbatch --job-name test_PR_25266 --ntasks=8 ~/boegelbot/eb_from_pr_upload_jsc-zen3.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 10088

Test results coming soon (I hope)...

Details

- notification for comment with ID 4126553993 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@boegelbot
Collaborator

Test report by @boegelbot
Using easyblocks from PR(s) easybuilders/easybuild-easyblocks#4063
SUCCESS
Build succeeded for 2 out of 2 (total: 10 mins 9 secs) (2 easyconfigs in total)
jsczen3c4.int.jsc-zen3.fz-juelich.de - Linux Rocky Linux 9.7, x86_64, AMD EPYC-Milan Processor (zen3), Python 3.9.25
See https://gist.github.com/boegelbot/f5a1ec1f5f44b984e3f21841939980a9 for a full test report.

@boegel
Member

boegel commented Mar 26, 2026

Test report by @boegel
Using easyblocks from PR(s) easybuilders/easybuild-easyblocks#4063
SUCCESS
Build succeeded for 2 out of 2 (total: 4 mins 52 secs) (2 easyconfigs in total)
node4414.skiddo.os - Linux RHEL 9.6, x86_64, AMD EPYC 9755 128-Core Processor (zen5), Python 3.9.21
See https://gist.github.com/boegel/c7d7e321f79ec4e5740d14b40084a26a for a full test report.

('nvidia-cuda-cupti-cu12', '12.8.90'),
('nvidia-cuda-nvrtc-cu12', '12.8.93'),
('nvidia-cuda-runtime-cu12', '12.8.90'),
('nvidia-cufft-cu12', '11.3.3.83'),
Member

@lexming Is there a way to determine these versions from the CUDA 12.8.0 installation?

Member

@lexming ping?
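
On the question of deriving the package versions from the CUDA installation, one possible approach is sketched below. It assumes the CUDA toolkit ships a `version.json` file at its installation root that lists per-component versions, and that those component versions match the versions of the corresponding wheels; neither assumption is confirmed in this PR. The `COMPONENT_TO_WHEEL` mapping and the `wheel_versions` helper are hypothetical, and the sample JSON is an illustrative stand-in filled with the versions from the snippet under review, not data read from a real installation.

```python
import json

# Hypothetical mapping from version.json component keys to wheel names.
COMPONENT_TO_WHEEL = {
    "cuda_cupti": "nvidia-cuda-cupti-cu12",
    "cuda_nvrtc": "nvidia-cuda-nvrtc-cu12",
    "cuda_cudart": "nvidia-cuda-runtime-cu12",
    "libcufft": "nvidia-cufft-cu12",
}

# Illustrative stand-in for <CUDA root>/version.json (assumed layout).
# Values are copied from the easyconfig snippet, not from a real install.
SAMPLE_VERSION_JSON = """
{
  "cuda_cupti":  {"name": "CUPTI", "version": "12.8.90"},
  "cuda_nvrtc":  {"name": "CUDA NVRTC", "version": "12.8.93"},
  "cuda_cudart": {"name": "CUDA Runtime (cudart)", "version": "12.8.90"},
  "libcufft":    {"name": "cuFFT", "version": "11.3.3.83"}
}
"""


def wheel_versions(version_json_text, mapping=COMPONENT_TO_WHEEL):
    """Return (wheel_name, version) tuples for the components that are present."""
    components = json.loads(version_json_text)
    return [
        (wheel, components[key]["version"])
        for key, wheel in mapping.items()
        if key in components
    ]


for entry in wheel_versions(SAMPLE_VERSION_JSON):
    print(entry)
```

In practice the JSON text would be read from the file under the CUDA installation prefix instead of the inline sample, and the result would need to be checked against the versions actually published for the wheels.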


Labels

2025a (issues & PRs related to 2025a common toolchains), new, update


4 participants