Skip to content

don't forcibly set $LD_LIBRARY_PATH in LAMMPS sanity check#3776

Merged
Crivella merged 2 commits intoeasybuilders:developfrom
lorisercole:fix/lammps-tests-cuda
Jun 18, 2025
Merged

don't forcibly set $LD_LIBRARY_PATH in LAMMPS sanity check#3776
Crivella merged 2 commits intoeasybuilders:developfrom
lorisercole:fix/lammps-tests-cuda

Conversation

@lorisercole
Copy link
Copy Markdown
Contributor

With the foss-2024a toolchain (CUDA 12.6+), prepending LD_LIBRARY_PATH with LIBRARY_PATH causes LAMMPS to crash, likely because the dynamic linker gets pointed to the CUDA stubs libs instead of the system libs. Strangely, this is not a problem for older toolchains. At the same time, it appears that setting LD_LIBRARY_PATH has no effect on the functionality of the code (it seems that Python is able to retrieve liblammps.so without it), so I propose we just get rid of it, unless there are actual scenarios where it causes problems.

With the `foss-2024a` toolchain (CUDA 12.6+), prepending `LD_LIBRARY_PATH` with `LIBRARY_PATH` causes LAMMPS to crash, likely because the dynamic linker gets pointed to the CUDA stubs libs instead of the system libs.
Strangely, this is not a problem for older toolchains.
At the same time, it appears that setting `LD_LIBRARY_PATH` has no effect on the functionality of the code (it seems that Python is able to retrieve `liblammps.so` without it), so I propose we just get rid of it, unless there are actual scenarios where it causes problems.
@ocaisa
Copy link
Copy Markdown
Member

ocaisa commented Jun 16, 2025

I'd agree with this, this is no longer needed due to #3352

@casparvl
Copy link
Copy Markdown
Contributor

I'm with Alan on this. The fact you can't reproduce the original issue is likely because indeed it was resolved in the meantime, likely by #3352 . And even if it isn't, I'd say it's a ctypes issue and should be fixed there - not in the EasyBlock like this. Especially not since it's causing other issues.

In the ctypes fix, we can consider if it's needed to filter e.g. any prefix containing stubs as a subdir... It'd again be a big hammer, but not all that different from what we do in our RPATH wrappers (I think).

@Crivella
Copy link
Copy Markdown
Contributor

@boegelbot please test @ jsc-zen3-a100
EB_ARGS="--installpath=/tmp/$USER/ebpr-3776 LAMMPS-29Aug2024_update2-foss-2023a-kokkos.eb LAMMPS-2Aug2023_update2-foss-2023a-kokkos-CUDA-12.1.1.eb LAMMPS-23Jun2022-foss-2022a-kokkos.eb LAMMPS-29Aug2024-foss-2023b-kokkos.eb"
CORE_CNT=16

@boegelbot
Copy link
Copy Markdown

@Crivella: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de

PR test command 'if [[ develop != 'develop' ]]; then EB_BRANCH=develop ./easybuild_develop.sh 2> /dev/null 1>&2; EB_PREFIX=/home/boegelbot/easybuild/develop source init_env_easybuild_develop.sh; fi; EB_PR=3776 EB_ARGS="--installpath=/tmp/$USER/ebpr-3776 LAMMPS-29Aug2024_update2-foss-2023a-kokkos.eb LAMMPS-2Aug2023_update2-foss-2023a-kokkos-CUDA-12.1.1.eb LAMMPS-23Jun2022-foss-2022a-kokkos.eb LAMMPS-29Aug2024-foss-2023b-kokkos.eb" EB_REPO=easybuild-easyblocks EB_BRANCH=develop /opt/software/slurm/bin/sbatch --job-name test_PR_3776 --ntasks="16" ~/boegelbot/eb_from_pr_upload_jsc-zen3.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 6871

Test results coming soon (I hope)...

Details

- notification for comment with ID 2981636876 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@boegelbot
Copy link
Copy Markdown

Test report by @boegelbot

Overview of tested easyconfigs (in order)

  • SUCCESS LAMMPS-29Aug2024_update2-foss-2023a-kokkos.eb
  • SUCCESS LAMMPS-2Aug2023_update2-foss-2023a-kokkos-CUDA-12.1.1.eb
  • SUCCESS LAMMPS-23Jun2022-foss-2022a-kokkos.eb
  • SUCCESS LAMMPS-29Aug2024-foss-2023b-kokkos.eb

Build succeeded for 4 out of 4 (4 easyconfigs in total)
jsczen3c1.int.jsc-zen3.fz-juelich.de - Linux Rocky Linux 9.5, x86_64, AMD EPYC-Milan Processor (zen3), Python 3.9.21
See https://gist.github.com/boegelbot/c51511d6e99ea628ba808a2cb4d8b184 for a full test report.

@Crivella
Copy link
Copy Markdown
Contributor

I would say in conjunction with easybuilders/easybuild-easyconfigs#23117 (comment) this has been extensively tested, not sure what happened with the CI, some problem with the GH runners?

@lorisercole
Copy link
Copy Markdown
Contributor Author

I would say in conjunction with easybuilders/easybuild-easyconfigs#23117 (comment) this has been extensively tested, not sure what happened with the CI, some problem with the GH runners?

Thanks. Not sure but it seems to be a temporary problem with the CI. Perhaps you can rerun them manually?

Copy link
Copy Markdown
Contributor

@Crivella Crivella left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Crivella Crivella added this to the next release (5.1.1?) milestone Jun 18, 2025
@Crivella
Copy link
Copy Markdown
Contributor

Going in, thanks @lorisercole!

@Crivella Crivella merged commit 475040e into easybuilders:develop Jun 18, 2025
17 checks passed
@lorisercole lorisercole deleted the fix/lammps-tests-cuda branch June 19, 2025 09:03
@boegel boegel changed the title lammps sanity checks: remove LD_LIBRARY_PATH setting don't forcibly set $LD_LIBRARY_PATH in LAMMPS sanity check Jul 2, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants