add patch to Python easyconfigs to fix ctypes when $LD_LIBRARY_PATH is not being set #23499

Merged
boegel merged 28 commits into easybuilders:develop from
dagonzalezfo:python-3.11.5-librarypath-prefix-support
Oct 22, 2025

Conversation

@dagonzalezfo
Contributor

@dagonzalezfo dagonzalezfo commented Jul 25, 2025

@casparvl
Contributor

@dagonzalezfo I'm adding some reminders, also for myself, that we should include descriptive comments explaining what we do where. For now, they help me understand what's going on, but you don't have to implement them in this patch. The patch is just here so we can test the functionality of the change :)

@casparvl
Contributor

FYI: I want to test this a bit more to make sure it also still works for librosa, with which we had our original issue, but that requires me to locally build librosa for 2023b, so... takes a bit of time.

In the meantime, I created this test script:

$ cat test.sh
#!/bin/bash

module purge
module load EESSI
module load EESSI-extend
module load Python/3.11.5-GCCcore-13.2.0
module load SDL2/2.28.5-GCCcore-13.2.0

# Testing find_library by library name

for lib in foo c SDL2; do
  echo "Running find_library for $lib"
  python -c "import ctypes; from ctypes.util import find_library; print(find_library('$lib'))"
  echo ""
done

# Testing library loads by full name
for lib in libfoo libSDL2.so libSDL2-2.0.so.0 libSDL2-2.0.so.0.2800.5 libc.so libc.so.6; do
  echo "Instantiating CDLL object for $lib"
  python -c "import ctypes; print(ctypes.CDLL('$lib'))"
  echo "Running LoadLibrary for $lib"
  python -c "import ctypes; print(ctypes.cdll.LoadLibrary('$lib'))"
  echo "Running find_library for $lib"
  python -c "import ctypes; from ctypes.util import find_library; print(find_library('$lib'))"
  echo ""
done

# Testing library loads by full path
for lib in /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen2/software/SDL2/2.28.5-GCCcore-13.2.0/lib64/libSDL2.so /cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/lib64/libc.so.6; do
  echo "Instantiating CDLL object for $lib"
  python -c "import ctypes; print(ctypes.CDLL('$lib'))"
  echo "Running LoadLibrary for $lib"
  python -c "import ctypes; print(ctypes.cdll.LoadLibrary('$lib'))"
  echo ""
done

I may add some more libraries later, but I think this already covers a fair range of cases: from libs in the compat layer, to non-versioned libnames, to versioned libnames, to non-existing libnames. All seem to produce the expected result (with the regex change I suggested in my review).
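
The same kind of check can also be driven from a single Python process, which makes results easier to compare between builds. This is a minimal sketch using only the standard library; the helper name `probe` and the choice of library names are illustrative and not part of the PR:

```python
import ctypes
from ctypes.util import find_library


def probe(name):
    """Return (find_library result, whether ctypes.CDLL(name) could load it)."""
    found = find_library(name)
    try:
        ctypes.CDLL(name)
        loaded = True
    except OSError:
        # dlopen() could not resolve this name
        loaded = False
    return found, loaded


if __name__ == "__main__":
    # Short names for find_library, full names for CDLL, as in test.sh
    for lib in ["foo", "c", "libfoo", "libc.so.6"]:
        print(lib, probe(lib))
```

Results depend on the loaded modules and on the host, so no expected output is listed here.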

Contributor

@casparvl casparvl left a comment

With the addition of easybuilders/easybuild-easyblocks#3860 I think there are only a few things left:

  • Accept my PR to your feature branch dagonzalezfo#1 (if you agree, of course)

  • The change of the regex on line 35 to

expr = os.fsencode(r'^[^\(\)\s]*%s[^\(\)\s]*' % re.escape(name))
  • The change of overwriting self._name only if util.find_library() doesn't return None

  • Make sure the sysroot doesn't appear in the patch anymore, otherwise it will not apply (patch will apply before the easyconfig changes this sysroot thing).

  • Add some more comments in the code, i.e. in the lines that the patch file adds to the final code. I think some of it is useful, non-trivial, etc. You can use my own comments on those sections of the code for inspiration.
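
To illustrate what the suggested regex accepts and rejects, here is a small sketch. The two sample lines are hypothetical and only loosely modelled on linker trace output; `build_expr` is an illustrative wrapper around the expression from the review, not a function from the patch:

```python
import os
import re


def build_expr(name):
    # Same pattern as suggested in the review: the name may be preceded and
    # followed by any characters except parentheses and whitespace.
    return os.fsencode(r'^[^\(\)\s]*%s[^\(\)\s]*' % re.escape(name))


# Hypothetical sample lines
plain = b'/opt/sw/SDL2/lib64/libSDL2-2.0.so.0'
archive_member = b'(/usr/lib64/libc_nonshared.a)elf-init.oS'

m = re.match(build_expr('libSDL2'), plain)
print(m.group(0) if m else None)   # the full path is matched
print(re.match(build_expr('libc'), archive_member))   # None: line starts with '('
```

Because the character classes exclude parentheses and the pattern is anchored at the start, a line that begins with a parenthesised archive member cannot match.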

And then, I will give it a final test drive (after rebuilding python with --include-easyblocks-from-pr 3860 of course)

@dagonzalezfo dagonzalezfo requested a review from casparvl July 31, 2025 08:06
@dagonzalezfo
Contributor Author

Hi @casparvl,

I included all the requested changes and tested them locally, and it seems to work correctly. Would you mind confirming?

Best,

Contributor

@casparvl casparvl left a comment

Hmmm, two things:

  • I think the regex is a bit too generic: we're now assuming that any path containing stubs in its name can be filtered out. I think we should check that it also contains CUDA, with a case-insensitive match. You can check with a regular expression such as pattern = re.compile(r'cuda.*stubs', re.IGNORECASE), for example.

  • If I do find_library('cuda'), it will actually find the library through _findLib_gcc, and not _findLib_ld. Since this (by design) checks which library would have been compiled against, it returns the stubs library. That is, essentially, by design of the find_library call.

The second can be fixed by two small changes to the code. The last lines of _findLib_gcc(name) should be replaced with:

            file = os.fsdecode(file)
            if re.search(r'cuda.*stubs', file, re.IGNORECASE):
                continue
            return file

Similarly, the last lines of _findLib_gcc_fullname should be

            if re.search(r'cuda.*stubs', file, re.IGNORECASE):
                continue
            return file

So that if the stubs lib is picked up, it is just ignored and the search continues.
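
The effect of the stricter `cuda.*stubs` check can be sketched in isolation. The helper `first_non_stub` and the candidate paths below are hypothetical; the actual patch applies the check inside the ctypes finder functions rather than over a list:

```python
import re

CUDA_STUBS = re.compile(r'cuda.*stubs', re.IGNORECASE)


def first_non_stub(candidates):
    """Return the first candidate path that is not a CUDA stubs library."""
    for path in candidates:
        if CUDA_STUBS.search(path):
            # Skip the link-time stub and keep searching
            continue
        return path
    return None


# Hypothetical candidate paths
paths = [
    '/apps/CUDA/12.1.1/stubs/lib64/libcuda.so',
    '/apps/foo/stubs/libfoo.so',
    '/lib64/libcuda.so.1',
]
print(first_non_stub(paths))
```

This prints `/apps/foo/stubs/libfoo.so`: the CUDA stub is skipped, while a non-CUDA path that merely contains `stubs` is no longer filtered out.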

In the end, there is one remaining issue: it does not pick up the driver's libcuda.so.1 either. Normally, this would have been caught by _findSoname_ldconfig:

$ /sbin/ldconfig -p | grep libcuda.so
        libcuda.so.1 (libc6,x86-64) => /lib64/libcuda.so.1
        libcuda.so (libc6,x86-64) => /lib64/libcuda.so

But since we are running with a sysroot, we are calling <sysroot>/sbin/ldconfig instead, and that one doesn't know about /lib64/libcuda.so.
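
To make the difference concrete: the lookup only sees whatever the chosen ldconfig binary prints. The sketch below parses `ldconfig -p` style output for a given library name; `soname_paths` is a hypothetical helper and is not how ctypes itself parses this output:

```python
import re


def soname_paths(ldconfig_output, name):
    """Map each 'lib<name>.so*' entry in `ldconfig -p` output to its path."""
    pattern = re.compile(
        r'^\s*(lib%s\.so\S*)\s+\(.*?\)\s+=>\s+(\S+)\s*$' % re.escape(name),
        re.MULTILINE,
    )
    return dict(pattern.findall(ldconfig_output))


# Output copied from the host's /sbin/ldconfig run shown above
host_output = """\
        libcuda.so.1 (libc6,x86-64) => /lib64/libcuda.so.1
        libcuda.so (libc6,x86-64) => /lib64/libcuda.so
"""
print(soname_paths(host_output, 'cuda'))
```

Fed with the output of the sysroot's ldconfig, which has no libcuda entries, the same function would return an empty dict.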

Note that for calls to ctypes.CDLL('libcuda.so'), this is actually not an issue: they will see that find_library(...) returns None, and thus call dlopen('libcuda.so') which (correctly!) opens the driver from the default search path.
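
The fallback described here can be sketched as follows. The function `resolve_name` and its `finder` parameter are hypothetical and only illustrate the "override the name only if find_library() returns something" rule; the real change lives in Lib/ctypes/__init__.py:

```python
from ctypes.util import find_library


def resolve_name(name, finder=find_library):
    """Return the finder's result, or the original name if nothing was found."""
    found = finder(name)
    # Only override the requested name when the lookup succeeded; otherwise
    # leave it to dlopen() and the default search path.
    return found if found is not None else name


# Stand-in finders, so the example does not depend on the host system
print(resolve_name('libcuda.so', finder=lambda n: None))
print(resolve_name('libSDL2.so', finder=lambda n: '/opt/sw/lib64/libSDL2.so'))
```

The first call prints `libcuda.so` unchanged, the second prints the resolved path.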

So... we are almost at identical results as the regular ctypes with LD_LIBRARY_PATH, with find_library('cuda') being the one exception.

I can come up with two solutions:

Add yet another function to this return statement:

return _findSoname_ldconfig(name) or \
                   _findLib_gcc(name) or _findLib_ld(name) or \
                   _findLib_gcc_fullname(name) or \
                   _findLib_host_ldconfig(name)

where _findLib_host_ldconfig(name) then uses the host's ldconfig from /sbin/ldconfig to locate libraries. The tricky thing is that this provides a fallback for any library that is not found in the compatibility layer. That may lead to (unexpected) differences in behaviour of EESSI across systems, as codes that try to load libfoo may work on some (because the host provides one) and not on others (because the compat layer and software layer don't).

Handle the cuda case explicitly in _findSoname_ldconfig, i.e. something like:

if name == 'cuda':
    ldconfig = '/sbin/ldconfig'
else:
    ldconfig = '/cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/sbin/ldconfig'
...
with subprocess.Popen([ldconfig, '-p'],
...

Note that this is a bit tricky to implement: it is only needed if a sysroot is set, which is something we don't know at the patch/easyconfig level, but only at the easyblock level. I.e. this change should not be part of the patch, but part of the easyblock logic that currently replaces the ldconfig path.
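
A sketch of the second option, assuming the sysroot is passed in explicitly. The helper `select_ldconfig` and its `host_only` parameter are hypothetical; in practice this decision would sit in the easyblock logic mentioned above rather than in a standalone function:

```python
import os


def select_ldconfig(name, sysroot=None, host_only=('cuda',)):
    """Pick the ldconfig binary to query for `name` (hypothetical helper)."""
    host_ldconfig = '/sbin/ldconfig'
    if not sysroot or name in host_only:
        # No sysroot, or a driver library that only the host can provide
        return host_ldconfig
    return os.path.join(sysroot, 'sbin', 'ldconfig')


compat = '/cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64'
print(select_ldconfig('cuda', compat))   # host ldconfig
print(select_ldconfig('SDL2', compat))   # ldconfig from the compat layer
```

Keeping the host-only names in an explicit tuple limits the host fallback to driver libraries, which avoids the cross-system differences described for the first option.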

What do you think @dagonzalezfo ?

@dagonzalezfo
Contributor Author

Hi @casparvl ,

Thanks for having a look at it.
Regarding the stub regex enhancement, I will add the changes later today.

About libcuda.so: isn't it exposed through some sort of host_injections on an EESSI-related path?
Moreover, is this path used by ld to find the libraries, or how are they actually exposed?
(From a quick check, it seems that it is exposed using LD_PRELOAD, and these are not picked up by the ldconfig command.)
If this is the case, there is no need to add the /sbin/ldconfig option.

Would you mind clarifying?

@dagonzalezfo
Contributor Author

dagonzalezfo commented Aug 11, 2025

I just realized that maybe defining a new function for an LD_PRELOAD override could be helpful. Or adding it to _findLib_ld and reordering the return:

Option 1:

def _findLib_ld_preload():
    ...

return _findLib_ld_preload() or \
       _findSoname_ldconfig(name) or \
       _findLib_gcc(name) or _findLib_ld(name) or \
       _findLib_gcc_fullname(name)

Option 2:

return _findLib_ld(name) or \
       _findSoname_ldconfig(name) or \
       _findLib_gcc(name) or \
       _findLib_gcc_fullname(name)

Any opinion?
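
What reordering such a chain changes can be shown with stand-in finders. Everything below is hypothetical (the finder names and the paths they return); it only demonstrates that in an `a(name) or b(name)` chain the first finder with a non-empty result wins:

```python
def first_hit(name, finders):
    """Return the first non-empty result, like an `a(name) or b(name)` chain."""
    for finder in finders:
        result = finder(name)
        if result:
            return result
    return None


# Hypothetical stand-ins for two of the finder functions
def from_ldconfig(name):
    return '/compat/lib64/lib%s.so.1' % name


def from_ld(name):
    return '/software/lib64/lib%s.so' % name


print(first_hit('foo', [from_ldconfig, from_ld]))
print(first_hit('foo', [from_ld, from_ldconfig]))
```

The first call prints the ldconfig result and the second the ld result, so moving a finder to the front changes the answer whenever more than one finder can resolve the name.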

@casparvl
Contributor

> About libcuda.so: isn't it exposed through some sort of host_injections on an EESSI-related path?
> Moreover, is this path used by ld to find the libraries, or how are they actually exposed?
> (From a quick check, it seems that it is exposed using LD_PRELOAD, and these are not picked up by the ldconfig command.)
> If this is the case, there is no need to add the /sbin/ldconfig option.

Very good point actually, I did not think about that. What I don't know is if that's on anyone's path. We may need @ocaisa to clarify this, but I guess the whole reason for symlinking the drivers was to expose them to the linker from the prefix? I have to admit, I tested this on a CPU node which does have the driver installed (a bit weird, but ok). I figured not finding the libcuda.so.1 was an issue with this PR, but maybe it's somehow related to the driver symlinking... I'll at the very least try to re-test on a GPU node. And see if I can see the host_injections in any search path of the linker.

@casparvl
Contributor

> I just realized that maybe defining a new function for an LD_PRELOAD override could be helpful. Or adding it to _findLib_ld and reordering the return:
>
> Option 1:
>
> def _findLib_ld_preload():
>     ...
>
> return _findLib_ld_preload() or \
>        _findSoname_ldconfig(name) or \
>        _findLib_gcc(name) or _findLib_ld(name) or \
>        _findLib_gcc_fullname(name)
>
> Option 2:
>
> return _findLib_ld(name) or \
>        _findSoname_ldconfig(name) or \
>        _findLib_gcc(name) or \
>        _findLib_gcc_fullname(name)
>
> Any opinion?

Hmm, I'm not sure I understand what you mean here. Would you like to mimic the behavior of LD_PRELOAD, i.e. if LD_PRELOAD is set, the runtime linker doesn't even try to load the library, since the library is already in memory?

I think that goes beyond the 'normal' behavior. I.e. even without RPATH/sysroot, LD_PRELOAD isn't taken into account by ctypes. I wouldn't implement it here either. Although maybe you are thinking about the fact that LD_PRELOAD is sort of the only route that users can take if sites don't install the drivers in the host_injections dir? Even then, it's a very large deviation. I'm also wondering if it's needed: if you LD_PRELOAD it, doesn't dlopen just return successfully, concluding that the lib was already in memory (without actually checking any search paths)? I'm not sure, I haven't tested. But I'd expect it might...

@boegel boegel changed the title from "Python 3.11.5 librarypath prefix support" to "add patch to Python easyconfigs to fix ctypes when $LD_LIBRARY_PATH is not being set" Oct 22, 2025
boegel previously approved these changes Oct 22, 2025
Member

@boegel boegel left a comment

lgtm

@boegel
Member

boegel commented Oct 22, 2025

Patch applies to all Python versions being touched here, I've checked that manually.

That's important to check, since the patch specified in patch_ctypes_ld_library_path is only actually applied when EasyBuild is configured to not update $LD_LIBRARY_PATH...

Patch applies cleanly for Python 3.11.x, and with minor offsets for other versions, like:

  • Python 3.9.5

    Details
    patching file Lib/ctypes/__init__.py
    Hunk #2 succeeded at 350 (offset -2 lines).
    patching file Lib/ctypes/util.py
    
  • Python 3.13.1

    Details
    patching file Lib/ctypes/__init__.py
    Hunk #1 succeeded at 15 (offset 1 line).
    Hunk #2 succeeded at 366 (offset 14 lines).
    patching file Lib/ctypes/util.py
    Hunk #1 succeeded at 111 with fuzz 2 (offset 12 lines).
    Hunk #2 succeeded at 134 (offset 12 lines).
    Hunk #3 succeeded at 158 (offset 12 lines).
    Hunk #4 succeeded at 173 (offset 12 lines).
    Hunk #5 succeeded at 313 (offset 12 lines).
    Hunk #6 succeeded at 324 (offset 12 lines).
    Hunk #7 succeeded at 333 (offset 12 lines).
    Hunk #8 succeeded at 364 (offset 12 lines).
    

Also still applies for Python 3.14.0:

Details
patching file Lib/ctypes/__init__.py
Hunk #1 succeeded at 16 (offset 2 lines).
Hunk #2 succeeded at 438 (offset 86 lines).
patching file Lib/ctypes/util.py
Hunk #1 succeeded at 205 with fuzz 2 (offset 106 lines).
Hunk #2 succeeded at 228 (offset 106 lines).
Hunk #3 succeeded at 252 (offset 106 lines).
Hunk #4 succeeded at 267 (offset 106 lines).
Hunk #5 succeeded at 407 (offset 106 lines).
Hunk #6 succeeded at 418 (offset 106 lines).
Hunk #7 succeeded at 427 (offset 106 lines).
Hunk #8 succeeded at 458 with fuzz 2 (offset 106 lines).

We may need to rework the patch file at some point, but we're good for now...
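
To spot the versions where a rework would matter first, the hunks that only applied with fuzz can be extracted from the `patch` output. A minimal sketch, assuming the output format shown above; `fuzzy_hunks` is a hypothetical helper and the sample is an excerpt of the Python 3.14.0 log:

```python
import re

HUNK = re.compile(
    r'^Hunk #(\d+) succeeded at (\d+)(?: with fuzz (\d+))?'
    r'(?: \(offset (-?\d+) lines?\))?\.$'
)


def fuzzy_hunks(patch_output):
    """Return (file, hunk number, fuzz) for every hunk applied with fuzz."""
    current_file = None
    hits = []
    for line in patch_output.splitlines():
        line = line.strip()
        if line.startswith('patching file '):
            current_file = line[len('patching file '):]
            continue
        m = HUNK.match(line)
        if m and m.group(3):
            hits.append((current_file, int(m.group(1)), int(m.group(3))))
    return hits


sample = """\
patching file Lib/ctypes/__init__.py
Hunk #1 succeeded at 16 (offset 2 lines).
patching file Lib/ctypes/util.py
Hunk #1 succeeded at 205 with fuzz 2 (offset 106 lines).
Hunk #2 succeeded at 228 (offset 106 lines).
Hunk #8 succeeded at 458 with fuzz 2 (offset 106 lines).
"""
print(fuzzy_hunks(sample))
```

Hunks with only an offset are ignored, since an offset alone means the context still matched exactly.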

@casparvl
Contributor

casparvl commented Oct 22, 2025

Test report by @casparvl
Using easyblocks from PR(s) easybuilders/easybuild-easyblocks#3860
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in total)
tcn398.local.snellius.surf.nl - Linux RHEL 9.4, x86_64, AMD EPYC 7H12 64-Core Processor (zen2), Python 3.13.4
See https://gist.github.com/casparvl/e38cb28126069845f5ca22bb766126a3 for a full test report.

edit: on top of EESSI 2025.06 (see also EESSI/software-layer#1254)

@casparvl
Contributor

casparvl commented Oct 22, 2025

Test report by @casparvl
Using easyblocks from PR(s) easybuilders/easybuild-easyblocks#3860
SUCCESS
Build succeeded for 3 out of 3 (3 easyconfigs in total)
tcn424.local.snellius.surf.nl - Linux RHEL 9.4 (Plow), x86_64, AMD EPYC 7H12 64-Core Processor (zen2), Python 3.11.4
See https://gist.github.com/casparvl/283c9b0b2677f935a6be0040b464cf19 for a full test report.

edit: on top of EESSI 2025.06 (see also EESSI/software-layer#1238 + EESSI/software-layer#1252)

@boegel
Member

boegel commented Oct 22, 2025

Test report by @boegel
Using easyblocks from PR(s) easybuilders/easybuild-easyblocks#3860
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in total)
node4201.shinx.os - Linux RHEL 9.6 (Plow), x86_64, AMD EPYC 9654 96-Core Processor, Python 3.13.4
See https://gist.github.com/boegel/4f76d3593fb14d3151c1edcd3b72372b for a full test report.

edit: on top of EESSI 2025.06 (see also EESSI/software-layer#1254)

@boegel
Member

boegel commented Oct 22, 2025

@boegelbot please test @ jsc-zen3
CORE_CNT=16
EB_ARGS="Python-3.9.5-GCCcore-10.3.0-bare.eb Python-3.10.8-GCCcore-12.2.0.eb Python-3.11.3-GCCcore-12.3.0.eb Python-3.13.1-GCCcore-14.2.0.eb --include-easyblocks-from-pr 3860 --installpath /tmp/$USER/pr23499"

@boegelbot
Collaborator

@boegel: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de

PR test command 'if [[ develop != 'develop' ]]; then EB_BRANCH=develop ./easybuild_develop.sh 2> /dev/null 1>&2; EB_PREFIX=/home/boegelbot/easybuild/develop source init_env_easybuild_develop.sh; fi; EB_PR=23499 EB_ARGS="Python-3.9.5-GCCcore-10.3.0-bare.eb Python-3.10.8-GCCcore-12.2.0.eb Python-3.11.3-GCCcore-12.3.0.eb Python-3.13.1-GCCcore-14.2.0.eb --include-easyblocks-from-pr 3860 --installpath /tmp/$USER/pr23499" EB_CONTAINER= EB_REPO=easybuild-easyconfigs EB_BRANCH=develop /opt/software/slurm/bin/sbatch --job-name test_PR_23499 --ntasks="16" ~/boegelbot/eb_from_pr_upload_jsc-zen3.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 8523

Test results coming soon (I hope)...

Details

- notification for comment with ID 3431844868 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@boegel
Member

boegel commented Oct 22, 2025

Test report by @boegel
Using easyblocks from PR(s) easybuilders/easybuild-easyblocks#3860
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in total)
node4201.shinx.os - Linux RHEL 9.6 (Plow), x86_64, AMD EPYC 9654 96-Core Processor, Python 3.11.4
See https://gist.github.com/boegel/6125fe20360acc20c49f216b62c2f70c for a full test report.

edit: on top of EESSI 2023.06

@boegel
Member

boegel commented Oct 22, 2025

Test report by @boegel
Using easyblocks from PR(s) easybuilders/easybuild-easyblocks#3860
SUCCESS
Build succeeded for 3 out of 3 (3 easyconfigs in total)
node3541.doduo.os - Linux RHEL 9.6 (Plow), x86_64, AMD EPYC 7552 48-Core Processor, Python 3.11.4
See https://gist.github.com/boegel/650f1c7b534d634de709c441d55d0d5b for a full test report.

edit: on top of EESSI 2023.06

@boegelbot
Collaborator

Test report by @boegelbot
Using easyblocks from PR(s) easybuilders/easybuild-easyblocks#3860
SUCCESS
Build succeeded for 4 out of 4 (4 easyconfigs in total)
jsczen3c2.int.jsc-zen3.fz-juelich.de - Linux Rocky Linux 9.6, x86_64, AMD EPYC-Milan Processor (zen3), Python 3.9.21
See https://gist.github.com/boegelbot/55a22ce3dfe1343de5ff54d08ea26971 for a full test report.

@boegel
Member

boegel commented Oct 22, 2025

problem with easyconfigs test suite reporting that there's one too many checksums will be fixed after merge of:

@boegel
Member

boegel commented Oct 22, 2025

Test report by @boegel
Using easyblocks from PR(s) easybuilders/easybuild-easyblocks#3965
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
node4201.shinx.os - Linux RHEL 9.6 (Plow), x86_64, AMD EPYC 9654 96-Core Processor, Python 3.13.4
See https://gist.github.com/boegel/d499bad837d1b12eee1d1e7df65613cb for a full test report.

…ch_ctypes_ld_library_path in Python easyconfigs
Caspar van Leeuwen and others added 2 commits October 22, 2025 21:42
@boegel boegel removed the change label Oct 22, 2025
@casparvl casparvl dismissed their stale review October 22, 2025 20:43

All requests were incorporated

Member

@boegel boegel left a comment

lgtm

@boegel
Member

boegel commented Oct 22, 2025

Going in, thanks @dagonzalezfo!

@boegel boegel merged commit 08aa521 into easybuilders:develop Oct 22, 2025
8 checks passed
@dagonzalezfo
Contributor Author

Thanks to @casparvl, @ocaisa and @boegel for all the effort 🥇

Labels

bug fix, EESSI (Related to EESSI project)

5 participants