add patch to Python easyconfigs to fix ctypes when $LD_LIBRARY_PATH is not being set#23499
Conversation
relies on prefix provided ldconfig
easybuild/easyconfigs/p/Python/Python-3.11.5-custom-ctypes.patch
Outdated
Show resolved
Hide resolved
easybuild/easyconfigs/p/Python/Python-3.11.5-custom-ctypes.patch
Outdated
Show resolved
Hide resolved
|
@dagonzalezfo I'm putting some reminders, also for myself, that we should ensure to put in some descriptive strings of what we do where. For now, they help me to understand what's going on - but you don't have to implement them into this patch. The patch is just here so we can test the functionality of the change :) |
easybuild/easyconfigs/p/Python/Python-3.11.5-custom-ctypes.patch
Outdated
Show resolved
Hide resolved
|
FYI: I want to test this a bit more to make sure it also still works for In the meantime, I created this test script: I may add some more libraries later, but I think this already covers a fair range of cases: from libs in the compat layer, to non-versioned libnames, to versioned libnames, to non-existing libnames. All seem to produce the expected result (with the regex change I suggested in my review). |
There was a problem hiding this comment.
With the addition of easybuilders/easybuild-easyblocks#3860 I think there are only a few things left:
-
Accept my PR to your feature branch dagonzalezfo#1 (if you agree, of course)
-
The change of the regex on line 35 to
expr = os.fsencode(r'^[^\(\)\s]*%s[^\(\)\s]*' % re.escape(name))
-
The change of overwriting self._name only if util.find_library() doesn't return None
-
Make sure the sysroot doesn't appear in the patch anymore, otherwise it will not apply (patch will apply before the easyconfig changes this sysroot thing).
-
Add some more comments in comments in the code. I.e. this would be lines that are added by the patch file into the final code. I think some of it is useful, non-trivial, etc. You can use my own comments on those sections of the code for inspiration.
And then, I will give it a final test drive (after rebuilding python with --include-easyblocks-from-pr 3860 of course)
…tches Small change to use the new EasyBlock config items
…me, remove sysroot info
|
Hi @casparvl, I included all the requested changes and test it locally and it seems to work correctly. Do you mind to confirm? Best, |
There was a problem hiding this comment.
Hmmm, two things:
I think the regex is a bit too generic: we're now assuming that any path containing stubs in the name can be filtered out. I think we should check that it also contains CUDA with a case insensitive match. You can check with a regular expression pattern = re.compile(r'cuda.*stubs', re.IGNORECASE) for example
If I do find_library('cuda'), it will actually find the library through findLib_gcc, and not findLib_ld. Since this (by design) will check which library would have been compiled against, it will return the stubs library. That is, essentially, by design of the find_library call.
The second can be fixed by two small changes to the code. The last lines of the _findLib_gcc(name) should be replaced with:
file = os.fsdecode(file)
if re.search(r'cuda.*stubs', file, re.IGNORECASE):
continue
return file
Similarly, the last lines of _findLib_gcc_fullname should be
if re.search(r'cuda.*stubs', file, re.IGNORECASE):
continue
return file
So that if the stubs lib is picked up, it is just ignored and the search continues.
In the end, there is one issue: it does not pick up on the driver's libcuda.so.1 either. Normally, this would have been caught by the _findSonmame_ldconfig:
$ /sbin/ldconfig -p | grep libcuda.so
libcuda.so.1 (libc6,x86-64) => /lib64/libcuda.so.1
libcuda.so (libc6,x86-64) => /lib64/libcuda.so
But, since we are running with a sysroot, we are calling <sysyroot>/sbin/ldconfig instead. And that one doesn't know about /lib64/libcuda.so.
Note that for calls to ctypes.CDLL('libcuda.so'), this is actually not an issue: they will see that find_library(...) returns None, and thus call dlopen('libcuda.so') which (correctly!) opens the driver from the default search path.
So... we are almost at identical results as the regular ctypes with LD_LIBRARY_PATH, with find_library('cuda') being the one exception.
I Can come up with two solutions:
Add yet another function to this return statement:
return _findSoname_ldconfig(name) or \
_findLib_gcc(name) or _findLib_ld(name) or \
_findLib_gcc_fullname(name) or \
_findLib_host_ldconfig(name)
where _findLib_host_ldconfig(name) then uses the host's ldconfig from /sbin/ldconfig to locate libraries. The tricky thing is that this provides a fallback for any library that is not found in the compatiblity layer. That may lead to (unexpected) differences in behaviour of EESSI accross systems, as codes that try to load libfoo may work on some (because the host provides one) and not on others (because the compat layer and software layer don't).
Handle the cuda case explicitely in _findSoname_ldconfig. I.e. something like
if name == 'cuda':
ldconfig = '/sbin/ldconfig'
else:
ld_config = '/cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/sbin/ldconfig'
...
with subprocess.Popen([ldconfig, '-p'],
...
Note that this is a bit tricky to implement: it is only needed if a sysroot is set, which is something we don't know at the patch/easyconfig level, but only at the easyblock level. I.e. this change should not be part of the patch, but part of the easyblock logic that currently replaces the ldconfig path
What do you think @dagonzalezfo ?
|
Hi @casparvl , Thanks for having a look on it. About libcuda.so, it is not exposed using a sort of Do you mind to clarify? |
|
I just realize that maybe defining a new function for Option 1: Option 2: Any opinion? |
Very good point actually, I did not think about that. What I don't know is if that's on anyone's path. We may need @ocaisa to clarify this, but I guess the whole reason for symlinking the drivers was to expose them to the linker from the prefix? I have to admit, I tested this on a CPU node which does have the driver installed (a bit weird, but ok). I figured not finding the |
Hmm, I'm not sure I understand what you mean here. Would you like to make similar behavior to I think that surpasses the 'normal' behavior. I.e. even without RPATH/sysroot, |
$LD_LIBRARY_PATH is not being set
|
Patch applies to all Python versions being touched here, I've checked that manually. That's important to check, since the patch specified in Patch applies cleanly for Python 3.11.x, and with minor offsets for other versions, like:
Also still applies for Python 3.14.0: DetailsWe may need to rework the patch file at some point, but we're good for now... |
|
Test report by @casparvl edit: on top of EESSI 2025.06 (see also EESSI/software-layer#1254) |
|
Test report by @casparvl edit: on top of EESSI 2025.06 (see also EESSI/software-layer#1238 + EESSI/software-layer#1252) |
|
Test report by @boegel edit: on top of EESSI 2025.06 (see also EESSI/software-layer#1254) |
|
@boegelbot please test @ jsc-zen3 |
|
@boegel: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de PR test command '
Test results coming soon (I hope)... Details- notification for comment with ID 3431844868 processed Message to humans: this is just bookkeeping information for me, |
|
Test report by @boegel edit: on top of EESSI 2023.06 |
|
Test report by @boegel edit: on top of EESSI 2023.06 |
|
Test report by @boegelbot |
|
problem with easyconfigs test suite reporting that there's one too many checksums will be fixed after merge of: |
|
Test report by @boegel |
…ch_ctypes_ld_library_path in Python easyconfigs
…'checksums'] actually returns a copy, not a reference
|
Going in, thanks @dagonzalezfo! |
Adding patch to provide support for library resolution on python
Edit by @casparvl: depends on:
$LD_LIBRARY_PATHeasybuild-easyblocks#3860