WIP: update Python easyblock to use custom ctypes for shared libraries#3798

Closed
dagonzalezfo wants to merge 4 commits into easybuilders:develop from dagonzalezfo:develop

@dagonzalezfo
Contributor

This was first exposed as a patch on easybuilders/easybuild-easyconfigs#23042
After reviewing it, @casparvl suggested that this should be integrated into the easyblock.

@dagonzalezfo dagonzalezfo changed the title Custom ctypes for shared libraries WIP: Custom ctypes for shared libraries Jun 26, 2025
@boegel boegel changed the title WIP: Custom ctypes for shared libraries WIP: update Python easyblock to use custom ctypes for shared libraries Jul 2, 2025
@boegel boegel added this to the release after 5.1.1 milestone Jul 2, 2025
@boegel
Member

boegel commented Jul 2, 2025

@casparvl Can you help shepherd this contribution?

@dagonzalezfo
Contributor Author

Hi, to fix the reported issue with system-provided libraries (reported here: https://gitlab.com/eessi/support/-/issues/154#note_2586123208),
I need to replace this regex: r'\s+(lib%s\.[^\s]+)\s+\(%s' (https://github.com/python/cpython/blob/da79ac9d26860db62500762c95b7ae534638f9a7/Lib/ctypes/util.py#L373)
with this one: r'\s+(lib%s\.[^\s]+)\s+\(%s\)\s+=>\s+(\S+)'.

However, I did not manage to make it work.

I tried to fix it using re.escape. However, the escaped version of the needed regex did not fit into the _findSoname_ldconfig(name) machinery.
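
To make the difference between the two patterns concrete, here is a minimal sketch that applies both to a single line of ldconfig -p style output. The sample line, the abi_type value and the helper name match_ldconfig are assumptions for illustration only; CPython builds its pattern inside _findSoname_ldconfig and works on bytes, which this sketch does not reproduce.

```python
import re

# Hypothetical sample of one `ldconfig -p` output line (not taken from this thread).
SAMPLE = "\tlibSDL2-2.0.so.0 (libc6,x86-64) => /usr/lib64/libSDL2-2.0.so.0\n"

# The two templates quoted above: the first captures only the file name,
# the second also captures the path after "=>".
OLD = r'\s+(lib%s\.[^\s]+)\s+\(%s'
NEW = r'\s+(lib%s\.[^\s]+)\s+\(%s\)\s+=>\s+(\S+)'


def match_ldconfig(template, name, abi_type, text):
    """Fill a template with the library name and ABI type, then search `text`.

    Escaping abi_type is a choice made for this sketch.
    """
    pattern = template % (re.escape(name), re.escape(abi_type))
    return re.search(pattern, text)


old = match_ldconfig(OLD, "SDL2-2.0", "libc6,x86-64", SAMPLE)
new = match_ldconfig(NEW, "SDL2-2.0", "libc6,x86-64", SAMPLE)

soname = old.group(1)      # file name only
full_path = new.group(2)   # path that could be handed to dlopen()
```

Note that the second pattern requires the closing parenthesis directly after the ABI type, so a line carrying extra fields inside the parentheses would not match it.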

Alternatives:

  • keep this change as a patch
  • add a major patch changing the _findSoname_ldconfig function?

I would really appreciate any help/opinion

Best,

Danilo

@dagonzalezfo dagonzalezfo marked this pull request as draft July 9, 2025 15:47
@casparvl
Contributor

@dagonzalezfo so the key problem here is that using a regular expression to replace one regular expression by another is a pain? Who would have thought ;-)

I know that we conventionally use apply_regex_substitutions() as a way to patch things at the EasyBlock level, but... we don't have to do it that way, right? I'm wondering about the alternatives... I guess we can call sed? Or: define the context for a patch file in the python code, write that to a temporary file, then call patch (or some easybuild equivalent: framework has a patch step, so should already have this functionality somewhere, I guess).
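
One more alternative, sketched below under assumptions: treat the old regex as a plain string and swap it with str.replace, so no regular expression is involved in the patching itself. The helper names and the one-line SOURCE stand-in are hypothetical, and this is not how apply_regex_substitutions() works. The second helper shows what is needed if re.sub is used anyway: on Python 3.7 and later, a replacement string containing \s is rejected as a bad escape, so the replacement is passed as a function.

```python
import re

# The regex literal quoted earlier in this thread, and the proposed replacement.
OLD_LITERAL = r"r'\s+(lib%s\.[^\s]+)\s+\(%s'"
NEW_LITERAL = r"r'\s+(lib%s\.[^\s]+)\s+\(%s\)\s+=>\s+(\S+)'"


def replace_literal(source, old, new):
    """Swap exactly one occurrence of `old` for `new`, with no regex involved."""
    if source.count(old) != 1:
        raise ValueError("expected exactly one occurrence of the old text")
    return source.replace(old, new)


def replace_via_regex(source, old, new):
    """Same result through re.sub: escape the pattern, and pass the replacement
    as a function so its backslashes are not interpreted as escapes."""
    return re.sub(re.escape(old), lambda _match: new, source, count=1)


# Hypothetical one-line stand-in for the relevant line of ctypes/util.py.
SOURCE = "            regex = " + OLD_LITERAL + "\n"
patched = replace_literal(SOURCE, OLD_LITERAL, NEW_LITERAL)
```

The count check in replace_literal is there so that a changed upstream source makes the patch fail loudly instead of silently doing nothing.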

I'm also trying to understand what fundamentally goes wrong here:

OSError: /cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/usr/lib/../lib64/libc.so: invalid ELF header

What were you trying to do/run? I see that this file is actually not a shared library:

$ file /cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/usr/lib64/libc.so
/cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/usr/lib64/libc.so: ASCII text

But the same is true for a regular libc.so, so it's no surprise:

$ file /usr/lib64/libc.so
/usr/lib64/libc.so: ASCII text

But something must be calling it with the assumption that it is an ELF file?
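
For reference, a minimal sketch of how a caller could tell a real shared object from a text linker script such as libc.so before trying to dlopen() it: check the four-byte ELF magic at the start of the file. The helper name is_elf and the temporary files are assumptions for illustration; the linker-script text is a made-up stand-in, not the contents of the file above.

```python
import os
import tempfile

ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF file


def is_elf(path):
    """Return True if the file at `path` starts with the ELF magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(4) == ELF_MAGIC


with tempfile.TemporaryDirectory() as tmpdir:
    # Hypothetical stand-in for a text linker script like libc.so.
    script = os.path.join(tmpdir, "libc.so")
    with open(script, "w") as handle:
        handle.write("/* GNU ld script */\nGROUP ( libc.so.6 )\n")

    # Hypothetical stand-in for a real shared object: magic plus padding.
    binary = os.path.join(tmpdir, "libc.so.6")
    with open(binary, "wb") as handle:
        handle.write(ELF_MAGIC + b"\x00" * 12)

    script_is_elf = is_elf(script)
    binary_is_elf = is_elf(binary)
```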

orig_support_marker=r'if _sys.platform.startswith\(\"aix\"\):'
updated_support_marker="""
if _os.name == "posix":
if name and name.endswith(".so"):
Contributor

This may be too strict. Someone might try to load a versioned library, e.g. libSDL2-2.0.so.0. That still fails, because we never enter this 'if'.

What case are you really trying to exclude here?

I did notice that I couldn't really find a find_library call that would work for libSDL2-2.0 either (e.g. python -c "import ctypes; print(ctypes.util.find_library('SDL2-2.0'))" returns None), maybe that's the case you were trying to exclude?

Contributor Author

The code should mimic this behaviour:
LD_LIBRARY_PATH=$LIBRARY_PATH python -c "import ctypes; print(ctypes.CDLL('libSDL2-2.0.so.0'))"
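
In the command above, the dynamic loader does the searching through the directories listed in $LIBRARY_PATH. A rough sketch of the same lookup done in Python, assuming a hypothetical helper find_in_search_path that is not part of ctypes or of this PR:

```python
import os


def find_in_search_path(filename, search_path):
    """Return the first existing <dir>/<filename> for the directories in the
    os.pathsep-separated `search_path`, or None if there is no such file."""
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue  # skip empty entries such as a trailing ':'
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            return candidate
    return None
```

A call would look like find_in_search_path('libSDL2-2.0.so.0', os.environ.get('LIBRARY_PATH', '')), with the result (if any) passed to ctypes.CDLL as a full path.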

Contributor Author

@dagonzalezfo dagonzalezfo left a comment

Go back to the patch that should cover the use cases mentioned in the EESSI ticket.
In the end, any call should return a full path that can be loaded by dlopen.

@casparvl
Contributor

casparvl commented Jul 22, 2025

Writing down some more things to help me understand what's going on, after some of it became clearer in a call today with @dagonzalezfo .

We need to make sure that libraries from both the software-layer and compatibility-layer are found correctly, for calls to ctypes.util.find_library(), ctypes.CDLL() and ctypes.cdll.LoadLibrary() (which essentially calls ctypes.CDLL()). Breaking this down one by one:

  • ctypes.util.find_library() first calls _findSoname_ldconfig. This is how it will find libraries on the standard search paths, but for this to work, it needs to use the ldconfig from the sysroot (if a sysroot was configured). That was patched here in the EasyBlock. Essentially, it then grabs the library name (just the name, not full path) from the output of <sysroot>/sbin/ldconfig -p.

  • ctypes.util.find_library() then calls _findLib_gcc, which calls e.g. gcc -Wl,-t -o /dev/null -lSDL2 to find the SDL2 library:

$ gcc -Wl,-t -o /dev/null -lSDL2
...
/cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen4/software/SDL2/2.28.5-GCCcore-13.2.0/lib/../lib64/libSDL2.so
...

The output includes the full path to the library, as shown above. But: standard ctypes.util.find_library() wraps this in a _get_soname() call, to strip the path and only return the so-name (on Linux). A previous patch by @ocaisa changed this behaviour to return the full path, as the result of find_library() is often passed to a call that then loads the library (like ctypes.CDLL, but other methods may be used, like cffi).

  • ctypes.util.find_library() then calls _findLib_ld, which loops through the entries in LD_LIBRARY_PATH, each time calling ld -t -L <dir_from_ld_library_path> -o /dev/null -lSDL2. I.e. it would look something like:
$ ld -t -L /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen4/software/SDL2/2.28.5-GCCcore-13.2.0/lib -o /dev/null -lSDL2
/cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen4/software/SDL2/2.28.5-GCCcore-13.2.0/lib/libSDL2.so

Again, this is wrapped in a _get_soname() by default, to strip the path and return only the so-name (on Linux). Again, this behavior was changed to return the full path by @ocaisa's previous patch.
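
The gcc and ld steps above both come down to picking the library path out of the linker's trace output. A minimal sketch of that parsing step, using a pattern modelled on the one in CPython's _findLib_gcc; the helper name candidates_from_trace and the two extra trace lines are assumptions for illustration. The last line contrasts the full path with the bare file name (a simplification: the real _get_soname reads the SONAME from the file rather than taking the base name).

```python
import os
import re

SDL2_PATH = ("/cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/"
             "amd/zen4/software/SDL2/2.28.5-GCCcore-13.2.0/lib/../lib64/libSDL2.so")

# Linker trace as in the example above; the first and last lines are hypothetical.
TRACE = "\n".join([
    "/usr/lib64/crt1.o",
    SDL2_PATH,
    "/usr/lib64/libc.so.6",
])


def candidates_from_trace(trace, name):
    """Return every whitespace-free token in `trace` that contains lib<name>."""
    pattern = r'[^\(\)\s]*lib%s\.[^\(\)\s]*' % re.escape(name)
    return re.findall(pattern, trace)


found = candidates_from_trace(TRACE, "SDL2")
file_name_only = os.path.basename(found[0])
```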

This, however, was not enough, because if someone calls ctypes.CDLL directly (by library name, not full path), it would still fail. That's where this PR comes in.

What's needed in order to make that work, is that if ctypes.CDLL (or ctypes.cdll.LoadLibrary()) is called by library name, this gets resolved to the full library path before dlopen() is called.
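
As a rough sketch of that resolution step (not the actual patch to ctypes.CDLL.__init__): names that already contain a path separator are left alone, bare names are passed through a resolver, and the original name is kept when nothing is found so that dlopen() can still try its default search. The helper resolve_before_load, the find_full_path parameter and the KNOWN mapping are hypothetical; in the real code the resolver would be the patched find_library().

```python
import os


def resolve_before_load(name, find_full_path):
    """Return a full path for a bare library name when the resolver finds one.

    `name` may be None (ctypes.CDLL accepts that) or already a path; both are
    returned unchanged.
    """
    if not name or os.sep in name:
        return name
    return find_full_path(name) or name


# Hypothetical lookup table standing in for a real resolver.
KNOWN = {"libSDL2-2.0.so.0": "/opt/example/lib/libSDL2-2.0.so.0"}

resolved = resolve_before_load("libSDL2-2.0.so.0", KNOWN.get)
```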

I see two options:

  1. We keep the previous patch that made sure find_library() returns the full library path, and then use that function in the ctypes.CDLL __init__ to resolve self._name to a full path.
  2. We implement our own functionality (mimicking what find_library() does), but remove the previous patch.

The previous patch, strictly speaking, altered the documented behavior of find_library() here, which states that on Linux the library name (and not the full path) is returned. On the other hand: the documented behavior on other OSes states that the full path is returned, so this reduces the chances that codes make hard assumptions on just the filename being returned. We need to weigh the downside of deviating from the documented behavior against the downside of not doing anything. The latter is guaranteed to give issues when codes invoke ctypes.util.find_library() (and potentially pass the returned value to ctypes.CDLL()), which we believe to be the most common use case. Another concern is that even if we patch ctypes.CDLL(), users might invoke ctypes.util.find_library() and use the result in another way to open a library (e.g. through cffi, or maybe even by directly invoking a dlopen call). This would then require also patching these approaches. All in all, I believe that option 1 has the fewest downsides, and is thus the way to go.

In order to call find_library(), one challenge is that ctypes.CDLL typically receives a full filename (e.g. libSDL2-2.0.so.0) as argument, while find_library only takes the library name. But the library name is not always easy to extract from the filename, nor can _findLib_gcc and _findLib_ld always resolve it. E.g. while find_library('SDL2') works (and returns the full path to libSDL2.so), there is no call to _findLib_gcc or _findLib_ld that will return the full path to libSDL2-2.0.so.0 (even though passing that to ctypes.CDLL is completely valid). One solution here may be to alter the syntax with which _findLib_gcc and/or _findLib_ld are invoked. E.g. _findLib_gcc(name) currently does:

gcc -Wl,-t -o /dev/null -l<name>

where name is the bare library name (i.e. without the lib prefix, and without .so.X). But, you can also ask gcc to use the full library filename:

gcc -Wl,-t -o /dev/null -l:<fullname>

where fullname is e.g. libSDL2-2.0.so.0. I can confirm that this approach would work:

$ gcc -Wl,-t -o /dev/null -l:libSDL2-2.0.so.0 | grep SDL2
...
/cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen4/software/SDL2/2.28.5-GCCcore-13.2.0/lib/../lib64/libSDL2-2.0.so.0

The safest way to implement this is probably to just add this as possible search method(s) to find_library. I.e. change this return statement into

            return _findSoname_ldconfig(name) or \
                   _findLib_gcc(name) or _findLib_ld(name) or \
                   _findLib_gcc_fullname(name) or _findLib_ld_fullname(name)

And then implement _findLib_gcc_fullname and _findLib_ld_fullname identically to their existing counterparts, with the exception of the -l argument, which should then be -l:{fullname}.
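
A minimal sketch of the two pieces involved, under assumptions: how the gcc command line differs between the two lookups, and how the "or" chain in the return statement falls through to the next finder. The helper names gcc_trace_command and first_match are hypothetical, and the command follows the examples in this comment rather than the exact arguments CPython uses internally.

```python
def gcc_trace_command(name, fullname=False, compiler="gcc"):
    """Build the gcc command used to trace which file the linker picks.

    With fullname=False, `name` is a library name such as 'SDL2' (-lSDL2).
    With fullname=True, `name` is a file name such as 'libSDL2-2.0.so.0'
    and is passed as -l:<name>.
    """
    link_flag = ("-l:" if fullname else "-l") + name
    return [compiler, "-Wl,-t", "-o", "/dev/null", link_flag]


def first_match(name, finders):
    """Return the first truthy result from `finders`, mirroring `a(name) or b(name)`."""
    for finder in finders:
        result = finder(name)
        if result:
            return result
    return None


by_name = gcc_trace_command("SDL2")
by_fullname = gcc_trace_command("libSDL2-2.0.so.0", fullname=True)
```

Appending the fullname finders at the end of the chain means the existing lookups keep their current results and the new ones only run when everything else has failed.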

I think with that, we can remove the regular expressions that are currently needed to strip the lib and .so from the self._name in the ctypes.CDLL.__init__(). That simplifies the code, makes it less error-prone (as we can't do any 'incorrect' stripping for strange library filenames) and covers the libSDL2-2.0.so.0 use case, which would otherwise be impossible to cover.
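
To illustrate why the stripping is fragile, here is a sketch with a naive stripping function. This is a hypothetical stand-in, not the regular expressions used in the PR. For libSDL2-2.0.so.0 it yields SDL2-2.0, and as noted in the review comment above, find_library('SDL2-2.0') returns None.

```python
import re


def naive_library_name(filename):
    """Strip a leading 'lib' and a trailing '.so' plus numeric version suffix."""
    name = re.sub(r'^lib', '', filename)
    return re.sub(r'\.so(\.\d+)*$', '', name)


plain = naive_library_name("libSDL2.so")            # a name that -l can resolve
versioned = naive_library_name("libSDL2-2.0.so.0")  # a name that -l cannot resolve
```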

@dagonzalezfo dagonzalezfo reopened this Jul 25, 2025
@dagonzalezfo
Contributor Author

Changes implemented easybuilders/easybuild-easyconfigs#23499
Please use it with the modified easyblock

@dagonzalezfo
Contributor Author

dagonzalezfo commented Jul 30, 2025

3 participants