Fixes for linking external code #243

Merged
gmarkall merged 4 commits into NVIDIA:main from gmarkall:fix-linking-code-3
May 9, 2025

gmarkall commented May 8, 2025

This PR fixes a general problem: when a called function makes use of external code (e.g. through a `LinkableCode` object), the required external code may not have been added to the link. The previous mechanism for constructing the list of items to link was to generate all the code and then try to figure out what to link after the fact in the `Dispatcher` (using the `get_cres_link_objects()` function). That mechanism was broken, and possibly fundamentally flawed.

The approach here is instead to keep track of all linked objects through code libraries: whenever one code library is linked into another, the files it links are added to the link for the current code library.
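A minimal sketch of that propagation idea, in plain Python so it runs without a GPU. The `CodeLibrary` class and the file name `ext_fn.cu` here are hypothetical stand-ins for illustration, not the real Numba classes or any file in this PR:

```python
class CodeLibrary:
    """Toy stand-in for a code library (hypothetical, not Numba's class)."""

    def __init__(self, name):
        self.name = name
        self.linking_files = set()

    def add_linking_file(self, path):
        # Record an external file this library needs at link time.
        self.linking_files.add(path)

    def add_linking_library(self, other):
        # When another library is linked in, inherit its linking files
        # so that dependencies propagate up to the caller.
        self.linking_files |= other.linking_files


# External function implemented in linkable code.
ext = CodeLibrary("ext_fn")
ext.add_linking_file("ext_fn.cu")

# Device function that calls the external function.
device_fn = CodeLibrary("device_fn")
device_fn.add_linking_library(ext)

# Kernel that calls the device function.
kernel = CodeLibrary("kernel")
kernel.add_linking_library(device_fn)

print(sorted(kernel.linking_files))  # ['ext_fn.cu']
```

In this toy model the kernel never mentions `ext_fn.cu` directly; it picks the file up because each library copies the linking files of whatever it links, in callee-before-caller order.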

`declare_device_function()` is also refactored and documented, with the aim of making it more understandable; prior to this PR I didn't really know how or why typing and lowering for external functions worked.

Additional tests for calling external code through a device function and an overload are added.

More details are in the individual commit messages; the commits are incremental and self-contained.

gmarkall added 4 commits May 8, 2025 10:41
When calling a device function that itself called an external function
implemented inside linkable code, the link would fail because the
linkable code was never added to the link for the calling kernel.

This commit fixes the issue by adding the linked library's list of
linking files to the current code library when it is linked - this
allows the external code dependencies of all called functions to
propagate up to the linking dependencies of the outermost function.

This introduces a new code library, an `ExternalCodeLibrary`. It can
only hold linking dependencies, so it is not possible to add IR modules
to it or get functions or generated code out of it.

Unlike other code libraries, it is also not serializable - this could be
added, but has not been done at this point because serializing code
libraries with external dependencies is already unsupported.

When declaring an external function with `declare_device()`, any linked
libraries are added to a new `ExternalCodeLibrary` object, which is then
added to the list of libraries for the function in the target context.
Core Numba functionality then ensures that the external code library is
added to the linking dependencies of any callers.
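The flow described above can be sketched in plain Python. Everything here is an assumption made for illustration: the toy `ExternalCodeLibrary`, `ToyTargetContext`, `toy_declare_device()` and `files_needed_by_caller()` are hypothetical names that only mimic the behaviour described in the commit message, not the actual Numba implementation:

```python
class ExternalCodeLibrary:
    """Toy library that holds linking dependencies only (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.linking_files = set()

    def add_linking_file(self, path):
        self.linking_files.add(path)

    def add_ir_module(self, module):
        # No IR may be added to an external code library.
        raise NotImplementedError("external code library holds no IR")

    def get_asm_str(self):
        # No generated code can be retrieved from it either.
        raise NotImplementedError("external code library has no generated code")


class ToyTargetContext:
    """Maps a function name to the libraries its callers must link."""

    def __init__(self):
        self.libs_for = {}

    def add_libs(self, fn_name, libs):
        self.libs_for.setdefault(fn_name, []).extend(libs)


def toy_declare_device(ctx, name, link=()):
    # Wrap the linked files in an external code library and register it
    # against the declared function in the (toy) target context.
    lib = ExternalCodeLibrary(name)
    for path in link:
        lib.add_linking_file(path)
    ctx.add_libs(name, [lib])
    return name


def files_needed_by_caller(ctx, callees):
    # A caller inherits the linking files of every callee's libraries.
    files = set()
    for callee in callees:
        for lib in ctx.libs_for.get(callee, []):
            files |= lib.linking_files
    return files


ctx = ToyTargetContext()
toy_declare_device(ctx, "ext_fn", link=["ext_fn.cu"])
needed = files_needed_by_caller(ctx, ["ext_fn"])
```

The point of the restricted class in this sketch is that an external library contributes nothing but files to link, so callers can treat it like any other library dependency without special handling.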

With linking dependencies now being tracked through code libraries,
there is no need for the `Dispatcher` to try and pull together a list of
items to link to a kernel. The implementation of this process was
already broken, as it failed to see through functions implemented using
the high-level extension API (`@overload`, etc.). It is now removed.

We reduce the number of interfaces here (`declare_device_function` vs
`declare_device_function_template`) and use the standard
`make_concrete_template` function to generate a typing template instead
of inventing our own `device_function_template` class for the typing.

The `link` property of an `ExternFunction` is no longer needed - it was
previously required when the `Dispatcher` had to determine the list of
objects to link; it is now removed.
gmarkall requested a review from isVoid May 8, 2025 10:12
gmarkall added the "3 - Ready for Review" (Ready for review by team) label May 8, 2025
gmarkall added the "4 - Waiting on author" (Waiting for author to respond to review) label and removed the "3 - Ready for Review" label May 9, 2025

gmarkall commented May 9, 2025

@isVoid Thanks for the review. I've responded to a couple of the comments. I can't see any changes I can make at present, but I'm open to further suggestions.

gmarkall added the "4 - Waiting on reviewer" (Waiting for reviewer to respond to author) label and removed the "4 - Waiting on author" label May 9, 2025
gmarkall added the "5 - Ready to merge" (Testing and reviews complete, ready to merge) label and removed the "4 - Waiting on reviewer" label May 9, 2025
gmarkall merged commit de8d92f into NVIDIA:main May 9, 2025
37 checks passed
gmarkall pushed a commit that referenced this pull request May 14, 2025
There was no test for bfloat16 bindings inside device functions, because `LinkableCode` objects from nested function calls were not carried into the final compile result. #243 fixed this feature gap; this PR adds a test to follow up.

Co-authored-by: isVoid <isVoid@users.noreply.github.com>