Merged
Conversation
When calling a device function that itself called an external function implemented inside linkable code, the link would fail because the linkable code was never added to the link for the calling kernel. This commit fixes the issue by adding the linked library's list of linking files to the current code library when it is linked - this allows the external code dependencies of all called functions to propagate up to the linking dependencies of the outermost function.
This introduces a new code library, an `ExternalCodeLibrary`. It can only hold linking dependencies, so it is not possible to add IR modules to it or get functions or generated code out of it. It is also not serializable, like other code libraries - this could be added, but has not been done at this point because serializing code libraries with external dependencies is already unsupported. When declaring an external function with `declare_device()`, any linked libraries are added to a new `ExternalCodeLibrary` object, which is then added to the list of libraries for the function in the target context. Core Numba functionality then ensures that the external code library is added to the linking dependencies of any callers. With linking dependencies now being tracked through code libraries, there is no need for the `Dispatcher` to try and pull together a list of items to link to a kernel. The implementation of this process was already broken, as it failed to see through functions implemented using the high-level extension API (`@overload`, etc.). It is now removed.
We reduce the number of interfaces here (`declare_device_function` vs `declare_device_function_template`) and use the standard `make_concrete_template` function to generate a typing template instead of inventing our own `device_function_template` class for the typing. The `link` property of an `ExternFunction` is no longer needed - it was previously required when the `Dispatcher` had to determine the list of objects to link; it is now removed.
isVoid
reviewed
May 8, 2025
This was referenced May 8, 2025
isVoid
reviewed
May 8, 2025
Contributor
Author
|
@isVoid Thanks for the review - I've made a couple of responses to the comments - I am not sure I can see any changes I can make at present, but open to further suggestions. |
isVoid
approved these changes
May 9, 2025
Merged
gmarkall
pushed a commit
that referenced
this pull request
May 14, 2025
There wasn't a test for bfloat16 bindings inside device functions, because `LinkableCode` objects from nested function calls were not carried into the final compile result. #243 fixed this feature gap - a test is added in this PR to follow this. Co-authored-by: isVoid <isVoid@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The issue fixed by this PR is that in general, when a called function makes use of external code (e.g. through a
LinkableCodeobject), the required external code may not have been added to the link. The mechanism for constructing the list of items to link, which was to generate all the code then try to figure out what to link after the fact in theDispatcher(using theget_cres_link_objects()function) was broken, and possibly fundamentally flawed.The approach here is to instead keep track of all linked objects through code libraries - whenever code libraries are linked, the files they link are added to the link for the current code library.
Some refactoring and documentation of
declare_device_function()is added. This is aimed at making it more understandable; prior to this PR I didn't really quite know how or why typing and lowering for external functions worked.Additional tests for calling external code through a device function and an overload are added.
More details are in individual commit messages (the commits are incremental / self-contained) - see them for further details.