Initialize calling convention and abi info in compile_cuda, not CUDAFlag
#782
base: main
gmarkall
left a comment
This fixes the issue in compile_subroutine(), but I want to try and hold off merging it without a proper fix, as it turns a loud failure into a silent miscompilation.
BaseContext.call_internalcompile_cuda, not CUDAFlag
    # is compiled with numba-abi. So cres should have CUDACallConv.
    assert isinstance(cres.fndesc.call_conv, CUDACallConv)

    _, result = context.call_internal_no_propagate(
Because the wrapper function is a C ABI function, the return status of lower-level calls cannot be propagated to the top level. Hence we use `call_internal_no_propagate`.
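To illustrate the point about propagation, here is a toy model in plain Python, not the actual lowering code. It assumes the inner function follows a status-returning convention (returning a status alongside the value), while the C ABI wrapper can only return the value. The names `inner_numba_abi` and `wrapper_c_abi` are hypothetical and exist only for this sketch.

```python
OK, ERROR = 0, 1


def inner_numba_abi(x):
    """Toy callee: returns (status, value), like a status-returning ABI."""
    if x < 0:
        return ERROR, None
    return OK, x * 2


def wrapper_c_abi(x):
    """Toy C ABI wrapper: its only return channel is the value itself."""
    # The status is received but cannot be forwarded to the caller,
    # which mirrors discarding it with `_` in the diff above.
    _status, value = inner_numba_abi(x)
    return value


print(wrapper_c_abi(3))
print(wrapper_c_abi(-1))
```

In this model the caller of `wrapper_c_abi` has no way to see `ERROR`, which is why a propagating call would have nowhere to send the status.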
PR #717 makes the calling convention a configurable object in the target context. In that PR, we attempted to initialize the calling convention inside `CUDAFlag`. However, `CUDAFlag` is also a global config object for codegen, which contradicts our intention that the calling convention be set per function. In this PR, we move the initialization of calling conventions to `compile_cuda`, which is meant to be the compile entry point of a CUDA function.
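The design concern can be sketched with a small, self-contained model. This is not the numba-cuda implementation: `CallConv`, `SharedFlags`, `compile_with_shared_flags`, and `compile_per_function` are hypothetical stand-ins, assumed only to show how state on a shared config object can leak between compilations, while choosing the convention at the compile entry point keeps it local to each function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CallConv:
    """Hypothetical calling-convention marker."""
    name: str


NUMBA_ABI = CallConv("numba")
C_ABI = CallConv("c")


@dataclass
class SharedFlags:
    """Hypothetical shared codegen config that also holds the convention."""
    call_conv: CallConv = NUMBA_ABI


def compile_with_shared_flags(flags, func_name):
    # The convention comes from whatever the shared object currently holds.
    return (func_name, flags.call_conv.name)


def compile_per_function(func_name, abi="numba"):
    # The convention is decided at the entry point, once per function.
    call_conv = C_ABI if abi == "c" else NUMBA_ABI
    return (func_name, call_conv.name)


shared = SharedFlags()
first = compile_with_shared_flags(shared, "kernel_a")
shared.call_conv = C_ABI  # changed for one function...
second = compile_with_shared_flags(shared, "kernel_b")  # ...but seen by the next
print(first, second)
print(compile_per_function("kernel_b"))
```

In this sketch `kernel_b` picks up the C ABI only because the shared object was mutated earlier; the per-function variant is unaffected by prior compilations.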