UR_PROGRAM_INFO_KERNEL_NAMES query is going to be difficult for some adapters to implement #537

Open
aarongreig opened this issue May 23, 2023 · 1 comment
Labels
specification Changes or additions to the specification

Comments

@aarongreig
Contributor

There isn't a way for the CUDA or HIP adapters to use their backend APIs to query the names of the entry points in a program object. Comments in the original PI implementation suggested that the IR could be searched for these (see here). This works if the IR is a text format, as PTX is, but if it isn't, or if the program was created from a device binary, that solution looks less likely to be worthwhile.
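As a rough illustration of the "search the IR" idea for text PTX, a minimal sketch might scan for `.entry` directives and take the identifier that follows each one. The function name `findPtxEntryNames` is hypothetical and is not part of any adapter. The sketch assumes the input is plain PTX text, and it does not skip comments or handle every identifier form PTX allows.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not adapter code): collect the identifiers that follow
// `.entry` directives in a PTX text module.
// Assumptions: input is text PTX; comments are not stripped; identifiers are
// limited to letters, digits, '_' and '$'.
std::vector<std::string> findPtxEntryNames(const std::string &ptx) {
  static const std::string directive = ".entry";
  auto isSpace = [](char c) {
    return std::isspace(static_cast<unsigned char>(c)) != 0;
  };
  auto isIdentChar = [](char c) {
    return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_' ||
           c == '$';
  };

  std::vector<std::string> names;
  std::size_t pos = 0;
  while ((pos = ptx.find(directive, pos)) != std::string::npos) {
    pos += directive.size();
    // The directive must be followed by whitespace, otherwise it is only a
    // prefix of some longer token.
    if (pos >= ptx.size() || !isSpace(ptx[pos]))
      continue;
    while (pos < ptx.size() && isSpace(ptx[pos]))
      ++pos;
    // Read the entry point name up to the first non-identifier character.
    const std::size_t start = pos;
    while (pos < ptx.size() && isIdentChar(ptx[pos]))
      ++pos;
    if (pos > start)
      names.push_back(ptx.substr(start, pos - start));
  }
  return names;
}
```

Calling this on a module containing `.visible .entry foo(...)` would yield `foo`. Nothing comparable is available when the program holds a binary image, which is the case described above.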

It's unclear how useful this property actually is to people. The comment above suggests SYCL is using it, but we obviously aren't hitting that sycl::detail::pi::die("getKernelNames not implemented");, so they must be working around this somehow. At the very least, the kernel handling in the CTS will need to be redesigned without the assumption that this query can be used, and we should probably make it optional in the ProgramGetInfo spec/CTS test unless these adapters do come up with a nice way to implement it.

@kbenzie kbenzie added the specification Changes or additions to the specification label May 24, 2023
@aarongreig
Contributor Author

I've found a couple more things CUDA and HIP will struggle with (or possibly just can't implement):

  • UR_PROGRAM_INFO_NUM_KERNELS - for the same reason as KERNEL_NAMES
  • validating argument indexes in the KernelSetArg entry points - the number of arguments a kernel takes is another thing we can't query, and this entry point doesn't map to a CUDA driver call, so we can't rely on translating an error code we get from the backend

The CTS changes mentioned in the original issue are now reflected in #553, but I think there's still spec work to be done to communicate that some things might not be available on all backends. Maybe this needs a new capability query for program object introspection.
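To make the "optional query" idea concrete, here is a small sketch of how a conformance-style check could treat an unsupported introspection query as a skip rather than a failure. Everything in it is a stand-in for illustration: `Result`, `Outcome`, `KernelNamesQuery`, and `checkKernelNames` are hypothetical names, not the real UR result codes, entry points, or CTS helpers.

```cpp
#include <functional>
#include <string>

// Stand-in result codes for illustration only; these are NOT the real UR
// result values.
enum class Result { Success, Unsupported, Error };

// What a test would report for an optional query.
enum class Outcome { Checked, Skipped, Failed };

// Hypothetical query signature: writes the kernel names into `out` on success.
using KernelNamesQuery = std::function<Result(std::string &out)>;

// Treat "unsupported" as a skip, so adapters that cannot introspect program
// objects are not reported as failing; any other error still fails.
Outcome checkKernelNames(const KernelNamesQuery &query,
                         const std::string &expected) {
  std::string names;
  switch (query(names)) {
  case Result::Unsupported:
    return Outcome::Skipped;
  case Result::Success:
    return names == expected ? Outcome::Checked : Outcome::Failed;
  default:
    return Outcome::Failed;
  }
}
```

A dedicated capability query, as suggested above, would let callers find this out up front instead of inferring it from a per-query result. Which of the two the spec should adopt is the open question here.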

@kbenzie kbenzie added the needs-discussion This needs further discussion label Jan 10, 2024
@kbenzie kbenzie removed the needs-discussion This needs further discussion label Apr 9, 2024