Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL] Hide SYCL service kernels #4519

Merged
merged 10 commits into from
Sep 15, 2021

Conversation

steffenlarsen
Copy link
Contributor

The SYCL runtime may in places benefit from being able to define "service kernels" to use behind-the-scenes work on devices. These kernels may use the same interface as other SYCL kernels, but as such will be handled no different than user-defined kernels.

These changes makes a distinction between service kernels and user-defined kernels by placing service kernels in a common namespace (cl::sycl::detail::__sycl_service_kernel__). The program manager will not grant unique kernel IDs to service kernels. This prevents service kernels from being visible to the user through kernel_bundle related interfaces.

Consequently the runtime may create device images that do not contain any kernel IDs. As such, sycl::has_kernel_bundle will now discount any device images that have no kernel IDs.

The SYCL runtime may in places benefit from being able to define
"service kernels" to use behind-the-scenes work on devices. These
kernels may use the same interface as other SYCL kernels, but as such
will be handled no different than user-defined kernels.

These changes makes a distinction between service kernels and
user-defined kernels by placing service kernels in a common namespace
(`cl::sycl::detail::__sycl_service_kernel__`). The program manager will
not grant unique kernel IDs to service kernels. This prevents service
kernels from being visible to the user through kernel_bundle related
interfaces.

Consequently the runtime may create device images that do not contain
any kernel IDs. As such, `sycl::has_kernel_bundle` will now discount
any device images that have no kernel IDs.

Signed-off-by: Steffen Larsen <[email protected]>
@steffenlarsen steffenlarsen requested a review from a team as a code owner September 8, 2021 14:04
Signed-off-by: Steffen Larsen <[email protected]>
// this purpose. As such service kernels can be identified by
// __sycl_service_kernel__ appearing in the mangled kernel name.
static bool isServiceKernel(const std::string &KernelName) {
return KernelName.find("__sycl_service_kernel__") != std::string::npos;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is quite straightforward approach and will work.
Though, I'm considered on its performance.
Is possible to get notion of mangling and check just the beginning of the string for Linux and ending for Windows?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking the start of the string was the original approach, but it did not consider Windows mangling. I am not very familiar with the Windows mangling scheme, but I worry that even if we could determine which mangling scheme is used, having templated classes would break the assumption about the end of the mangled name.

Copy link
Contributor

@romanovvlad romanovvlad Sep 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we have the same(Itanium) mangling for kernels on both windows and linux.
In any case, if my another suggest is applied, the only place where we check if a kernel is a service one would be addImages function which should be called once per process for a given image. Having said that I think this is not a performance critical peace of code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems, like my yesterday's comment didn't get to post here:
Is it possible to cache this flag?
I believe, using unordered_map will allow for quick access. Also, we can eliminate use of mutexes as the initialization is only needed at boot-strap and there are only read requests during run-time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe, using unordered_map will allow for quick access.

You're thinking adding an unordered_set (I think set would do the trick) where AddImages caches all found service kernel names. Then ProgramManager::getSYCLDeviceImagesWithCompatibleState would check against that rather than calling isServiceKernel?

Also, we can eliminate use of mutexes as the initialization is only needed at boot-strap and there are only read requests during run-time.

Are you referring to m_KernelIDsMutex?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're thinking adding an unordered_set (I think set would do the trick) where AddImages caches all found service kernel names. Then ProgramManager::getSYCLDeviceImagesWithCompatibleState would check against that rather than calling isServiceKernel?

Right. This will do the thing.

Are you referring to m_KernelIDsMutex?

Never, never, never. I meant any mutex required for lazy initialization of cache. With the cache initialized at boot-strap there is no place for lazy initialization and, thus, no need for thread-safety mechanism except for ensuring that only read access takes place during run-time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. This will do the thing.

I have implemented caching of the service kernels found in ProgramManager::AddImages. This allows for faster lookup later. However, one concern I have is that the only time later that service kernels are looked up is in an assert, so in release builds we will never look up anything in the new cache.

Never, never, never. I meant any mutex required for lazy initialization of cache. With the cache initialized at boot-strap there is no place for lazy initialization and, thus, no need for thread-safety mechanism except for ensuring that only read access takes place during run-time.

I think this may apply to m_KernelIDsMutex as well. That is outside the scope of this PR, so I'll make another PR in which we can discuss further.

// this purpose. As such service kernels can be identified by
// __sycl_service_kernel__ appearing in the mangled kernel name.
static bool isServiceKernel(const std::string &KernelName) {
return KernelName.find("__sycl_service_kernel__") != std::string::npos;
Copy link
Contributor

@romanovvlad romanovvlad Sep 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we have the same(Itanium) mangling for kernels on both windows and linux.
In any case, if my another suggest is applied, the only place where we check if a kernel is a service one would be addImages function which should be called once per process for a given image. Having said that I think this is not a performance critical peace of code.

@steffenlarsen steffenlarsen force-pushed the steffen/hide_service_kernels branch from 7314ef8 to b78f7af Compare September 10, 2021 08:28
@bader bader merged commit 49e1e74 into intel:sycl Sep 15, 2021
alexbatashev added a commit to alexbatashev/llvm that referenced this pull request Sep 18, 2021
* upstream/sycl: (36 commits)
  [SYCL] Add SYCL2020 target::device enumeration value (intel#4587)
  [SYCL][Doc] Update ITT instrumentation docs (intel#4503)
  [SYCL][L0] Make all L0 events have device-visibility (intel#4534)
  [SYCL] Updated Level-Zero backend spec according to SYCL 2020 standard (intel#4560)
  [SYCL] Add error_code support for SYCL 1.2.1 exception classes (intel#4574)
  [SYCL][CI] Provide --ci-defaults option for config script (intel#4583)
  [CI] Switch GitHub Actions to Ubuntu 20.04 (intel#4582)
  [SYCL][CUDA] Fix context clearing in PiCuda tests (intel#4483)
  [SYCL] Hide SYCL service kernels (intel#4519)
  [SYCL][L0] Fix mismatched ZE call count (intel#4559)
  [SYCL] Remove function pointers extension (intel#4459)
  [GitHub Actions] Uplift clang version in post-commit validation (intel#4581)
  [SYCL] Ignore usm prefetch dummy flag (intel#4568)
  [SYCL][Group algorithms] Add group sorting algorithms implementation (intel#4439)
  [SYCL] Resolve name clash with a user defined symbol (intel#4570)
  [clang-offload-wrapper] Do not create .tgtimg section with -emit-reg-funcs=0 (intel#4577)
  [SYCL][FPGA] Remove deprecated attribute functionality (intel#4532)
  [SYCL] Remove _class aliases (intel#4465)
  [SYCL][CUDA][HIP] Report every device in its own platform (intel#4571)
  [SYCL][L0] make_device shouldn't need platform as an input (intel#4561)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants