
[SYCL][CUDA] Fix cupti library dynamic loading #17272

Merged
kbenzie merged 1 commit into intel:sycl from npmiller:fix-cupti-loading
Mar 4, 2025


@npmiller npmiller commented Mar 3, 2025

This patch fixes loading the cupti library on Linux to rely on the dynamic linker to figure out where the library is.

It also disables cupti tracing on Windows. As far as I know, tracing is only used on Linux at the moment, since the tracing tools don't support Windows. Additionally, loading the library just by name on Windows raises some potential security issues, so we leave Windows support out for now.

This follows up on the discussions in oneapi-src/unified-runtime#1070. Statically linking the cupti library was also ruled out, as it makes the adapter library grow from around 600KiB to around 32MiB.
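For readers unfamiliar with the approach, here is a minimal sketch of what "rely on the dynamic linker" can look like. It is not the code from this patch: the helper names `loadTracingLibrary` and `lookupSymbol` are hypothetical, and the soname `libcupti.so` used in the usage note is an assumption. The idea is that passing `dlopen` a bare name (no `/`) makes the dynamic linker search its usual locations (`LD_LIBRARY_PATH`, rpath/runpath, the ld.so cache, default directories) instead of a path fixed at build time. On non-Linux platforms the sketch simply reports that no library is available, mirroring the decision to leave Windows out.

```cpp
#include <cassert>
#include <cstdio>

#if defined(__linux__)
#include <dlfcn.h>
#endif

// Hypothetical helper (not from the patch): load a library by bare soname.
// Returns a handle on success, or nullptr if loading fails or the platform
// is unsupported, in which case the caller should keep tracing disabled.
void *loadTracingLibrary(const char *soname) {
  assert(soname != nullptr);
#if defined(__linux__)
  // A name without '/' is resolved by the dynamic linker's search rules.
  void *handle = dlopen(soname, RTLD_LAZY | RTLD_LOCAL);
  if (handle == nullptr) {
    const char *err = dlerror();
    std::fprintf(stderr, "tracing disabled: %s\n", err ? err : "unknown error");
  }
  return handle;
#else
  // Assumption for this sketch: no by-name loading outside Linux.
  (void)soname;
  return nullptr;
#endif
}

// Hypothetical helper: look up a symbol in a handle returned above.
// Returns nullptr when the handle is null or the symbol is missing.
void *lookupSymbol(void *handle, const char *symbol) {
#if defined(__linux__)
  return handle != nullptr ? dlsym(handle, symbol) : nullptr;
#else
  (void)handle;
  (void)symbol;
  return nullptr;
#endif
}
```

A caller would try something like `loadTracingLibrary("libcupti.so")` once at startup and, if the result is null, skip tracing setup entirely rather than fail. On older glibc versions this may need linking with `-ldl`.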

@npmiller npmiller requested a review from a team as a code owner March 3, 2025 14:06
@npmiller npmiller requested a review from ldrumm March 3, 2025 14:06

npmiller commented Mar 4, 2025

@intel/llvm-gatekeepers this is ready to merge. The test failures are unrelated to this patch: the Nvidia one was disabled in #17275 and the AMD one is tracked in #17212.

@kbenzie kbenzie merged commit 0e155f0 into intel:sycl Mar 4, 2025
29 of 31 checks passed
rafbiels added a commit to rafbiels/llvm that referenced this pull request Mar 5, 2025
find_package(CUDA) has been deprecated since CMake 3.10 and the functionality
we need has been provided by find_package(CUDAToolkit) since CMake 3.17.

Thanks to SYCL configuration now requiring CMake >3.20 (intel#13664), we can
rely on find_package(CUDAToolkit) to work in all setups.

Remove the deprecated calls and replace them with the recommended one.
Clean up all extra CMake code dealing with finding CUPTI as that is
also no longer needed (partially thanks to intel#17272). Replace all
variables from the old module with corresponding ones from the new one.

This solves multiple issues with finding libraries, notably including
the failure to find libcuda.so automatically on systems where the CUDA
driver is not installed and only the toolkit is available. This is a
reasonable use case for building DPC++ on a build machine without a
GPU and distributing for use on GPU machines.
3 participants