[libclc] Initial support for cross-compiling OpenCL libraries #174022
Conversation
**arsenm** left a comment:
I don't think CMake provides out-of-the-box OpenCL support as a compile language. At one point we had the boilerplate to add it as a language here.
libclc/cmake/modules/AddLibclc.cmake (outdated):
```cmake
# FIXME: Is this utility even necessary? The `-mlink-builtin-bitcode`
# option used to link the library in discards the modified linkage.
```

---
This is probably a leftover from before we did that. Also, I thought there was at least one other hack in prepare-builtins
---
The two things it does, from what I can tell:

- Removes `"opencl.ocl.version"` metadata strings
- Replaces any non-external function with `linkonce_odr` linkage

The first one, I'm pretty sure `llvm-link` deduplicates now. For the second one, this is mostly just a cheap way to let the functions link while being eliminated by the compiler if unused. I think the ROCm Device Libs do that as well.
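For illustration only (a hand-written sketch; the function name is made up, this is not libclc code), the linkage rewrite described above amounts to turning an external definition into a `linkonce_odr` one:

```llvm
; Before the rewrite (sketch): an ordinary external definition that
; must survive linking:
;   define float @my_builtin(float %x) { ret float %x }
;
; After the rewrite: linkonce_odr lets the definition satisfy references
; at link time while being discarded by the optimizer if unused.
define linkonce_odr float @my_builtin(float %x) {
  ret float %x
}
```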
---
The version metadata is a workaround for having hundreds of entries. We had an AMDGPU pass to deduplicate them, but that was moved into the IR linker sometime in the last year
---
Unfortunately I don't know how people really use libclc. I'd really like to just remove it, but I can imagine someone complaining about this being gone, because `-mlink-builtin-bitcode` makes the linkage change unnecessary and the linker deduplication now makes the manual handling unnecessary.

---
> This is probably a leftover from before we did that.
Thanks for the information. I wasn't aware of it.
> Unfortunately I don't know how people really use libclc. I'd really like to just remove it but I can imagine someone complaining about this being gone, because `-mlink-builtin-bitcode` makes the linkage change unnecessary and the linker deduplication now makes the manual handling unnecessary

Our downstream targets also use `-mlink-builtin-bitcode` to link libclc bitcode files.
I think it is probably fine to drop the `prepare_builtins` utility, given that the linkage change is unnecessary. I have just tried dropping the linkage change in our downstream code and didn't find anything wrong in some basic testing.

There are a few additional changes that `prepare_builtins` does in our downstream code, e.g.:
https://github.com/intel/llvm/blob/4decbf0da29f7daba8a87361456a264a331e2b5d/libclc/utils/prepare-builtins.cpp#L85-L110
https://github.com/intel/llvm/blob/4decbf0da29f7daba8a87361456a264a331e2b5d/libclc/utils/prepare-builtins.cpp#L129-L146
I'll check if they can be removed in the downstream as well.
---
The code object version is better done via `-Xclang -mcode-object-version=none` in the compiler flags. The `wchar` issue is confounding; that should be a property of the target, so I'd imagine it suggests you're mixing incompatible targets.

Removing CPU target features like that is a massive hack, but it's not an unprecedented one, since we do similarly weird things in the ROCm Device Libs. Ideally we partition these libraries more intelligently, but does `always_inline` work?
---
> The code object version is better done via `-Xclang -mcode-object-version=none` by the compiler flags. The `wchar` issue is confounding, that should be a property of the target so I'd imagine it suggests you're mixing incompatible targets.
Thanks for the suggestion. Will try it.
> Removing CPU target features like that is a massive hack, but it's not an unprecedented one since we do similarly weird things in the ROCm Device Libs. Ideally we partition these libraries more intelligently, but does `always_inline` work?
`always_inline` works, but it won't be used in libclc compile flags, and it is problematic to inline bitcode files with incompatible target-features. The issue is probably that our downstream should build separate compatible bitcodes for the supported targets.
```cmake
set( amdgcn--_devices tahiti )
set( amdgcn-mesa-mesa3d_devices ${amdgcn--_devices} )
set( amdgcn--amdhsa_devices none )
set( amdgcn-amd-amdhsa_devices none )
```
---

Is there a difference between `amdgcn--amdhsa_devices` and `amdgcn-amd-amdhsa_devices`?
---

Nope, but all the other GPU compute targets use `amdgcn-amd-amdhsa` and I'd like these to go in the same directory eventually.
---

Perhaps we can drop the `amdgcn--amdhsa_devices` target from libclc, since the bitcode will be the same?
---
I was planning on trimming this up in a PR tomorrow. We definitely could though I don't know the subtleties here, since maybe someone depends on the specific triple.
I also really want to change how the files are installed; we should use the regular triple and then use the specific device as a subdirectory if possible. Right now we have something like `libclc/` in the resource dir, when it should be more like the standard `PER_TARGET_RUNTIME_DIR` layout, I'd say.
---

The `amd` vendor component only does anything with SPIR-V.
libclc/CMakeLists.txt (outdated):
```cmake
set( clspv64--_devices none )
set( nvptx--_devices none )
set( nvptx64--_devices none )
set( nvptx64-nvidia-cuda_devices none )
```
---

Should the libclc implementation for this triple be unified with `nvptx--nvidiacl`/`nvptx64--nvidiacl`? Otherwise `nvptx64-nvidia-cuda_devices` won't pick up the code under `ptx-nvidiacl`.

The NVIDIA implementations are in the `ptx-nvidiacl` folder (https://github.com/llvm/llvm-project/tree/main/libclc/clc/lib/ptx-nvidiacl and https://github.com/llvm/llvm-project/tree/main/libclc/opencl/lib/ptx-nvidiacl), and libclc uses the triple components to select the folder (see `libclc/CMakeLists.txt`, line 320 at 7ada892).
---
Yeah, probably. I don't really know anything about OpenCL on NVIDIA but I'm just trying to make things work with the normal compute triples. What would that require?
---

I don't know the history of using nvidiacl in libclc. If `nvptx64-nvidia-cuda` is the right choice, we can probably rename the `ptx-nvidiacl` folders so that `nvptx64-nvidia-cuda` picks up the files in those folders, and drop the `nvptx64--nvidiacl` target.
```cmake
set( LIBCLC_STANDALONE_BUILD FALSE )

set( LLVM_PACKAGE_VERSION ${LLVM_VERSION} )
set( PACKAGE_VERSION ${LLVM_PACKAGE_VERSION} )
```
---

Use `LLVM_VERSION_MAJOR`? `LLVM_PACKAGE_VERSION` is not set in an in-tree build.
---

Weird, `LLVM_PACKAGE_VERSION` should be set if `LLVM_VERSION` is set, from what I know. We need `PACKAGE_VERSION` set here so finding the resource directory actually works properly. I suppose I can just set both.
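A minimal sketch of the "just set both" idea (hypothetical, not the PR's actual code; it assumes `LLVM_VERSION` is the variable reliably available in both build modes, per the discussion above):

```cmake
# Make sure both spellings exist so resource-directory detection works
# whether or not the outer build already defined LLVM_PACKAGE_VERSION.
if( NOT DEFINED LLVM_PACKAGE_VERSION AND DEFINED LLVM_VERSION )
  set( LLVM_PACKAGE_VERSION ${LLVM_VERSION} )
endif()
set( PACKAGE_VERSION ${LLVM_PACKAGE_VERSION} )
```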
---
I'm a little confused what in-tree means here. If it's in the context like we're doing here with a runtimes build it should be set.
---

> I'm a little confused what in-tree means here. If it's in the context like we're doing here with a runtimes build it should be set.
Sorry, you're right that `LLVM_PACKAGE_VERSION` is set in an in-tree build when libclc is in `LLVM_ENABLE_RUNTIMES`.

It is just that our downstream still uses `LLVM_ENABLE_PROJECTS="...,libclc"`, and in this case `LLVM_PACKAGE_VERSION` is not set. The reasons for not switching to `LLVM_ENABLE_RUNTIMES` yet are:
- `prepare_builtins` fails to build when libclc is in `LLVM_ENABLE_RUNTIMES` on Windows when the MSVC generator is used; see Update LLVM google/clspv#1521 (comment) and the analysis at Update LLVM google/clspv#1521 (comment). The `CMAKE_C_COMPILER` set there doesn't work. This could be a blocking issue for switching libclc to `add_library`, since `CMAKE_C_COMPILER` will be used for compiling `.cl` files.
- An `LLVM_ENABLE_RUNTIMES="libclc"` build is much slower than an `LLVM_ENABLE_PROJECTS="libclc"` build in a debug build when compiling an execution-time support library which depends on libclc. I have a local workaround for this issue, and the long-term solution might be removing the dependency on libclc.
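For context, the two configuration styles being compared look roughly like this (illustrative flag fragments only; the full configure command lines are omitted):

```
# In-tree project build (the downstream's current setup):
-DLLVM_ENABLE_PROJECTS="clang;libclc"

# Runtimes build (what this PR targets):
-DLLVM_ENABLE_RUNTIMES="libclc"
```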
---
Alright, so just setting both for now is the easiest?
---

> Alright, so just setting both for now is the easiest?

Yeah. Perhaps this code: `flang/CMakeLists.txt`, lines 335 to 337 at 80b62cb.
---

@wenju-he @frasercrmck General question, how difficult would it be to port this project to use

This is much simpler now that I landed some of the other PRs; includes #174611.

---
```cmake
if( LLVM_RUNTIMES_BUILD AND LLVM_DEFAULT_TARGET_TRIPLE MATCHES "^nvptx|^amdgcn" )
  set( LIBCLC_DEFAULT_TARGET ${LLVM_DEFAULT_TARGET_TRIPLE} )
endif()
set( LIBCLC_TARGETS_TO_BUILD ${LIBCLC_DEFAULT_TARGET}
```
---

Would it be better to pass `-DLIBCLC_TARGETS_TO_BUILD=${LLVM_DEFAULT_TARGET_TRIPLE}` as a runtime configuration option, so that there is no customization here?
---

Yeah, I'm not completely sold on how to handle this. The fundamental difference here is that we are expecting to build a per-target toolchain. The libclc project completely breaks normal CMake projects by building a ton of different architectures through custom commands.

We should only be building a single one; that's the expected way these cross-builds work. I think that's something that should be correct by construction. That being said, functionally it's not a major distinction right now because libclc dodges every bit of normal CMake. Normally this would implicitly put `--target=amdgcn-amd-amdhsa` on all your compiles, and that's how you'd get the target code. Since we're doing it manually here it's more to fit in with the expected usage. I.e. the following builds the runtime for your target:

`-DRUNTIMES_amdgcn-amd-amdhsa_LLVM_ENABLE_RUNTIMES=libclc`

I'm not sure if there's a more elegant solution to this. If I had my way we'd rewrite all of this to use a custom language and do it the normal way with the above mechanism. Until then I'm not sure; this was the easiest way I could think of to do it. I have in the past added cache files for required GPU configs, but since this is basically overriding what the runtimes build above is trying to do I'm not so sure.

TL;DR: `-DRUNTIMES_amdgcn-amd-amdhsa_LLVM_ENABLE_RUNTIMES=libclc` builds for a single target only and is the expected behavior. libclc doesn't get this because we do everything custom.
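Putting the pieces together, a hedged sketch of the cache flags for such a runtimes cross-build. Only the `RUNTIMES_..._LLVM_ENABLE_RUNTIMES` flag is quoted from the discussion above; the other variable names are the standard LLVM runtimes-build options, and the exact set needed may differ:

```
-DLLVM_ENABLE_PROJECTS="clang;lld"
-DLLVM_RUNTIME_TARGETS=amdgcn-amd-amdhsa
-DRUNTIMES_amdgcn-amd-amdhsa_LLVM_ENABLE_RUNTIMES=libclc
```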
Summary:
The other GPU enabled libraries, (openmp, flang-rt, compiler-rt, libc,
libcxx, libcxx-abi) all support builds through a runtime cross-build. In
these builds we use a separate CMake build that cross-compiles to a
single target.
This patch provides basic support for this with the libclc libraries.

Changes include adding support for the more standard GPU compute triples (`amdgcn-amd-amdhsa`, `nvptx64-nvidia-cuda`) and building only one target in this mode.
Some things left to do:
- This patch does not change the compiler invocations; this method would allow us to use standard CMake routines, but this keeps it minimal.
- The prebuild support is questionable and doesn't fit into this scheme because it's a host executable; I'm ignoring it for now.
- The installed location should just use the triple with no `libclc/` subdirectory handling, I believe.