Skip to content

Commit 84cc1e1

Browse files
[SYCL] Support multiple call operators in kernel functor (#8525)
A named function object that is used to define a SYCL kernel, can have more than one definition of function call operator `'operator()()'`. For such cases, prior to this change, the compiler front end always returned the first available 'operator()()' call found in the functor, regardless of whether that was the one the user invoked from within the kernel invocation function. This resulted in incorrect kernel invocation and incorrect attribute collection ( if the 'operator()()' calls had attributes applied on them). This is best illustrated by the following example: ``` class MyKernel { public: void operator()() const { // code } [[intel::reqd_sub_group_size(4)]] void operator()(sycl::id<1> id) const { // code } }; int main() { Q.submit([&](sycl::handler& cgh) { MyKernel kernelFunctorObject; cgh.parallel_for(sycl::range<1>(16), kernelFunctorObject); }); return 0; } ``` In the above example, the `paralle_for` kernel invocation function has a `range` argument, in addition to the functor. In the simplified DPC++ header implementation of parallel_for below, ``` template <typename KernelName = auto_name, typename KernelType, int Dims> void parallel_for(range<Dims> numWorkItems, const KernelType &kernelFunc) { using NameT = typename get_kernel_name_t<KernelName, KernelType>::name; kernel_parallel_for<NameT, KernelType, Dims>(kernelFunc); template <typename KernelName, typename KernelType, int Dims> ATTR_SYCL_KERNEL void kernel_parallel_for(const KernelType &KernelFunc) { KernelFunc(id<Dims>()); } ``` KernelFunc() is invoked with 'id' as an argument because 'parallel_for' invocation has a 'range' passed to it. However, for the example above, the CFE returns `'operator()()'` instead of `'void operator()(sycl::id<1> id) const'`. This results in incorrect kernel being invoked as well as losing the `intel::reqd_sub_group_size(4)]` attribute, the user intended to use. The root of the kernel is the function marked with `“[[clang::sycl_kernel]]”` and the user’s kernel is “`KernelFunc`”, and any C++ attributes the user declared for the kernel are on that function. In the new implementation, we use `RecursiveASTVisitor` Visitor to traverse the nodes from the root of the kernel function (e.g. “kernel_parallel_for”) and find the version of `“operator()()”` that is called by `“KernelFunc()”`. There will only be one call to `“KernelFunc()”` in that call graph because the DPC++ headers are structured such that the user’s kernel function is only called once. This will result in the correct kernel/operator()() call being invoked by the CFE.
1 parent f45fb51 commit 84cc1e1

File tree

5 files changed

+263
-83
lines changed

5 files changed

+263
-83
lines changed

clang/include/clang/Sema/Sema.h

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -14450,7 +14450,7 @@ class Sema final {
1445014450

1445114451
bool isDeclAllowedInSYCLDeviceCode(const Decl *D);
1445214452
void checkSYCLDeviceVarDecl(VarDecl *Var);
14453-
void copySYCLKernelAttrs(const CXXRecordDecl *KernelObj);
14453+
void copySYCLKernelAttrs(CXXMethodDecl *CallOperator);
1445414454
void ConstructOpenCLKernel(FunctionDecl *KernelCallerFunc, MangleContext &MC);
1445514455
void SetSYCLKernelNames();
1445614456
void MarkDevices();

0 commit comments

Comments
 (0)