[SYCL] Add support for SYCL 2020 in class group#5447
Conversation
c384248 to
f4921ad
Compare
|
This change is good, but I think if we're going to add some SYCL 2020 functionality to the The SYCL 2020 group class is documented here: https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#group-class, and I don't think many extra changes are required:
|
|
@Pennycook added functions you asked for in your comment. Also extended tests for group and added test for local_id on device (intel/llvm-test-suite#821). |
sycl/include/CL/sycl/group.hpp
Outdated
There was a problem hiding this comment.
I'd like @gmlueck to double-check this. Greg, can this be declared in our headers as a template function if it isn't a template function in the specification? My concern here is that this might be user-observable: a user could call get_local_linear_id<Dimensions + 1>, for example, or might have to insert extra template keywords in some situations.
If I'm right to be concerned, I think we can work around this by moving the specialization to a helper function, something like the following:
namespace sycl {
namespace detail {
size_t get_local_linear_id(sycl::group<0>& g) const {
return g.get_local_id();
}
// + Specializations for group<1> and group<2>
}
}
size_t get_local_linear_id() const {
return sycl::detail::get_local_linear_id(*this);
}There was a problem hiding this comment.
Yes, it does seem weird that we would implement this function as a template when it is not that way in the spec. It seems like it would be easy to fix by calling some separate templated function as John proposes. I assume the separate function could also be a private member function of group?
class group {
public:
size_t get_local_linear_id() const {
return get_local_linear_id_impl<Dimensions>();
}
private:
template <int dims>
typename detail::enable_if_t<(dims == 1), size_t>
get_local_linear_id_impl() const {
id<Dimensions> localId = get_local_id();
return localId[0];
}
}
Or, you could use constexpr-if, but that would require the compiler to be C++17.
sycl/include/CL/sycl/group.hpp
Outdated
There was a problem hiding this comment.
I appreciate the specification could be clearer here, but the intent of this function is to return the maximum number of work-items in any work-group for the kernel that is currently executing, rather than the maximum number of work-items supported by the device.
Because we don't support non-uniform work-group sizes, I think you can replace all of this with:
size_t get_max_local_range() const {
return get_local_range();
}The reason both functions exist is that they return different values for the sub_group class. Supporting both in group allows developers to write generic code accepting either a sub_group or group.
sycl/include/CL/sycl/group.hpp
Outdated
There was a problem hiding this comment.
The specification for this function says:
Return true for exactly one work-item in the work-group, if the calling work-item is the leader of the work-group, and false for all other work-items in the work-group.
The leader of the work-group is determined during construction of the work-group, and is invariant for the lifetime of the work-group. The leader of the work-group is guaranteed to be the work-item with a local id of 0.
Because of the last sentence, I think the correct implementation is:
bool leader() const {
return (get_local_linear_id() == 0);
}f72fbd2 to
1fc564a
Compare
1fc564a to
2258997
Compare
|
@alexbatashev, take a look at this failure - https://github.com/intel/llvm/runs/5155471215?check_suite_focus=true. I think we should improve the CI scripts to avoid this. |
Pennycook
left a comment
There was a problem hiding this comment.
Sorry for not being clear in my initial review -- my comment about get_local_linear_id() applies to all of the new template functions you've introduced here. I think there are three such functions left, and I've called out the first occurrence of each.
sycl/include/CL/sycl/group.hpp
Outdated
There was a problem hiding this comment.
This should call a get_local_linear_range_impl().
sycl/include/CL/sycl/group.hpp
Outdated
There was a problem hiding this comment.
This should call a get_group_linear_range_impl().
sycl/include/CL/sycl/group.hpp
Outdated
There was a problem hiding this comment.
This should call a get_group_linear_id_impl().
7fbb9a7 to
3cba04f
Compare
Done for rest of the functions |
|
@aobolensk Could you please check if this failure is related to your PR: Failed Tests (1): |
Never mind, it is unrelated. Found a PR isolated to L0 plugin which has the same failure: #5541 |
…/llvm into refactor_existing_workflows * 'refactor_existing_workflows' of github.com:alexbatashev/llvm: (2051 commits) [SYCL][L0] Honor property::queue::enable_profiling (intel#5543) [SYCL][CI] Enable sccache on Windows (intel#5589) [SYCL][Doc] Move internal design docs (intel#5556) [sycl-post-link] Initialize the integer Value variable (intel#5585) [CI] Fix nightly builds (intel#5584) [SYCL][L0] Fix use of copy-engines in L0 interop queue (intel#5579) Update OpenCL headers tag to dcd5bed (intel#5575) [SYCL] Fix warning for InOrderQueueSyncCheck unit test build (intel#5577) [SYCL][HIP] Remove arch requirement for running lit tests (intel#5253) [SYCL][L0] Fix timestamp calculation (in ns) (intel#5555) [SYCL] Fix sync of host task vs kernel for in-order queue (intel#5551) [sycl-post-link] Add a check for device globals with device_image_scope (intel#5517) [SYCL] Fix SYCL Kernel Body Check (intel#5546) [SYCL] Add support for SYCL 2020 in class group (intel#5447) Fix tests after 1c729d7 Use align attribute for kernel pointer arg alignment Fix tests after 18834dc Mark pointer-typed kernel arguments as ABI aligned [CI] Add experimental Windows build to GitHub Actions nightly (intel#5560) [sycl-post-link][NFC] Address clang-tidy concerns in the sycl-post-link (intel#5552) Fix lit test after changes in Clang Improve backward compatibility for DISubRange ...
Implement missing methods for
class groupaccording to SYCL 2020 4.9.1.7.