Skip to content
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -255,9 +255,9 @@ The following group algorithms support `ballot_group`, `fixed_size_group`,

A ballot-group is a non-contiguous subset of a group, representing a collection
of all work-items in the group that share the same value of some predicate.
Ballot-groups are always created in a range of two: the first ballot-group
contains all work-items where the predicate is true, and the second
ballot-group contains all work-items where the predicate is false.
Ballot-groups are always created in a range of two: the one ballot-group
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note for reviewers: I removed mentions of "first" and "second", because below we explicitly say that work-items where the predicate is true are in ballot-group with local id = 1 and the rest of work-items are in ballot-group with local id = 0.

Which meant that 1 is "first" and 0 is "second". I think here we shouldn't use any ordering-related terms to avoid contradiction with the detailed description of methods below.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also remove the "the", though I think both are correct.

Suggested change
Ballot-groups are always created in a range of two: the one ballot-group
Ballot-groups are always created in a range of two: one ballot-group

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that "the" should be removed here.

contains all work-items where the predicate is true, and another ballot-group
contains all work-items where the predicate is false.


==== Creation
Expand Down Expand Up @@ -304,7 +304,7 @@ public:
using linear_id_type = uint32_t;
static constexpr int dimensions = 1;
static constexpr sycl::memory_scope fence_scope = ParentGroup::fence_scope;

id_type get_group_id() const;

id_type get_local_id() const;
Expand Down Expand Up @@ -335,7 +335,7 @@ than only sub-groups.
----
id_type get_group_id() const;
----
_Returns_: An `id` representing the index of the ballot-group.
_Returns_: An `id` representing the index of the ballot-group in a parent group.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note for reviewers: I'm not entirely sure if this clarification is needed, perhaps that piece of information is implied and should be obvious, but I though that it would be better to state it explicitly.

If it is decided that it makes sense to add such clarification, then I should have another pass over the doc, because I haven't been consistent with adding it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not even sure the clarification is that clear. What does it mean to have an "index in a parent group". You can create multiple ballot-groups from the same sub-group, and they may have the same indices.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The clearest thing to do here might be to say something really precise (and verbose), like:

Suggested change
_Returns_: An `id` representing the index of the ballot-group in a parent group.
_Returns_: An `id` representing the index of the ballot-group to which the
calling work-item belongs, as determined by the work-item's associated
`predicate` value.

...but I appreciate this isn't perfect either. One of the things we've been exploring is to replace the concept of a ballot-group with "the result of a logical partitioning", which might make this easier to explain. This would then read as "the index of the logical partition to which..."


NOTE: This will always be either 1 (representing the group of work-items where
the predicate was true) or 0 (representing the group of work-items where the
Expand All @@ -352,7 +352,7 @@ the ballot-group.
----
range_type get_group_range() const;
----
_Returns_: A `range` representing the number of ballot-groups.
_Returns_: A `range` representing the number of ballot-groups in a parent group.

NOTE: This will always return a `range` of 2, as there will always be two groups;
one representing the group of work-items where the predicate was true and
Comment on lines +355 to 358
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could simplify this to:

Suggested change
_Returns_: A `range` representing the number of ballot-groups in a parent group.
NOTE: This will always return a `range` of 2, as there will always be two groups;
one representing the group of work-items where the predicate was true and
_Returns_: The result of `range<1>(2)`.
NOTE: This always returns a `range` of 2, as there will always be two groups;
one representing the group of work-items where the predicate was true and

Expand All @@ -376,19 +376,19 @@ _Returns_: A linearized version of the `id` returned by `get_group_id()`.
----
id_type get_local_linear_id() const;
----
_Returns_: A linearized version of the `id` returned by `get_local_linear_id()`.
_Returns_: A linearized version of the `id` returned by `get_local_id()`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!


[source,c++]
----
range_type get_group_linear_range() const;
linear_id_type get_group_linear_range() const;
----
_Returns_: A linearized version of the `id` returned by `get_group_range()`.
_Returns_: A linearized version of the `range` returned by `get_group_range()`.

[source,c++]
----
range_type get_local_linear_range() const;
linear_id_type get_local_linear_range() const;
----
_Returns_: A linearized version of the `id` returned by `get_local_range()`.
_Returns_: A linearized version of the `range` returned by `get_local_range()`.

[source,c++]
----
Expand Down Expand Up @@ -477,7 +477,7 @@ public:
using linear_id_type = uint32_t;
static constexpr int dimensions = 1;
static constexpr sycl::memory_scope fence_scope = ParentGroup::fence_scope;

id_type get_group_id() const;

id_type get_local_id() const;
Expand Down Expand Up @@ -521,7 +521,8 @@ the fixed-size-group.
----
range_type get_group_range() const;
----
_Returns_: A `range` representing the number of fixed-size-groups.
_Returns_: A `range` representing the number of fixed-size-groups in a parent
group.
Comment on lines +524 to +525
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe something like:

Suggested change
_Returns_: A `range` representing the number of fixed-size-groups in a parent
group.
_Returns_: A `range` representing the number of fixed-size-groups created by
partitioning the parent group with `Size`.

I'm being pedantic, but the number of fixed-size-groups in a parent group might be a concept that some people struggle with. If somebody partitions a group into fixed-size-groups of size 1 and again into fixed-size-groups of size 2, those two partitionings don't know anything about one another.


[source,c++]
----
Expand All @@ -540,28 +541,28 @@ _Returns_: A linearized version of the `id` returned by `get_group_id()`.
----
id_type get_local_linear_id() const;
----
_Returns_: A linearized version of the `id` returned by `get_local_linear_id()`.
_Returns_: A linearized version of the `id` returned by `get_local_id()`.

[source,c++]
----
range_type get_group_linear_range() const;
linear_id_type get_group_linear_range() const;
----
_Returns_: A linearized version of the `id` returned by `get_group_range()`.
_Returns_: A linearized version of the `range` returned by `get_group_range()`.

[source,c++]
----
range_type get_local_linear_range() const;
linear_id_type get_local_linear_range() const;
----
_Returns_: A linearized version of the `id` returned by `get_local_range()`.
_Returns_: A linearized version of the `range` returned by `get_local_range()`.

[source,c++]
----
bool leader() const;
----
_Returns_: `true` for exactly one work-item in the fixed-size-group, if the calling
work-item is the leader of the fixed-size-group, and `false` for all other
work-items in the fixed-size-group. The leader of the fixed-size-group is guaranteed
to be the work-item for which `get_local_id()` returns 0.
_Returns_: `true` for exactly one work-item in the fixed-size-group, if the
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note for reviewers: no real changes here, just a re-formatting to fit into 80 symbol limit per line

calling work-item is the leader of the fixed-size-group, and `false` for all
other work-items in the fixed-size-group. The leader of the fixed-size-group is
guaranteed to be the work-item for which `get_local_id()` returns 0.


==== Usage Examples
Expand Down Expand Up @@ -641,7 +642,7 @@ public:
using linear_id_type = uint32_t;
static constexpr int dimensions = 1;
static constexpr sycl::memory_scope fence_scope = ParentGroup::fence_scope;

id_type get_group_id() const;

id_type get_local_id() const;
Expand All @@ -668,7 +669,7 @@ public:
----
id_type get_group_id() const;
----
_Returns_: An `id` representing the index of the tangle-group.
_Returns_: An `id` representing the index of the tangle-group in a parent group.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
_Returns_: An `id` representing the index of the tangle-group in a parent group.
_Returns_: The result of `id<1>(0)`.


NOTE: This will always be an `id` with all values set to 0, since there can
only be one tangle-group.
Expand Down Expand Up @@ -705,19 +706,19 @@ _Returns_: A linearized version of the `id` returned by `get_group_id()`.
----
id_type get_local_linear_id() const;
----
_Returns_: A linearized version of the `id` returned by `get_local_linear_id()`.
_Returns_: A linearized version of the `id` returned by `get_local_id()`.

[source,c++]
----
range_type get_group_linear_range() const;
linear_id_type get_group_linear_range() const;
----
_Returns_: A linearized version of the `id` returned by `get_group_range()`.
_Returns_: A linearized version of the `range` returned by `get_group_range()`.

[source,c++]
----
range_type get_local_linear_range() const;
linear_id_type get_local_linear_range() const;
----
_Returns_: A linearized version of the `id` returned by `get_local_range()`.
_Returns_: A linearized version of the `range` returned by `get_local_range()`.

[source,c++]
----
Expand Down Expand Up @@ -746,9 +747,8 @@ auto sg = it.get_sub_group();
auto will_branch = sg.get_local_linear_id() % 2 == 0;
if (will_branch)
{
// wait for all work-items that took the branch to hit the barrier
// no explicit barrier is needed, get_tangle_group behaves like a barrier
auto inner = sycl::ext::oneapi::experimental::get_tangle_group(sg);
sycl::group_barrier(inner);

// reduce across subset of outer work-items that took the branch
float ix = sycl::reduce_over_group(inner, x, plus<>());
Expand Down Expand Up @@ -837,7 +837,8 @@ public:
----
id_type get_group_id() const;
----
_Returns_: An `id` representing the index of the opportunistic-group.
_Returns_: An `id` representing the index of the opportunistic-group in a parent
group.
Comment on lines +840 to +841
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
_Returns_: An `id` representing the index of the opportunistic-group in a parent
group.
_Returns_: The result of `id<1>(0)`.


NOTE: This will always be an `id` with all values set to 0, since there can
only be one opportunistic-group.
Expand Down Expand Up @@ -875,19 +876,19 @@ _Returns_: A linearized version of the `id` returned by `get_group_id()`.
----
id_type get_local_linear_id() const;
----
_Returns_: A linearized version of the `id` returned by `get_local_linear_id()`.
_Returns_: A linearized version of the `id` returned by `get_local_id()`.

[source,c++]
----
range_type get_group_linear_range() const;
linear_id_type get_group_linear_range() const;
----
_Returns_: A linearized version of the `id` returned by `get_group_range()`.
_Returns_: A linearized version of the `range` returned by `get_group_range()`.

[source,c++]
----
range_type get_local_linear_range() const;
linear_id_type get_local_linear_range() const;
----
_Returns_: A linearized version of the `id` returned by `get_local_range()`.
_Returns_: A linearized version of the `range` returned by `get_local_range()`.

[source,c++]
----
Expand Down Expand Up @@ -965,3 +966,14 @@ executing the same control flow (without introducing significant overhead). If
we decide at a later date that `tangle_group` should support only sub-groups,
we should revisit the name to avoid creating confusion.
--

. Inconsistencies in `fixed_size_group`
+
--
`fixed_size_group` accepts template argument `PartitionSize` of type
`std::size_t`, which can be an unsigned 64-bit integer. However,
`linear_id_type` is defined as `uint32_t` for this group. Even though with the
current restrictions it is impossible to encounter such situation, there is
still a mismatch allowing to create a group so big that linear IDs won't be
correct for some of its work-items.
--
Comment on lines +969 to +979
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is just a bug, and we should update linear_id_type with size_t.

My initial prototype only works with sub-groups (which are limited to uint32_t) and that's probably how this crept in. If/when we allow partitioning of work-groups (and other groups) into fixed-size groups, we may need size_t.