[SYCL][Docs][Joint matrix] Add overloads and restrictions for the offset load store#15499
steffenlarsen merged 4 commits into intel:sycl
| void joint_matrix_store(Group g, | ||
| const joint_matrix<Group, T, use::a, Rows, Cols, Layout> &res, | ||
| multi_ptr<T, Space, IsDecorated> dest, size_t stride); | ||
| multi_ptr<T, Space, IsDecorated> dest, size_t Stride); |
Why capitalize this parameter name? All the other parameter names start with a lower case letter. Our style is that function parameter names are lower case (snake_case) while template parameter names are upper case (CamelCase).
I see below that you have added parameter names RowIndex and ColIndex. These should be row_index and col_index to be consistent.
| - The `Stride` argument to `joint_matrix_load` and | ||
| `joint_matrix_store` must be a multiple of 8 bytes. Also, `Stride` | ||
| should not exceed `2^24^` bytes. |
The `stride` parameter is the number of elements, not the number of bytes. It would be better to reword this like:
The `stride` parameter to `joint_matrix_load` and `joint_matrix_store` has the following restrictions:
- The value `stride * sizeof(T1)` must be a multiple of 8, and
- The value of `stride * sizeof(T1)` must not exceed `2^24^`.
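To make the suggested wording concrete, here is a minimal sketch of the two conditions as a compile-time check. The helper name `stride_satisfies_restrictions` is hypothetical and not part of the joint matrix extension; it only assumes that `stride` counts elements of type `T`, as stated in the comment above.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper, NOT part of the joint_matrix extension.
// `stride` is a number of elements of type T, so both restrictions
// are expressed on the byte count stride * sizeof(T).
template <typename T>
constexpr bool stride_satisfies_restrictions(std::size_t stride) {
  constexpr std::size_t max_bytes = std::size_t{1} << 24; // 2^24
  // Upper bound: stride * sizeof(T) must not exceed 2^24.
  // Written as a division so the multiplication below cannot overflow.
  if (stride > max_bytes / sizeof(T))
    return false;
  // Alignment: stride * sizeof(T) must be a multiple of 8.
  return (stride * sizeof(T)) % 8 == 0;
}

// 64 floats = 256 bytes: multiple of 8 and well below 2^24.
static_assert(stride_satisfies_restrictions<float>(64), "valid stride");
// 3 floats = 12 bytes: not a multiple of 8.
static_assert(!stride_satisfies_restrictions<float>(3), "misaligned stride");
```

A check like this could live in user code or in a test; the review thread notes that the implementation itself adds no runtime checks.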
| these checked APIs: | ||
| - The `Stride` argument must be a multiple of 8 bytes. Also, `Stride` | ||
| should not exceed `2^24^` bytes. |
See my comment in the other file about the wording of this restriction.
@intel/llvm-gatekeepers, please help merge this.
| - If these restrictions are not satisfied, users can switch to slower | ||
| implementations of `joint_matrix_load` and `joint_matrix_store` by | ||
| setting the driver flag `IGC_JointMatrixLoadStoreOpt=1`. |
@dkhaldi, it is better to use `IGC_JointMatrixLoadStoreOpt=2`, as more optimizations may kick in, especially for big shapes.
- Overloads of `joint_matrix_load` and `joint_matrix_store` where the offsets are separated from the base pointer and passed as separate arguments. I kept the same name, as the expectation is to remove the regular variants once the new ones are used instead.
- Restrictions on `joint_matrix_load`/`store` on PVC, since the current implementation adds no runtime checks because they are expensive. The fallback to 1D load/store is done using a flag instead.
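As a rough illustration of what "offsets separated from the base pointer" means, the sketch below computes the linear element offset that a caller would otherwise fold into the pointer by hand. It assumes a row-major buffer with `stride` elements per row; the helper `element_offset` is hypothetical and is not the signature of the new overloads.

```cpp
#include <cstddef>

// Hypothetical helper, NOT part of the joint_matrix extension.
// Assumes a row-major buffer whose rows are `stride` elements apart.
// With the regular variants, a caller would typically pass
// base + element_offset(row_index, col_index, stride) as the pointer;
// with the offset variants, base, row_index and col_index stay separate.
constexpr std::size_t element_offset(std::size_t row_index,
                                     std::size_t col_index,
                                     std::size_t stride) {
  return row_index * stride + col_index;
}

// Row 2, column 3 in a buffer with 16 elements per row.
static_assert(element_offset(2, 3, 16) == 35, "row-major offset");
```

Keeping the base pointer and the indices separate leaves the decomposition visible to the implementation instead of hiding it in pointer arithmetic.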