Commit
For some work-group sizes the current implementation of reduction does not work properly. The implementation in the repo assumes that all work-items will be executed, even if the work-group size is not a multiple of the SIMD width. This change restores the SLM + barrier path for the final calculation of the reduction, at the cost of a performance degradation. TODO: Remove the SLM + barrier path and force execution of the whole reduction built-in function with NoMask at the asm level. (cherry picked from commit c147d7f)
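The failure mode described above can be illustrated with a toy model. This is only a sketch in Python, not the code from this commit: the SIMD width of 8, the `stale` value standing in for undefined lane contents, and both function names are assumptions made for illustration.

```python
SIMD_WIDTH = 8  # assumed sub-group (SIMD) width, for illustration only


def reduce_assuming_full_subgroups(values, simd_width=SIMD_WIDTH, stale=99):
    """Toy model of a reduction that reads every lane of every sub-group.

    Lanes with no work-item behind them hold undefined data; `stale` is a
    hypothetical stand-in for whatever happens to be in those lanes.
    """
    total = 0
    for start in range(0, len(values), simd_width):
        lanes = values[start:start + simd_width]
        # Pad the last, partially filled sub-group with stale lane contents.
        lanes = lanes + [stale] * (simd_width - len(lanes))
        total += sum(lanes)
    return total


def reduce_via_slm_and_barrier(values, simd_width=SIMD_WIDTH):
    """Toy model of the fallback: partial sums go through shared local memory.

    Each sub-group writes the sum of its active lanes to one SLM slot.
    After a barrier (implicit here, since the loop finishes first), the
    slots are summed to produce the final result.
    """
    num_subgroups = -(-len(values) // simd_width)  # ceiling division
    slm = [0] * num_subgroups
    for sg in range(num_subgroups):
        active = values[sg * simd_width:(sg + 1) * simd_width]
        slm[sg] = sum(active)
    # Barrier point: all slots are written before any of them is read.
    return sum(slm)
```

With a work-group of 10 work-items and an assumed SIMD width of 8, the first function folds six stale lanes into the result, while the second sums only the values that real work-items produced. When the work-group size is a multiple of the SIMD width, both agree.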
Showing 1 changed file with 33 additions and 25 deletions.