@adstraw adstraw commented Sep 6, 2023

This PR adds a `create_barriers` intrinsic that creates barriers for use with the existing barrier intrinsics for synchronization. Previously, barriers were allocated with `alloc_buffer`, e.g. `barrier = T.alloc_buffer([1], "uint64", scope="shared")`, and the pointer and offset of that allocation were then passed to the barrier intrinsics. That interface was functional, but it caused alignment problems for other, non-barrier shared memory allocations; see the workarounds marked with TODO that this PR removes from the tests. At the expense of the additional `create_barriers` intrinsic, the barrier intrinsics now take a barrier ID rather than a pointer / offset, which simplifies the interface and gives codegen more low-level control. This PR uses that control to solve the alignment issue.
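
The description does not spell out the exact mechanism behind the alignment problem. As a rough illustration only, the toy model below shows one plausible way an 8-byte barrier allocation could shift a neighboring buffer off a 16-byte boundary when allocations are packed back to back. The `pack` and `is_aligned` helpers, the buffer name `A_shared`, the sizes, and the 16-byte requirement are all assumptions made for this sketch; none of this is TVM's actual shared-memory allocator.

```python
# Toy model only: this is NOT TVM's real shared-memory allocator.
# All names, sizes, and the 16-byte alignment requirement are assumptions.

def pack(allocs):
    """Pack (name, size_in_bytes) allocations back to back.

    Returns a dict mapping each name to its starting byte offset.
    """
    offsets = {}
    cursor = 0
    for name, size in allocs:
        offsets[name] = cursor
        cursor += size
    return offsets


def is_aligned(offset, alignment):
    """True if `offset` is a multiple of `alignment`."""
    return offset % alignment == 0


# Old style: the barrier is an ordinary 8-byte uint64 shared allocation
# that sits in the same region as the data buffers.
old_layout = pack([("barrier", 8), ("A_shared", 128)])

# New style (assumed): barriers are referred to by ID and placed by codegen
# outside the region used for ordinary shared buffers.
new_layout = pack([("A_shared", 128)])

print(old_layout["A_shared"], is_aligned(old_layout["A_shared"], 16))
print(new_layout["A_shared"], is_aligned(new_layout["A_shared"], 16))
```

In this model the barrier pushes `A_shared` to offset 8, which is not 16-byte aligned, while keeping barriers out of the packed region leaves `A_shared` at offset 0.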

@JosephTheOctonaut JosephTheOctonaut left a comment

Looks great!

@csullivan csullivan left a comment

Thanks @adstraw! Only one question below and a nit for consideration now or later.
