This repository was archived by the owner on May 9, 2024. It is now read-only.

Refactor GPU shared memory tests and add L0-specific ones [2/2] #637

Merged (11 commits) on Aug 21, 2023

kurapov-peter
Contributor

This patch moves most of the aggregation function implementations into the C++ module. It also applies fixes that are required because memory is not automatically zero-initialized. I kept the implementations as close to the CUDA ones as possible, to keep things simple for now and generalize later. The patch also re-enables some tests that were excluded from CI after shared memory (smem) support was turned on by default.
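
To illustrate why missing zero initialization matters for aggregates, here is a minimal host-side sketch. It is not the code from this patch: `AggKind`, `identity_for`, `init_slots`, `agg_update`, and `aggregate` are hypothetical names, and a plain `int64_t` stands in for a shared memory slot. The idea is only that each slot must be set explicitly to the aggregate's identity value before the first update.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical aggregate kinds (illustrative only, not taken from the patch).
enum class AggKind { Sum, Count, Min, Max };

// Identity value a slot must hold before the first update is applied.
inline int64_t identity_for(AggKind kind) {
  switch (kind) {
    case AggKind::Min: return std::numeric_limits<int64_t>::max();
    case AggKind::Max: return std::numeric_limits<int64_t>::min();
    case AggKind::Sum:
    case AggKind::Count: return 0;
  }
  return 0;
}

// Explicitly initialize every slot instead of relying on zeroed memory.
inline void init_slots(int64_t* slots, std::size_t n, AggKind kind) {
  const int64_t init = identity_for(kind);
  for (std::size_t i = 0; i < n; ++i) {
    slots[i] = init;
  }
}

// Fold a single value into a slot.
inline void agg_update(int64_t* slot, int64_t value, AggKind kind) {
  switch (kind) {
    case AggKind::Sum: *slot += value; break;
    case AggKind::Count: *slot += 1; break;
    case AggKind::Min: if (value < *slot) *slot = value; break;
    case AggKind::Max: if (value > *slot) *slot = value; break;
  }
}

// Aggregate a list of values into one slot that starts out holding garbage.
inline int64_t aggregate(const std::vector<int64_t>& values, AggKind kind) {
  int64_t slot = 0x5A5A5A5A;  // stand-in for uninitialized memory contents
  init_slots(&slot, 1, kind);
  for (int64_t v : values) {
    agg_update(&slot, v, kind);
  }
  return slot;
}
```

For example, `aggregate({3, -1, 7}, AggKind::Sum)` yields 9. Without the `init_slots` call, the result would depend on whatever value the slot happened to contain.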

@kurapov-peter
Contributor Author

Resolves #527, #544

@kurapov-peter kurapov-peter merged commit 2602ad1 into main Aug 21, 2023
@kurapov-peter kurapov-peter deleted the pakurapo/smem-test-3 branch August 21, 2023 09:26
2 participants