Fixes block-scope run-length decode one-past-the-end memory access into smem TempStorage #626
Description
Closes #591.
Each thread decodes up to `DECODED_ITEMS_PER_THREAD` items at a time. If a thread is assigned to the last run, we make sure that it does not proceed to the next run by setting the one-past-the-end offset of its current run to correspond to `DECODED_ITEMS_PER_THREAD`. However, in the last iteration of the loop, i.e., after decoding the last of their `DECODED_ITEMS_PER_THREAD` items, threads still enter the conditional that fetches the next run, even though they exit the loop right after. This results in a one-past-the-last-item access into `temp_storage.runs.run_offsets`.
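The access pattern described above can be reproduced on the host with a simplified model. This is only a sketch, not the CUB implementation: `decode_window`, `DecodeTrace`, `read_offset`, the run layout `{0, 3, 5}`, and the `guard_last_iteration` switch are all hypothetical names and values chosen for illustration, and the guard shown is one possible way to avoid the access, not necessarily the change made in this PR. The model records the largest index used to read `run_offsets` instead of performing the out-of-bounds read.

```cpp
#include <algorithm>
#include <array>
#include <vector>

constexpr int RUNS = 3;                      // hypothetical number of runs in the block
constexpr int DECODED_ITEMS_PER_THREAD = 4;  // items one thread decodes per call

struct DecodeTrace {
    std::vector<int> decoded_runs;  // run index each decoded item came from
    int max_index_read = -1;        // largest index used to read run_offsets
};

// Models one thread decoding its window of items.
// guard_last_iteration == false mimics the behaviour described above;
// true applies a hypothetical guard that skips the fetch on the last iteration.
DecodeTrace decode_window(bool guard_last_iteration) {
    // Begin offset of each run; valid indices are 0 .. RUNS - 1.
    const std::array<int, RUNS> run_offsets = {0, 3, 5};
    DecodeTrace trace;

    // Records every index requested; never actually reads out of bounds.
    auto read_offset = [&](int index) {
        trace.max_index_read = std::max(trace.max_index_read, index);
        return index < RUNS ? run_offsets[index] : 0;
    };

    int offset = 4;  // first item of this thread's window (inside run 1)
    const int window_end = offset + DECODED_ITEMS_PER_THREAD;
    int run = 1;
    int run_end = read_offset(run + 1);  // one-past-the-end of run 1

    for (int i = 0; i < DECODED_ITEMS_PER_THREAD; ++i) {
        trace.decoded_runs.push_back(run);
        ++offset;

        const bool more_items =
            !guard_last_iteration || (i + 1 < DECODED_ITEMS_PER_THREAD);
        if (more_items && offset == run_end) {
            ++run;
            read_offset(run);  // begin offset of the next run
            // The last run is stretched to the end of the thread's window.
            run_end = (run >= RUNS - 1) ? window_end : read_offset(run + 1);
        }
    }
    return trace;
}
```

In this model the unguarded loop requests index `RUNS` (one past the end) on its final iteration, while the guarded loop stays within `0 .. RUNS - 1` and decodes the same items.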
Checklist