Fixes block-scope run-length decode one-past-the-end memory access into smem TempStorage #626

Merged 1 commit on Oct 31, 2023

Conversation

elstehle
Contributor

Description

Closes #591.

Each thread decodes up to DECODED_ITEMS_PER_THREAD items at a time.
If a thread is assigned the last run, we make sure it does not proceed to a subsequent run by setting the one-past-the-end offset of its current run to thread_decoded_offset + DECODED_ITEMS_PER_THREAD.

    // If this thread is getting assigned the last run, we make sure it will not fetch any other run after this
    DecodedOffsetT assigned_run_end = (assigned_run == BLOCK_RUNS - 1)
                                        ? thread_decoded_offset + DECODED_ITEMS_PER_THREAD
                                        : temp_storage.runs.run_offsets[assigned_run + 1];
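
To make the clamp concrete, here is a minimal host-side C++ sketch of the same ternary. The function name `assigned_run_end_for`, the constant values, and the `std::array` standing in for `temp_storage.runs.run_offsets` are assumptions for illustration. The array is assumed to hold exactly BLOCK_RUNS entries, which is consistent with the one-past-the-end access described in this PR.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-ins for the template parameters used in the snippet above.
constexpr int BLOCK_RUNS               = 4;
constexpr int DECODED_ITEMS_PER_THREAD = 3;
using DecodedOffsetT                   = int;

// Returns the one-past-the-end decoded offset of the run a thread is assigned to.
// run_offsets is assumed to have BLOCK_RUNS entries, so index assigned_run + 1
// is only valid for runs other than the last one.
inline DecodedOffsetT assigned_run_end_for(const std::array<DecodedOffsetT, BLOCK_RUNS>& run_offsets,
                                           int assigned_run,
                                           DecodedOffsetT thread_decoded_offset)
{
  assert(assigned_run >= 0 && assigned_run < BLOCK_RUNS);
  // Last run: pretend it extends over all items this thread will decode,
  // instead of reading run_offsets[BLOCK_RUNS].
  return (assigned_run == BLOCK_RUNS - 1)
           ? thread_decoded_offset + DECODED_ITEMS_PER_THREAD
           : run_offsets[assigned_run + 1];
}
```

With run offsets {0, 2, 5, 9}, a thread on run 1 gets the next run's offset as its end, while a thread on run 3 gets its own decoded offset plus DECODED_ITEMS_PER_THREAD.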

However, in the last iteration of the loop, i.e., after decoding the last of its DECODED_ITEMS_PER_THREAD items, a thread still enters the conditional that fetches the next run, even though it exits the loop right after. For a thread on the last run, this results in a one-past-the-end access into temp_storage.runs.run_offsets.

    for (DecodedOffsetT i = 0; i < DECODED_ITEMS_PER_THREAD; i++)
    {
      decoded_items[i] = val;
      item_offsets[i]  = thread_decoded_offset - assigned_run_begin;
      // Also true on the final iteration of a thread that is on the last run
      if (thread_decoded_offset == assigned_run_end - 1)
      {
        assigned_run++;
        // One-past-the-end read when assigned_run == BLOCK_RUNS
        assigned_run_begin = temp_storage.runs.run_offsets[assigned_run];
        ...
        val                = temp_storage.runs.run_values[assigned_run];
      }
      thread_decoded_offset++;
    }
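
The behavior described above can be reproduced on the host with a small C++ model of a single thread's loop. This is a sketch under stated assumptions, not the code from the merged commit: `decode_thread`, `DecodeTrace`, and the `skip_fetch_on_final_iteration` flag are hypothetical names, the elided part of the loop is assumed to recompute `assigned_run_end` with the same ternary as the earlier snippet, and the guard shown is only one plausible way to avoid the access. Instead of actually reading out of bounds, the model records that the read would have happened.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-ins for the template parameters.
constexpr int BLOCK_RUNS               = 3;
constexpr int DECODED_ITEMS_PER_THREAD = 4;
using DecodedOffsetT                   = int;

struct DecodeTrace
{
  std::array<int, DECODED_ITEMS_PER_THREAD> decoded_items{};
  bool read_past_end = false; // true if the loop would index run_offsets[BLOCK_RUNS]
};

// Host-side model of one thread's decode loop (item_offsets omitted for brevity).
inline DecodeTrace decode_thread(const std::array<DecodedOffsetT, BLOCK_RUNS>& run_offsets,
                                 const std::array<int, BLOCK_RUNS>& run_values,
                                 int assigned_run,
                                 DecodedOffsetT thread_decoded_offset,
                                 bool skip_fetch_on_final_iteration)
{
  assert(assigned_run >= 0 && assigned_run < BLOCK_RUNS);
  DecodeTrace trace;
  DecodedOffsetT assigned_run_end = (assigned_run == BLOCK_RUNS - 1)
                                      ? thread_decoded_offset + DECODED_ITEMS_PER_THREAD
                                      : run_offsets[assigned_run + 1];
  int val = run_values[assigned_run];

  for (DecodedOffsetT i = 0; i < DECODED_ITEMS_PER_THREAD; i++)
  {
    trace.decoded_items[i] = val;

    // Assumed guard: nothing is decoded after the final iteration, so no fetch is needed.
    const bool is_final_iteration = (i + 1 == DECODED_ITEMS_PER_THREAD);
    const bool may_fetch          = !(skip_fetch_on_final_iteration && is_final_iteration);

    if (may_fetch && thread_decoded_offset == assigned_run_end - 1)
    {
      assigned_run++;
      if (assigned_run >= BLOCK_RUNS)
      {
        // The device code would read run_offsets[BLOCK_RUNS] here; record it instead.
        trace.read_past_end = true;
        break;
      }
      // Assumption: the elided lines recompute assigned_run_end like this.
      assigned_run_end = (assigned_run == BLOCK_RUNS - 1)
                           ? thread_decoded_offset + DECODED_ITEMS_PER_THREAD
                           : run_offsets[assigned_run + 1];
      val = run_values[assigned_run];
    }
    thread_decoded_offset++;
  }
  return trace;
}
```

With run offsets {0, 2, 5} and a thread that starts on the last run at decoded offset 8, the unguarded variant (`skip_fetch_on_final_iteration == false`) flags the read on its fourth iteration, while the guarded variant decodes the same items without flagging anything.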

Checklist

  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@elstehle requested review from a team as code owners October 26, 2023 19:53
@elstehle requested review from gevtushenko and wmaxey and removed request for a team October 26, 2023 19:53
@elstehle force-pushed the fix/rld-oob-smem-access branch 2 times, most recently from 341eabc to f8a3fda October 26, 2023 21:32
@elstehle force-pushed the fix/rld-oob-smem-access branch from f05e23a to f41027b October 27, 2023 16:39
@elstehle force-pushed the fix/rld-oob-smem-access branch from f41027b to 59abd5d October 31, 2023 11:38
@elstehle merged commit cc222bd into NVIDIA:main Oct 31, 2023
Successfully merging this pull request may close these issues.

[BUG]: BlockRunLengthDecode may access out-of-bounds TempStorage