[rocRAND][hipRAND] Update error handling in rocRAND/hipRAND #1227

Merged
stanleytsang-amd merged 3 commits into develop from users/umfranzw/update_error_handling on Aug 28, 2025

umfranzw (Contributor) commented Aug 15, 2025

Motivation

In some files, rocRAND and hipRAND use a simplified definition of HIP_CHECK that calls GTest's ASSERT_EQ to compare the value it's passed with hipSuccess, and does not exit on error (just fails the test).

As of ROCm 7.0, hipGetLastError's behaviour has changed such that the internal error state is not cleared with every HIP API call (only on calls to hipGetLastError). If we do not exit when HIP_CHECK fails, then the internal error state will persist, and the error will be caught the next time hipGetLastError is called in a subsequent test.
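The interaction described above can be illustrated with a toy model. This is a sketch under stated assumptions, not HIP code: everything in the `toy` namespace, the `SOFT_CHECK` macro, and the two test functions are hypothetical stand-ins. The model encodes only the behaviour stated in this PR (the error state is sticky and is reset only by the "get last error" call), and `SOFT_CHECK` loosely mimics a check that marks the test as failed and leaves the test body without terminating the process.

```cpp
#include <string>
#include <vector>

namespace toy
{
// Hypothetical stand-in for hipError_t; not the real HIP enum.
enum error_code { ok = 0, invalid_value = 1 };

// Sticky error state: only get_last_error() resets it.
inline error_code& sticky_state()
{
    static error_code state = ok;
    return state;
}

// A call that fails: returns the error and records it in the sticky state.
inline error_code failing_call()
{
    sticky_state() = invalid_value;
    return invalid_value;
}

// A call that succeeds: it does NOT clear the sticky state.
inline error_code successful_call() { return ok; }

// Returns the sticky error and resets it.
inline error_code get_last_error()
{
    const error_code e = sticky_state();
    sticky_state() = ok;
    return e;
}

// Names of the tests that reported a failure.
inline std::vector<std::string>& failed_tests()
{
    static std::vector<std::string> failures;
    return failures;
}
} // namespace toy

// Non-exiting check: record the failure and leave the test body.
#define SOFT_CHECK(test_name, expr)                   \
    do                                                \
    {                                                 \
        if((expr) != toy::ok)                         \
        {                                             \
            toy::failed_tests().push_back(test_name); \
            return;                                   \
        }                                             \
    }                                                 \
    while(0)

// Test A hits a genuine error; the check fails the test but the process keeps running.
inline void test_a()
{
    SOFT_CHECK("test_a", toy::failing_call());
}

// Test B does nothing wrong, but still observes the error left behind by test A.
inline void test_b()
{
    SOFT_CHECK("test_b", toy::successful_call());
    SOFT_CHECK("test_b", toy::get_last_error());
}

// Runs both tests in order and returns the names of the tests that failed.
inline std::vector<std::string> run_toy_suite()
{
    toy::failed_tests().clear();
    toy::get_last_error(); // start from a clean state
    test_a();
    test_b();
    return toy::failed_tests();
}
```

In this model both tests end up reported as failed, although only `test_a` made a failing call.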

Technical Details

This change replaces the simplified HIP_CHECK definitions with a version that exits on failure.
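A check that exits on failure could look roughly like the following. This is a minimal sketch, not the macro added by this PR: `demo_error`, `demo_error_name`, `DEMO_HIP_CHECK`, and `count_completed_calls` are hypothetical names chosen so the example builds without the HIP headers. With HIP, the checked type would be `hipError_t` and the success value `hipSuccess`.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-in for hipError_t so the sketch builds without HIP.
enum demo_error { demo_success = 0, demo_invalid_value = 1 };

// Hypothetical helper that turns an error code into readable text.
inline const char* demo_error_name(demo_error e)
{
    return e == demo_success ? "success" : "invalid value";
}

// Evaluate the expression once; on failure print where it happened and
// terminate the process, so no later test runs with a stale error state.
#define DEMO_HIP_CHECK(expr)                                  \
    do                                                        \
    {                                                         \
        const demo_error demo_check_result_ = (expr);         \
        if(demo_check_result_ != demo_success)                \
        {                                                     \
            std::fprintf(stderr,                              \
                         "error: %s at %s:%d\n",              \
                         demo_error_name(demo_check_result_), \
                         __FILE__,                            \
                         __LINE__);                           \
            std::exit(EXIT_FAILURE);                          \
        }                                                     \
    }                                                         \
    while(0)

// Counts how many checked calls completed; used only to show control flow.
inline int count_completed_calls(demo_error first, demo_error second)
{
    int completed = 0;
    DEMO_HIP_CHECK(first);
    ++completed;
    DEMO_HIP_CHECK(second);
    ++completed;
    return completed;
}
```

If either argument is a failure value, the process terminates at that check and nothing after it runs. That is the property the motivation above relies on.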

hipRAND was also relying on the simplified HIP_CHECK to check both calls that return a hipError_t and hiprandStatus_t. With the new version of HIP_CHECK, this no longer works. This change adds a separate HIPRAND_CHECK macro to handle checks of type hiprandStatus_t.
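One plausible reason the two result types need separate macros is that C++ does not implicitly convert one enumeration type to another, so a macro that stores its argument in a `hipError_t` variable cannot accept a `hiprandStatus_t`. The sketch below illustrates that with stand-ins. `demo_hip_error`, `demo_rand_status`, `DEMO_HIPRAND_CHECK`, `demo_generate`, and `demo_checked_generate` are hypothetical, and the macro merged in this PR may differ.

```cpp
#include <cstdio>
#include <cstdlib>
#include <type_traits>

// Hypothetical stand-ins for hipError_t and hiprandStatus_t.
enum demo_hip_error { demo_hip_success = 0, demo_hip_failure = 1 };
enum demo_rand_status { demo_rand_success = 0, demo_rand_not_initialized = 1 };

// Distinct enum types do not convert to each other implicitly, so a status
// value cannot initialize a variable of the error type.
static_assert(!std::is_convertible<demo_rand_status, demo_hip_error>::value,
              "a status value cannot be stored in an error variable");

// Exit-on-failure check dedicated to the status type.
#define DEMO_HIPRAND_CHECK(expr)                             \
    do                                                       \
    {                                                        \
        const demo_rand_status demo_rand_result_ = (expr);   \
        if(demo_rand_result_ != demo_rand_success)           \
        {                                                    \
            std::fprintf(stderr,                             \
                         "status error %d at %s:%d\n",       \
                         static_cast<int>(demo_rand_result_), \
                         __FILE__,                           \
                         __LINE__);                          \
            std::exit(EXIT_FAILURE);                         \
        }                                                    \
    }                                                        \
    while(0)

// Hypothetical library call that reports its result as a status value.
inline demo_rand_status demo_generate(bool initialized)
{
    return initialized ? demo_rand_success : demo_rand_not_initialized;
}

// Returns true only if the checked call succeeded; a failure exits instead.
inline bool demo_checked_generate()
{
    DEMO_HIPRAND_CHECK(demo_generate(true));
    return true;
}
```

Keeping one macro per result type also lets each failure message name the API that produced the error.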

Test Plan

Built and ran all tests for rocRAND and hipRAND in order to verify that they work correctly.

Test Result

All tests built and passed successfully.

Submission Checklist

The version numbers for rocRAND and hipRAND have also been updated.
umfranzw force-pushed the users/umfranzw/update_error_handling branch from f9b5afa to 51a0f47 on August 20, 2025 16:43
umfranzw requested a review from a team as a code owner on August 20, 2025 16:43
umfranzw marked this pull request as draft on August 27, 2025 19:05
umfranzw (Contributor, Author) commented:

Converting this to a draft while I look into the reported CI issues.

Calling HIP_CHECK(error) won't work, since HIP_CHECK declares a variable called error.
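The collision comes from how such a macro expands. The sketch below assumes, as the note above states, a check macro whose body declares a local named `error`; `shadow_error`, `SHADOWING_CHECK`, `SAFER_CHECK`, and the two helper functions are hypothetical names. It shows two ways around the problem: renaming the variable at the call site, or giving the macro's local a name that is unlikely to collide.

```cpp
#include <cstdlib>

// Hypothetical stand-in for hipError_t.
enum shadow_error { shadow_success = 0, shadow_failure = 1 };

// A check macro whose body declares a local named `error`.
// SHADOWING_CHECK(error) would expand to `shadow_error error = (error);`,
// so the inner variable is initialized from itself, not from the caller's value.
#define SHADOWING_CHECK(expr)        \
    do                               \
    {                                \
        shadow_error error = (expr); \
        if(error != shadow_success)  \
        {                            \
            std::exit(EXIT_FAILURE); \
        }                            \
    }                                \
    while(0)

// Workaround at the call site: keep the result in a variable with another name.
inline bool check_with_renamed_variable()
{
    const shadow_error status = shadow_success;
    SHADOWING_CHECK(status);
    return true;
}

// Alternative: name the macro's local so that a caller's `error` cannot collide.
#define SAFER_CHECK(expr)                              \
    do                                                 \
    {                                                  \
        const shadow_error safer_check_result_ = (expr); \
        if(safer_check_result_ != shadow_success)      \
        {                                              \
            std::exit(EXIT_FAILURE);                   \
        }                                              \
    }                                                  \
    while(0)

// With the safer macro, passing a variable called `error` behaves as expected.
inline bool check_with_safer_macro()
{
    const shadow_error error = shadow_success;
    SAFER_CHECK(error);
    return true;
}
```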
umfranzw marked this pull request as ready for review on August 27, 2025 19:59
stanleytsang-amd merged commit 1cb48da into develop on Aug 28, 2025
25 of 27 checks passed
stanleytsang-amd deleted the users/umfranzw/update_error_handling branch on August 28, 2025 01:57
assistant-librarian (bot) pushed a commit to ROCm/hipRAND that referenced this pull request on Aug 28, 2025
[rocRAND][hipRAND] Update error handling in rocRAND/hipRAND (#1227)

## Motivation

In some files, rocRAND and hipRAND use a simplified definition of
`HIP_CHECK` that calls GTest's `ASSERT_EQ` to compare the value it's
passed with `hipSuccess`, and does not exit on error (just fails the
test).

As of ROCm 7.0, `hipGetLastError`'s behaviour has changed such that the
internal error state is not cleared with every HIP API call (only on
calls to `hipGetLastError`). If we do not exit when `HIP_CHECK` fails,
then the internal error state will persist, and the error will be caught
the next time `hipGetLastError` is called in a subsequent test.

## Technical Details

This change replaces the simplified `HIP_CHECK` definitions with a
version that exits on failure.

hipRAND was also relying on the simplified `HIP_CHECK` to check both
calls that return a `hipError_t` and `hiprandStatus_t`. With the new
version of `HIP_CHECK`, this no longer works. This change adds a
separate `HIPRAND_CHECK` macro to handle checks of type
`hiprandStatus_t`.

## Test Plan

Built and ran all tests for rocRAND and hipRAND in order to verify that
they work correctly.

## Test Result

All tests built and passed successfully.

## Submission Checklist

- [x] Look over the contributing guidelines at
https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
assistant-librarian (bot) pushed a commit to ROCm/rocRAND that referenced this pull request on Aug 28, 2025
[rocRAND][hipRAND] Update error handling in rocRAND/hipRAND (#1227)
