Enable architectures for libhipcxx (fixes #2504) #2946

Merged: obersteiner merged 1 commit into main from users/moberste/enable_archs_for_libhipcxx on Feb 24, 2026

Conversation

Contributor

@obersteiner obersteiner commented Jan 15, 2026

Motivation

So far libhipcxx only enables a very limited number of architectures. With this PR we want to widen this support.

Test Plan

Ideally we want to test all added architectures.

Test Logs

libhipcxx passes on all Linux architectures that we could test via TheRock:

https://github.com/ROCm/TheRock/actions/runs/22229769520?pr=2946

Submission Checklist

@obersteiner obersteiner force-pushed the users/moberste/enable_archs_for_libhipcxx branch from fdafc59 to 7b58281 Compare January 15, 2026 14:55
@obersteiner obersteiner added the gfx103X-linux, gfx110X-dgpu, gfx1151, gfx120X-all, gfx103X, gfx90X-dcgpu, gfx1103, gfx1150, gfx1153, gfx906, gfx908, and gfx90a labels Jan 15, 2026
@obersteiner obersteiner force-pushed the users/moberste/enable_archs_for_libhipcxx branch from 47ffdb5 to 8609a4a Compare January 19, 2026 09:51
@obersteiner obersteiner force-pushed the users/moberste/enable_archs_for_libhipcxx branch 2 times, most recently from 9f02139 to 38a6728 Compare February 2, 2026 10:40
@obersteiner obersteiner force-pushed the users/moberste/enable_archs_for_libhipcxx branch 2 times, most recently from 1a60720 to 268ad54 Compare February 6, 2026 13:24
@obersteiner obersteiner marked this pull request as ready for review February 16, 2026 14:09
@obersteiner obersteiner requested a review from geomin12 February 16, 2026 14:09
Contributor Author

obersteiner commented Feb 16, 2026

Todo: We need to disable the changes in amdgpu_family_matrix.py before merging.

The failing gfx90a tests seem to be a node issue, as other packages show similar failures (and we test on gfx90a in our dev CI, which passes all of our tests).

@obersteiner obersteiner changed the title Enable architectures for libhipcxx Enable architectures for libhipcxx (fixes #2504) Feb 16, 2026
marbre
marbre previously requested changes Feb 16, 2026
Member

These changes need to be reverted before the merge but are okay for testing.

hipSPARSELt # https://github.com/ROCm/TheRock/issues/2042
composable_kernel # https://github.com/ROCm/TheRock/issues/1245
rocWMMA # https://github.com/ROCm/TheRock/issues/1944
libhipcxx # https://github.com/ROCm/TheRock/issues/2504
Member

https://github.com/ROCm/TheRock/actions/runs/21752139280/job/63742765914?pr=2946#step:10:3072

  5: Failed Tests (25):
  5:   libhip++ :: cuda/float/bfloat_cos_comparison.pass.cpp
  5:   libhip++ :: cuda/float/bfloat_exp_comparison.pass.cpp
  5:   libhip++ :: cuda/float/bfloat_log_comparison.pass.cpp
  5:   libhip++ :: cuda/float/bfloat_sin_comparison.pass.cpp
  5:   libhip++ :: cuda/float/bfloat_sqrt_comparison.pass.cpp
  5:   libhip++ :: cuda/float/half_cos_comparison.pass.cpp
  5:   libhip++ :: cuda/float/half_exp_comparison.pass.cpp
  5:   libhip++ :: cuda/float/half_log_comparison.pass.cpp
  5:   libhip++ :: cuda/float/half_sin_comparison.pass.cpp
  5:   libhip++ :: cuda/float/half_sqrt_comparison.pass.cpp
  5:   libhip++ :: heterogeneous/atomic/atomic_cuda_float.pass.cpp
  5:   libhip++ :: heterogeneous/atomic/atomic_cuda_generic.pass.cpp
  5:   libhip++ :: heterogeneous/atomic/atomic_cuda_signed.pass.cpp
  5:   libhip++ :: heterogeneous/atomic/atomic_cuda_unsigned.pass.cpp
  5:   libhip++ :: heterogeneous/atomic/atomic_std_float.pass.cpp
  5:   libhip++ :: heterogeneous/atomic/atomic_std_generic.pass.cpp
  5:   libhip++ :: heterogeneous/atomic/atomic_std_signed.pass.cpp
  5:   libhip++ :: heterogeneous/atomic/atomic_std_unsigned.pass.cpp
  5:   libhip++ :: heterogeneous/atomic/flag.pass.cpp
  5:   libhip++ :: heterogeneous/atomic/reference_cuda.pass.cpp
  5:   libhip++ :: heterogeneous/atomic/reference_std.pass.cpp
  5:   libhip++ :: heterogeneous/optional.pass.cpp
  5:   libhip++ :: heterogeneous/pair.pass.cpp
  5:   libhip++ :: heterogeneous/tuple.pass.cpp
  5:   libhip++ :: heterogeneous/variant.pass.cpp
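
For triage, the failures listed above can be grouped by test directory to see which parts of the suite are affected. The following is a minimal sketch, assuming the log format quoted here (a numeric prefix such as `5:` followed by `libhip++ :: <path>`); `group_failures` is a hypothetical helper for illustration and is not part of TheRock or libhipcxx.

```python
import re
from collections import Counter

# Matches lines such as "  5:   libhip++ :: cuda/float/half_cos_comparison.pass.cpp".
# Assumption: the leading "5:" is a per-test prefix added by the test driver.
FAILED_TEST = re.compile(r"^\s*\d+:\s+libhip\+\+ :: (?P<path>\S+)\s*$")


def group_failures(log_text: str) -> Counter:
    """Count failed tests per directory in a pasted log excerpt."""
    counts = Counter()
    for line in log_text.splitlines():
        match = FAILED_TEST.match(line)
        if match:
            # Everything before the last "/" is the test directory.
            directory = match.group("path").rpartition("/")[0]
            counts[directory] += 1
    return counts


if __name__ == "__main__":
    # A few lines copied from the log above; paste the full list to summarize it.
    sample = """\
  5: Failed Tests (25):
  5:   libhip++ :: cuda/float/bfloat_cos_comparison.pass.cpp
  5:   libhip++ :: heterogeneous/atomic/flag.pass.cpp
  5:   libhip++ :: heterogeneous/atomic/reference_std.pass.cpp
  5:   libhip++ :: heterogeneous/pair.pass.cpp
"""
    for directory, count in sorted(group_failures(sample).items()):
        print(f"{directory}: {count}")
```

Counting the 25 entries quoted above by hand gives the same breakdown: 10 under `cuda/float`, 11 under `heterogeneous/atomic`, and 4 directly under `heterogeneous`.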

Member

@geomin12 @amd-justchen can you help look into the potential node issue? I think we should wait for a green signal and not rely fully on the local testing done by @obersteiner and team. We should make sure the tests have passed at least once before merging :)

Contributor Author

@obersteiner obersteiner Feb 17, 2026

@geomin12, @amd-justchen, @marbre: In the logs I see errors such as out-of-memory exceptions, which I also see for other packages, for example rocThrust. Our tests do not use much memory (we have no large numeric tests, only very small ones). An example of the rocThrust error: https://github.com/ROCm/TheRock/actions/runs/21752139280/job/63742765864?pr=2946#step:10:19181

Contributor Author

Those tests are passing now.

hipSPARSELt # https://github.com/ROCm/TheRock/issues/2042
composable_kernel # https://github.com/ROCm/TheRock/issues/1245
rocWMMA # https://github.com/ROCm/TheRock/issues/1944
libhipcxx # https://github.com/ROCm/TheRock/issues/2504
Member

There are test failures on gfx90a, which has been tested so far; see below. I assume these should be fixed before allowing runs on other gfx90X architectures?

Contributor

@geomin12 geomin12 left a comment

the gfx90X failures are being resolved, noted in #3506

# Label is linux-gfx110X-gpu-rocm, fetch-gfx-targets should be ["gfx1100"]
"test-runs-on": "",
"family": "gfx110X-all",
"fetch-gfx-targets": [],
Contributor

i'm glad the tests work :)

but let's please revert this back to original state, we lack the machines to run all tests

i would post the test logs in the PR description as proof of working, then potentially add skip-ci label as this is already proven working and no need to take CI resources

Contributor Author

I have posted the link to the old test run in the description, dropped the commit that changed the file, and added the skip-ci label.

@obersteiner obersteiner force-pushed the users/moberste/enable_archs_for_libhipcxx branch from aaff0fb to dbae602 Compare February 23, 2026 16:48
@obersteiner obersteiner added the ci:skip label Feb 23, 2026
@obersteiner obersteiner requested a review from geomin12 February 23, 2026 16:51
Contributor

@geomin12 geomin12 left a comment

lgtm, thanks for this update

@geomin12 geomin12 removed the gfx103X-linux, gfx110X-dgpu, gfx1151, gfx120X-all, gfx103X, gfx90X-dcgpu, gfx1103, gfx1150, gfx1153, gfx906, gfx908, gfx90a, and gfx1152 labels Feb 23, 2026
@geomin12
Contributor

I'll need to fix the skip-ci label; it looks like it isn't working properly with other labels. Will fix this in a follow-up PR

I removed the labels, so skip-ci will work

@geomin12 geomin12 added and removed the ci:skip label Feb 23, 2026
Member

@marbre marbre left a comment

All currently linked checks seem to have been canceled (or CI runs might still be queued due to the force push)? Code-wise this looks good, but I don't have any signal from CI for your changes. Therefore I'm leaving it to @geomin12 and you to judge.

@marbre marbre dismissed their stale review February 23, 2026 17:37

Dismissing as outlined in the last review comment.

@obersteiner obersteiner merged commit f8fbe17 into main Feb 24, 2026
165 of 198 checks passed
@obersteiner obersteiner deleted the users/moberste/enable_archs_for_libhipcxx branch February 24, 2026 14:07
@github-project-automation github-project-automation Bot moved this from TODO to Done in TheRock Triage Feb 24, 2026
Labels

ci:skip Skip all CI builds/tests for this PR

Projects

Status: Done

5 participants