Realm: Slow data transposes on GPU using HIP #1789

Open
Tracked by #1032
mariodirenzo opened this issue Nov 8, 2024 · 0 comments

As mentioned in the Legion meeting on 11/06/2024, we observe very slow data copies on AMD GPUs when running HTR++ on Tioga.
Two profile logs, produced with Legion version 023d0c3 on Lassen (prof_lassen.log) and on Tioga (prof_tioga.log), are attached.
The profiled configuration uses only one GPU on one node and makes several copies of the same data, each changing its layout. The logs show that the copies on Tioga are about 10x slower than those on Lassen. As discussed in the Legion meeting, this is most likely due to the absence of the new DMA.
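
For readers unfamiliar with the term, here is a minimal CPU-side sketch of what a layout-changing copy (a transpose) does. It is only an illustration: the function name `transpose_copy` and the row-major/column-major layouts are assumptions chosen for the example, and it does not reproduce Realm's DMA path or the actual HTR++ layouts.

```python
def transpose_copy(src, rows, cols):
    """Copy a row-major rows x cols buffer into a new column-major buffer.

    Hypothetical helper for illustration only; not part of Legion/Realm.
    """
    dst = [None] * (rows * cols)
    for i in range(rows):
        for j in range(cols):
            # Element (i, j) sits at offset i*cols + j in the row-major source
            # and at offset j*rows + i in the column-major destination.
            dst[j * rows + i] = src[i * cols + j]
    return dst


src = list(range(6))  # 2 x 3 field stored row-major: [[0, 1, 2], [3, 4, 5]]
dst = transpose_copy(src, rows=2, cols=3)
print(dst)  # [0, 3, 1, 4, 2, 5]
```

Because source and destination are traversed with different strides, such copies cannot be issued as one contiguous transfer, which is why their speed depends heavily on how the copy engine handles strided access.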

For @seemamirch, the input file needed to reproduce these logs is base.json and the logs are produced on both systems by launching the code with PROFILE=1 $HTR_DIR/prometeo.sh -i base.json -o .

@elliottslaughter, can you please add this issue to #1032?
