
Cherry-pick hipblaslt changes for rocm 7.0 #675

Merged
vamovsik merged 6 commits into release/rocm-rel-7.0 from users/alexbrown/release/rocm-rel-7.0-cp
Jul 22, 2025

@AlexBrownAMD
Contributor

hipBLASLt changes required for ROCm 7.0:

mahmoodw and others added 6 commits July 15, 2025 11:15
This includes 2 changes:
- Removed the requirement that the temp SGPRs needed for GSU be contiguous,
avoiding overflow for certain kernels
- Account for the additional temp SGPRs that code gen will require,
up to physical limits
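
For readers unfamiliar with why dropping the contiguity requirement avoids overflow, here is a minimal toy sketch. It is not the hipBLASLt/Tensile code: the free-list representation, the function names `alloc_contiguous`, `alloc_scattered`, and `fits_budget`, and the `limit` parameter are all hypothetical and only illustrate the two ideas in the list above.

```python
from typing import List, Optional


def alloc_contiguous(free: List[bool], count: int) -> Optional[List[int]]:
    """Return `count` consecutive free register indices, or None if no such run exists."""
    if count <= 0:
        return []
    run = 0
    for i, is_free in enumerate(free):
        run = run + 1 if is_free else 0
        if run == count:
            return list(range(i - count + 1, i + 1))
    return None


def alloc_scattered(free: List[bool], count: int) -> Optional[List[int]]:
    """Return any `count` free register indices (lowest first), or None if too few are free."""
    picks = [i for i, is_free in enumerate(free) if is_free][:count]
    return picks if len(picks) == count else None


def fits_budget(used: int, extra_temps: int, limit: int) -> bool:
    """Check that registers already used plus extra temps stay within `limit`.

    `limit` stands in for a physical register limit; its value is an
    assumption supplied by the caller, not a hardware constant from this PR.
    """
    return used + extra_temps <= limit
```

With a fragmented pool such as `[True, False, True, False, True, True, False, True]`, a request for three temps fails under `alloc_contiguous` (the longest free run is two) but succeeds under `alloc_scattered`, which is the kind of overflow the first change is described as avoiding. `fits_budget` mirrors the second change: the extra temps are counted up front and checked against a caller-supplied limit.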

---
🔁 Imported from
[ROCm/hipBLASLt#2184](ROCm/hipBLASLt#2184) 🧑‍💻
Originally authored by @mahmoodw

---------

Co-authored-by: assistant-librarian[bot] <210906412+assistant-librarian[bot]@users.noreply.github.com>
Co-authored-by: mahmoodw <wmahmood@amd.com>
There were more SGPR allocations than what is needed for . This PR
addresses that while keeping backward compatibility.

hipblaslt-test on gfx950:
[----------] Global test environment tear-down
[==========] 19937 tests from 12 test suites ran. (928859 ms total)
[  PASSED  ] 19937 tests.
hipBLASLt version: 100100
hipBLASLt git version: 36ab695-dirty
command line: ./hipblaslt-test


hipblaslt-test on gfx942:
[----------] Global test environment tear-down
[==========] 54591 tests from 11 test suites ran. (3033316 ms total)
[  PASSED  ] 54591 tests.
hipBLASLt version: 100100
hipBLASLt git version: 36ab695
command line: ./hipblaslt-test


---
🔁 Imported from
[ROCm/hipBLASLt#2250](ROCm/hipBLASLt#2250)
🧑‍💻 Originally authored by @aliry95amd

Co-authored-by: aliry95amd <ayazdani@amd.com>
Co-authored-by: assistant-librarian[bot] <assistant-librarian[bot]@users.noreply.github.com>
This PR fixes a bug that occurred when negative WGM values were used for
StreamK kernels.
- removes all major GEMMs from the GridBased folder
- adds major GEMMs for Origami libraries in gfx950
- adds fallback kernels for all GEMMs
- adds the latest custom kernels for BBS_TN/HHS_TN
- some modifications to file names, kernel names, bias datatypes

---------

Co-authored-by: aliry95amd <ayazdani@amd.com>
Co-authored-by: b-shi <brianshi@amd.com>
Clone of the following PRs from develop.

#493
#489
#491

---------

Co-authored-by: Alex Brown <alex.brown@amd.com>
Add custom HHS/BBS TN kernels to equality library

Cherry-picked from develop. Original PRs:
ROCm/hipBLASLt#2193
ROCm/hipBLASLt#2211
@vamovsik vamovsik merged commit 75e444d into release/rocm-rel-7.0 Jul 22, 2025
6 checks passed
@vamovsik vamovsik deleted the users/alexbrown/release/rocm-rel-7.0-cp branch July 22, 2025 13:44
rahulc-gh added a commit to ROCm/hipBLASLt that referenced this pull request Jul 22, 2025

8 participants