[rocblas][tensile] Use kernel ISA during build with enforcement #2162
Merged
bstefanuk merged 10 commits into ROCm:develop from bstefanuk:bug/tensile-build-with-no-enumerate2 on Nov 6, 2025
Codecov Report

✅ All modified and coverable lines are covered by tests.

Additional details and impacted files

Coverage Diff | develop | #2162 | +/-
-- | -- | -- | --
Coverage | 88.75% | 67.19% | -21.55%
Files | 301 | 362 | +61
Lines | 25607 | 50705 | +25098
Branches | 0 | 5708 | +5708
Hits | 22725 | 34069 | +11344
Misses | 2882 | 13052 | +10170
Partials | 0 | 3584 | +3584
yoichiyoshida approved these changes on Oct 20, 2025
…efanuk/rocm-libraries into bug/tensile-build-with-no-enumerate2
bstefanuk commented on Oct 29, 2025
bstefanuk commented on Oct 29, 2025
geomin12 reviewed on Oct 31, 2025
amcamd approved these changes on Nov 5, 2025
Contributor, Author
Testing failure assessment:
Using gardener override to merge.
assistant-librarian bot pushed a commit to ROCm/Tensile that referenced this pull request on Nov 6, 2025

[rocblas][tensile] Use kernel ISA during build with enforcement (#2162)

## Motivation

Performance regressions are found when adding --no-enumerate to the TensileCreateLibrary build. This PR re-implements the kernel ISA reliance from #2094 without needing to change logic files.

## Technical Details

- Rely only on the kernel's ISA during the build phase.
- Add additional ISA enforcement given architecture details extracted from logic files.

## Test Plan

- Local performance testing for specific sizes
- Comprehensive performance testing through gemmaiperf
- Standard CI testing

## Test Result

- See CI results in this PR for standard pipeline checks.
- Performance: tested on 6665 sizes using `rocblas-bench` on gfx950 (results below)
- Performance: select sizes were evaluated on gfx942 and confirmed no performance change beyond +/-1%

`Single precision NN`

Stat | Result
-- | --
Average (% speed up) | 0.50
Median (% speed up) | 0.01
Count Faster | 3482
Count Slower | 3161

`Single precision TN`

Stat | Result
-- | --
Average (% speed up) | 4.17
Median (% speed up) | -0.02
Count Faster | 3042
Count Slower | 3579

`Complex double precision TN`

Stat | Result
-- | --
Average (% speed up) | 0.18
Median (% speed up) | 0.04
Count Faster | 4452
Count Slower | 2207

## Submission Checklist

- [x] Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
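The summary statistics in the Test Result tables (average, median, count faster, count slower) can be reproduced from paired benchmark measurements. The following is a minimal sketch, not the script used for this PR: it assumes per-size throughput values (for example GFLOPS) have already been collected for the baseline and the candidate build, and it assumes "% speed up" means the relative throughput change. The function name `summarize_speedup` and the result keys are hypothetical.

```python
from statistics import mean, median


def summarize_speedup(baseline, candidate):
    """Summarize per-size speedup of `candidate` over `baseline`.

    Both arguments are equal-length sequences of throughput values
    (higher is better), one entry per problem size.
    ASSUMPTION: "% speed up" is (candidate - baseline) / baseline * 100.
    """
    if len(baseline) != len(candidate):
        raise ValueError("baseline and candidate must have the same length")
    if not baseline:
        raise ValueError("at least one measurement is required")

    pct = [(c - b) / b * 100.0 for b, c in zip(baseline, candidate)]
    return {
        "average_pct": mean(pct),
        "median_pct": median(pct),
        # Sizes with exactly zero change count as neither faster nor slower.
        "count_faster": sum(1 for p in pct if p > 0),
        "count_slower": sum(1 for p in pct if p < 0),
    }
```

Reporting the median next to the average is useful here because a few large outliers can move the average while the median stays near zero, as in the `Single precision TN` table.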
This was referenced Nov 7, 2025
bstefanuk added a commit that referenced this pull request on Nov 7, 2025

## Motivation

The patch [0008-Revert-remove-options-no-enumerate-966.patch](https://github.com/ROCm/TheRock/blob/77e4a8304c0544a7ee5779fcabcee265a00f38ba/patches/amd-mainline/rocm-libraries/0008-Revert-remove-options-no-enumerate-966.patch) can be removed now that `--no-enumerate` is no longer needed in tensile (PR #2162). This PR allows the patch to be removed without breaking CI.

## Technical Details

Use `rm -f` to allow the pipeline to continue even if the file is missing.

## Test Plan

Low risk, standard CI testing is sufficient.

## Submission Checklist

- [x] Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
bsyrowik added a commit to ROCm/TheRock that referenced this pull request on Nov 7, 2025

## Motivation

Pick up rocWMMA changes for compatibility with TheRock build.

## Technical Details

Deleted a patch that is no longer required due to: ROCm/rocm-libraries#2162

## Test Plan

## Test Result

## Submission Checklist

- [x] Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
rponnuru5 pushed a commit to ROCm/TheRock that referenced this pull request on Nov 7, 2025

## Motivation

Pick up rocWMMA changes for compatibility with TheRock build.

## Technical Details

Deleted a patch that is no longer required due to: ROCm/rocm-libraries#2162

## Test Plan

## Test Result

## Submission Checklist

- [x] Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
bstefanuk added a commit that referenced this pull request on Nov 10, 2025

## Motivation

The `gfx1103` logic files used an incorrect reference to `navi33`, which has been exposed due to the recent inclusion of new logic file consistency checks in #2162.

## Technical Details

Update the schedule name in the rocblas logic files to properly map to `gfx1103`, which is consistent with the architectureMap in Common.py.

## Test Plan

Local testing before and after this change shows the following outputs.

Command:

```
Tensile/bin/TensileCreateLibrary [...] --architecture=gfx1103 /path/to/Tensile/Logic/asm_full build_gfx1103 HIP
```

Result (before; develop)

```
ValueError: Architecture mismatch: gfx1103 does not match navi33. Review the library logic file
```

Result (after)

```
# Reading logic files: 32 thread(s), 144 tasks .............................. 100.0% (took 4.0 secs)
# Generating kernels: 32 thread(s), 2690 tasks .............................. 100.0% (took 23.1 secs)
# Compiling source kernels .............................. 100.0% (took 0.0 secs)
```

## Submission Checklist

- [x] Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
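The "Architecture mismatch" error above comes from a consistency check between the requested architecture and the schedule name recorded in a logic file. The following is a rough sketch of what such a check could look like, not the code from #2162. The function name `enforce_architecture` and the map `ILLUSTRATIVE_ARCH_MAP` are hypothetical, and the map entries are placeholders rather than a copy of `architectureMap` in Common.py. Only the error message text is taken from the output shown above.

```python
# ASSUMPTION: a small stand-in for Tensile's architectureMap; the real
# mapping lives in Common.py and may contain different entries.
ILLUSTRATIVE_ARCH_MAP = {
    "gfx1102": "navi33",
    "gfx1103": "gfx1103",
}


def enforce_architecture(target_arch, schedule_name, arch_map):
    """Raise ValueError if a logic file's schedule name does not match
    the name expected for the requested target architecture.

    Hypothetical helper for illustration only.
    """
    expected = arch_map.get(target_arch)
    if expected is None:
        raise ValueError(f"Unknown architecture: {target_arch}")
    if schedule_name != expected:
        # Message format mirrors the error reported in the test plan.
        raise ValueError(
            f"Architecture mismatch: {target_arch} does not match "
            f"{schedule_name}. Review the library logic file"
        )
```

With this sketch, `enforce_architecture("gfx1103", "navi33", ILLUSTRATIVE_ARCH_MAP)` raises a `ValueError`, while passing `"gfx1103"` as the schedule name returns without error. Failing early at logic-file load time keeps a mislabeled file from silently contributing kernels for the wrong target.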