Changing sgpr limits #2184
Closed
mahmoodw wants to merge 2 commits into
Conversation
KKyang
previously approved these changes
Jun 5, 2025
AlexBrownAMD
previously approved these changes
Jun 5, 2025
Contributor
Author
/AzurePipelines run
Azure Pipelines successfully started running 1 pipeline(s).
AlexBrownAMD
approved these changes
Jun 18, 2025
# Use VGPR up to next occupancy threshold:
maxVgprs, occupancy = self.getMaxRegsForOccupancy(kernel["NumThreads"], self.vgprPool.size(), self.sgprPool.size(), \
# Account for additional temp sgprs that will be required for code gen, up to physical limits. +5 approximates upper end of required temp space for GSU sync
requiredSgprs = min(self.sgprPool.size() + 5, self.states.regCaps["MaxSgpr"])
Collaborator
"+5" should not be required for all kernels.
Contributor
Author
That's right. This is meant to be a temporary fix to unblock the perf teams. A debt ticket will be filed to collect the temp sgpr high-water mark more accurately. Any suggestions or insights are appreciated.
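For reference, here is a minimal sketch of what measuring the high-water mark could look like, as opposed to adding a fixed `+5`. It is not Tensile code: `SgprPoolSketch`, `checkout_temp`, `checkin_temp`, and `required_sgprs` are hypothetical names, and the limit of 102 is only an illustrative stand-in for `regCaps["MaxSgpr"]`.

```python
class SgprPoolSketch:
    """Toy stand-in for a register pool that records peak temp usage (hypothetical)."""

    def __init__(self, size):
        self.size = size            # SGPRs already reserved by the kernel body
        self.temps_in_use = 0       # temp SGPRs currently checked out
        self.temp_high_water = 0    # largest number of temps live at once

    def checkout_temp(self, count):
        # Reserve temps and remember the peak seen so far.
        self.temps_in_use += count
        self.temp_high_water = max(self.temp_high_water, self.temps_in_use)

    def checkin_temp(self, count):
        # Release temps; the recorded peak is left unchanged.
        self.temps_in_use -= count


def required_sgprs(pool_size, temp_headroom, max_sgpr):
    """Pool size plus temp headroom, clamped to the physical SGPR limit."""
    return min(pool_size + temp_headroom, max_sgpr)


MAX_SGPR = 102  # illustrative value only, not read from regCaps

pool = SgprPoolSketch(size=96)
pool.checkout_temp(2)
pool.checkout_temp(3)   # peak: 5 temps live at once
pool.checkin_temp(3)
pool.checkout_temp(1)

fixed = required_sgprs(pool.size, 5, MAX_SGPR)                        # the "+5" approach
measured = required_sgprs(pool.size, pool.temp_high_water, MAX_SGPR)  # measured peak
clamped = required_sgprs(100, 5, MAX_SGPR)                            # hits the cap
```

With a measured peak, kernels that never need the GSU sync temps would not pay for the extra headroom, which is the concern raised above.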
Contributor
Closing the pull request in this repo. Please refer to the migrated pull request for updates.
mahmoodw
added a commit
to ROCm/rocm-libraries
that referenced
this pull request
Jul 3, 2025
In regards to LWPTENSILE-1696

This includes 2 changes:
- Unrestricted the temp sgprs needed for gsu from being contiguous, avoiding overflow for certain kernels
- Account for additional temp sgprs that will be required for code gen, up to physical limits

---
🔁 Imported from [ROCm/hipBLASLt#2184](ROCm/hipBLASLt#2184)
🧑‍💻 Originally authored by @mahmoodw

---------

Co-authored-by: mahmoodw <wmahmood@amd.com>
Co-authored-by: mahmoodw <44450175+mahmoodw@users.noreply.github.com>
AlexBrownAMD
pushed a commit
to ROCm/rocm-libraries
that referenced
this pull request
Jul 9, 2025
mahmoodw
added a commit
to ROCm/rocm-libraries
that referenced
this pull request
Jul 10, 2025
This includes 2 changes:
- Unrestricted the temp sgprs needed for gsu from being contiguous, avoiding overflow for certain kernels
- Account for additional temp sgprs that will be required for code gen, up to physical limits

---
🔁 Imported from [ROCm/hipBLASLt#2184](ROCm/hipBLASLt#2184)
🧑‍💻 Originally authored by @mahmoodw

---------

Co-authored-by: assistant-librarian[bot] <210906412+assistant-librarian[bot]@users.noreply.github.com>
Co-authored-by: mahmoodw <wmahmood@amd.com>
AlexBrownAMD
pushed a commit
to ROCm/rocm-libraries
that referenced
this pull request
Jul 15, 2025
SathiyarajRam
pushed a commit
to ROCm/rocm-libraries
that referenced
this pull request
Jul 15, 2025
In regards to LWPTENSILE-1696
This includes 2 changes: