Revert "remove options --no-enumerate (#966)" #1558

marbre · 2025-09-11T23:54:38Z

This reverts commit 68a380c.

The root cause makes no sense and the lack of any reproduction information makes it just lore that we will carry forward indefinitely (and if true, could be hiding a serious problem).

This reverts commit 68a380c. #966 (comment)

TorreZuk · 2025-09-12T15:38:07Z

Don't reintroduce this now unless you first run some perf regression checks. This was recently reverted, so the ticket and @NaveenElumalaiAMD can be consulted.

stellaraccident · 2025-09-12T15:52:21Z

We will revert in three business days as the justification makes no sense, and we can't be carrying things like this indefinitely that lack a root cause or any theory of why there is an impact. If the project maintainers disagree, then I would suggest that more analysis is done and/or some automated test put in place which verifies the expected behavior. If there are details in an internal ticket, please include them on the public record if being used as a justification.

TorreZuk · 2025-09-12T23:20:08Z

It looks like this revert of the revert would again introduce serious performance drops. Naveen had reproduced our internal QA team's regression analysis before his revert in #966. Details are listed in internal regression ticket SWDEV-546097, but I have just reproduced a similar 12% drop on a larger GEMM on MI250X using this PR's change:
./rocblas-bench -f gemm_ex -r h -m 7744 -n 7744 -k 7744 --lda 7744 --ldb 7744 --ldc 7744 --ldd 7744 --compute_type s --transposeB T
I see a similar drop in double precision; many GEMMs are listed.
You can read the ticket, but for the community reader: performance drops were reported for MI200, MI300 and MI300X, many around a 25% GFLOPs drop, on larger (~2k+) GEMMs. I will continue to review the Tensile project history on Monday to try to analyze where things went wrong with this option. The original regression happened after the 7.0 branch and was fixed before any point release.

stellaraccident · 2025-09-12T23:28:18Z

Thank you - we really need to root cause this situation. None of the devs can see a rational reason for such an action-at-a-distance impact, and it could be a serious/nuanced issue.

stellaraccident · 2025-09-16T19:57:15Z

Given that @TorreZuk has reproduced the performance drop, we need to hold and focus on root cause.

The reason I'm picking on this: we build on CI systems that don't have GPUs and there would seem to be no link possible between this flag and performance. If there is, that would be troubling indeed.

We have to root cause what the connection is here, not just so this revert can proceed but to ensure that we aren't building software in an already compromised state.

TorreZuk

With a few changes I can avoid the regressions this causes, but I am still trying to analyze the design flaws rather than just allowing this to proceed. Build-and-bench was the original design, so it looks like it crept into places where it shouldn't have, with a default ISA even outside benchmarking. Hopefully by tomorrow I can PR changes for review.

TorreZuk · 2025-09-16T21:19:58Z

shared/tensile/Tensile/cmake/TensileConfig.cmake

+  # We do not need to do device enumeration at library build time.
+  set(Options ${Options} "--no-enumerate")
+


This doesn't follow the code pattern used for the other options, so it is probably better to wrap it in a control option, e.g.
if (NOT Tensile_ENUMERATE).
"We do not..." is too ambiguous; state your use case, which I presume is build-only on a possibly CPU-only node. This function may be used by other community members with the build-and-benchmark pattern.
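
As a rough sketch of the control-option pattern suggested here: the variable name `Tensile_ENUMERATE` is taken from this comment and is not confirmed to exist in TensileConfig.cmake, and the default of OFF is an assumption chosen to match the CPU-only library build described in this thread. The actual file may wire its options differently.

```cmake
# Hypothetical control variable (name from the review comment, default assumed).
# ON  = older build-and-benchmark flow that enumerates devices.
# OFF = library build that may run on a CPU-only node.
option(Tensile_ENUMERATE "Enumerate devices while building the Tensile library" OFF)

if(NOT Tensile_ENUMERATE)
  # Library builds may run on CPU-only nodes with no AMD drivers or ROCm
  # installed, so skip device enumeration unless it is explicitly requested.
  set(Options ${Options} "--no-enumerate")
endif()
```

Under these assumptions, a build-and-benchmark user would configure with `-DTensile_ENUMERATE=ON` to keep the older behavior.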

stellaraccident · 2025-09-16

The library build path does build on a CPU-only node without any AMD software installed (drivers, ROCm or otherwise). If there are other use cases hitting this path, then it needs better isolation. It would seem to not just be "community" paths, though, since we failed something in one of our own flows.

TorreZuk · 2025-09-18

Yes, I just want the comment in the code improved and a CMake control variable that keeps the older default build for benchmarking. Working on revisions that will allow this PR to merge: #1636

TorreZuk

Work is still underway in Tensile to unblock this, but for now it can't go in as is.
We will probably still want a CMake variable to control this option, the same as all the others, for backward compatibility. I can push this commit to this PR when it is unblocked.

TorreZuk · 2025-11-06T22:57:27Z

This change to not enumerate was included in what @bstefanuk did in #2162, which is now merged, so I am closing this PR.

Revert "remove options --no-enumerate (#966)"

c42f096

This reverts commit 68a380c. #966 (comment)

marbre requested a review from stellaraccident September 11, 2025 23:54

marbre requested a review from a team as a code owner September 11, 2025 23:54

github-actions bot added the shared: tensile label Sep 11, 2025

stellaraccident approved these changes Sep 11, 2025

marbre requested review from bstefanuk and davidd-amd September 11, 2025 23:56

assistant-librarian bot added the organization: ROCm label Sep 12, 2025

TorreZuk requested review from NaveenElumalaiAMD and amcamd September 12, 2025 15:36

TorreZuk added the project: rocblas label Sep 12, 2025

TorreZuk self-assigned this Sep 12, 2025

bstefanuk approved these changes Sep 16, 2025

TorreZuk reviewed Sep 16, 2025

bstefanuk self-requested a review September 18, 2025 17:39

TorreZuk requested changes Oct 3, 2025

TorreZuk closed this Nov 6, 2025
