-
Notifications
You must be signed in to change notification settings - Fork 165
Revert "remove options --no-enumerate (#966)" #1558
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This reverts commit 68a380c. #966 (comment)
|
Don't reintroduce this now unless you first run some perf regression checks, this was recently reverted so ticket and @NaveenElumalaiAMD can be consulted. |
|
We will revert in three business days as the justification makes no sense, and we can't be carrying things like this indefinitely that lack a root cause or any theory of why there is an impact. If the project maintainers disagree, then I would suggest that more analysis is done and/or some automated test put in place which verifies the expected behavior. If there are details in an internal ticket, please include them on the public record if being used as a justification. |
|
It looks like this revert of the revert would again introduce serious performance drops. Naveen had the reproduced our internal QA team regression analysis before his revert in #966. Details are listed in internal regression ticket SWDEV-546097, but I have just reproduced a similar 12% drop on a larger gemm using MI250X using this PR change. |
Thank you - we really need to root cause this situation. None of the devs can see a rational reason for such an action at a distance impact, and it could be a serious/nuanced issue. |
|
Given that @TorreZuk has reproduced the performance drop, we need to hold and focus on root cause. The reason I'm picking on this: we build on CI systems that don't have GPUs and there would seem to be no link possible between this flag and performance. If there is, that would be troubling indeed. We have to root cause what the connection here is, not just for this to revert but to ensure that we aren't building software in an already compromised state. |
TorreZuk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With a few changes I can avoid the regressions this causes but I am still trying to analyze the design flaws rather than just allowing this to proceed. Build and bench was the original design so that looks like it crept into places where it shouldn't have with a default ISA even outside benchmarking. Hopefully by tomorrow I can PR changes for review
| # We do not need to do device enumeration at library build time. | ||
| set(Options ${Options} "--no-enumerate") | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't follow the other code pattern for options so probably better to wrap in a control option, e.g.
if (NOT Tensile_ENUMERATE).
"We do not..." is too ambiguous, state your use case which I presume is build only on possibly a CPU only node. This function may bey used by other community members with build and benchmark pattern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The library build path does build on a CPU-only node without any AMD software installed (drivers, ROCm or otherwise). If there are other use cases hitting this path, then it needs better isolation. It would seem to not just be "community" paths, though, since we failed something in one of our own flows.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes just want the comment in the code improved and cmake control var for older default build for benchmarking. Working on revisions that will allow this PR to merge #1636
TorreZuk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Work is still underway in Tensile to unblock this but for now can't go in as is.
Probably will still want a cmake variable to control this option the same as all the others for backward compatibility. I can push this commit to this PR when it is unblocked.
|
This change to not enumerate was included with what @bstefanuk did in #2162 that is now merged so closing this PR. |
This reverts commit 68a380c.
#966 (comment)