-
Notifications
You must be signed in to change notification settings - Fork 5.2k
Reduce TP for targets with more than 64 Registers Part 2 : Make operations on fixedRegs less expensive #113713
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
df9d60f to
a91ef0b
Compare
56dc88c to
134eea5
Compare
030b684 to
d2b16f9
Compare
d2b16f9 to
6db2ad5
Compare
97ea8dd to
477aa53
Compare
477aa53 to
dc1d20a
Compare
|
@BruceForstall @kunalspathak This PR is ready for review. |
|
odd superpmi-replay failure. trying again. /azp run runtime-coreclr superpmi-replay |
|
/azp run runtime-coreclr superpmi-replay |
|
Azure Pipelines successfully started running 1 pipeline(s). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Nice TP wins!
|
/ba-g failures are timeout |
Rationale for this change
The reason why
updateNextFixedRefis expensive is due to us repeatedly working onfixedRegswhich is aregMaskTP. This change splitsfixedRegsinto 2SingleTypeRegSetvariablesfixedRegsLowandfixedRegsHigh. This necessitates some changes to how we call the impacted methods. ImplementingupdateNextFixedRefas a templated method allows us to have 2 separate versions of this method without adding duplicate methodsChanges included here
fixedRegsis split into 2 separateSingleTypeRegSeton targets with more than 64 registersupdateNextFixedRefimplementation that works on the newly createdSingleTypeRegSetupdateNextFixedRefDispatchwhich calls the requiredupdateNextFixedRefbased on register number.Impact of this change
This change should have minimal to no impact on targets with less than 64 registers. Local testing done with additional eGPR enabled(to simulate more than 64 registers) show the following results
Overall (-0.35% to -0.12%)
MinOpts (-1.09% to -0.31%)
FullOpts (-0.32% to -0.12%)
Will help with reducing regression in 'procesKills()` - #112962