[VPlan] Remove ExtractLastLane for plans with scalar VFs. #171145
| Original file line number | Diff line number | Diff line change |
|---|---|---|
@@ -371,10 +371,9 @@ void UnrollState::unrollBlock(VPBlockBase *VPB) { | |
| continue; | ||
| } | ||
| if (match(&R, m_ExtractLastLaneOfLastPart(m_VPValue(Op0))) || | ||
| match(&R, m_ExtractPenultimateElement(m_VPValue(Op0)))) { | ||
| addUniformForAllParts(cast<VPSingleDefRecipe>(&R)); | ||
| if (Plan.hasScalarVFOnly()) { | ||
| if (Plan.hasScalarVFOnly()) { | ||
| if (match(&R, m_ExtractLastPart(m_VPValue(Op0))) || | ||
| match(&R, m_ExtractPenultimateElement(m_VPValue(Op0)))) { | ||
Comment on lines +375 to +376
Collaborator
Would it be simpler to handle each match separately?
Contributor
Author
I am not sure, the only difference is the index to extract from?
Collaborator
Right, but folding that difference seems to complicate two trivial cases, the first which should hold for any VF, scalar or not: and the second which deals with scalar VF only: ? Are both missing Can also hoist from below the
| auto *I = cast<VPInstruction>(&R); | ||
| bool IsPenultimatePart = | ||
| I->getOpcode() == VPInstruction::ExtractPenultimateElement; | ||
@@ -383,7 +382,10 @@ void UnrollState::unrollBlock(VPBlockBase *VPB) { | |
| I->replaceAllUsesWith(getValueForPart(Op0, PartIdx)); | ||
| continue; | ||
| } | ||
| // For vector VF, always extract from the last part. | ||
Collaborator
Worth retaining the comment, perhaps emphasizing that
Contributor
Author
Thanks, re-added in e6e3f94
| } | ||
| if (match(&R, m_ExtractLastLaneOfLastPart(m_VPValue(Op0))) || | ||
| match(&R, m_ExtractPenultimateElement(m_VPValue(Op0)))) { | ||
| addUniformForAllParts(cast<VPSingleDefRecipe>(&R)); | ||
| R.setOperand(0, getValueForPart(Op0, UF - 1)); | ||
| continue; | ||
| } | ||
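The split discussed above, a scalar-VF case that picks a part directly and a vector-VF case that always reads from the last part, can be sketched with a toy model. This is not LLVM code: `UnrolledValue`, `ExtractKind`, `extractScalarVF` and `extractVectorVF` are hypothetical names, and the `UF - 1` / `UF - 2` part choice for scalar VFs is an assumption, since the line computing `PartIdx` is not visible in the hunk.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for an unrolled value (hypothetical, not an LLVM class):
// Parts[p][l] is lane l of unrolled part p, so Parts.size() plays the role of UF.
using UnrolledValue = std::vector<std::vector<int>>;

enum class ExtractKind { Last, Penultimate };

// Scalar VF: every part holds exactly one lane, so "extracting" reduces to
// choosing a part (assumed here: UF - 1 for last, UF - 2 for penultimate).
int extractScalarVF(const UnrolledValue &Parts, ExtractKind Kind) {
  const std::size_t UF = Parts.size();
  const std::size_t Offset = Kind == ExtractKind::Last ? 1 : 2;
  assert(UF >= Offset && "not enough unrolled parts");
  assert(Parts[UF - Offset].size() == 1 && "scalar VF has one lane per part");
  return Parts[UF - Offset][0];
}

// Vector VF: always read from the last part (UF - 1), then pick the lane
// inside that part.
int extractVectorVF(const UnrolledValue &Parts, ExtractKind Kind) {
  const std::size_t UF = Parts.size();
  assert(UF >= 1 && "need at least one part");
  const std::vector<int> &LastPart = Parts[UF - 1];
  const std::size_t Offset = Kind == ExtractKind::Last ? 1 : 2;
  assert(LastPart.size() >= Offset && "not enough lanes in the last part");
  return LastPart[LastPart.size() - Offset];
}
```

In this model the two kinds differ only in the offset, which is the point made in the review thread; keeping the scalar and vector paths as separate functions mirrors the restructured hunk.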
Such BuildVector operands were visited and processed earlier by the code below folding them into Broadcast if all their operands are equal. Does that folding hold for VF=1 too; if so, should such redundant Broadcasts be eliminated too? Worth looking for extracts from broadcast too/instead?
It looks like currently there are no extracts of broadcasts that could be folded in the tests. For VF=1, the broadcast itself should be removed, possibly before hitting this pattern, but I might be missing something.
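To make the question about extracts from broadcasts concrete, here is a minimal sketch of the idea, assuming a toy representation in which each operand is identified by an integer. `ValueId`, `getBroadcastOperand` and `foldExtract` are hypothetical names for illustration only and do not correspond to VPlan APIs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in: each int is the identity of a scalar operand.
using ValueId = int;

// Returns a pointer to the common operand if all build-vector operands are
// identical (i.e. the build-vector is really a broadcast), otherwise nullptr.
const ValueId *getBroadcastOperand(const std::vector<ValueId> &Ops) {
  if (Ops.empty())
    return nullptr;
  const bool AllEqual =
      std::all_of(Ops.begin(), Ops.end(),
                  [&Ops](ValueId V) { return V == Ops.front(); });
  return AllEqual ? &Ops.front() : nullptr;
}

// Extracting any lane of a broadcast yields the broadcast operand, so the
// lane index only matters when the operands differ.
ValueId foldExtract(const std::vector<ValueId> &Ops, std::size_t Lane) {
  if (const ValueId *Splat = getBroadcastOperand(Ops))
    return *Splat;
  assert(Lane < Ops.size() && "lane out of range");
  return Ops[Lane];
}
```

With a single operand (the VF=1 case in this model) the build-vector is trivially a broadcast, so the extract folds to that operand without any vector being materialized.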