[BACKEND] Optimization to sink broadcast ops#2274

Merged
ptillet merged 1 commit into triton-lang:main from ThomasRaoux:broadcast_reorder
Sep 13, 2023

@ThomasRaoux
Collaborator

Try to move broadcast ops after arithmetic and convert ops in order to reduce the amount of work needed.
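
To illustrate why sinking a broadcast below an elementwise op saves work, here is a minimal sketch in plain C++. It is not Triton or MLIR code: `Column`, `Matrix`, `broadcast`, `broadcastThenAdd`, and `addThenBroadcast` are hypothetical names invented for this example, and a `[rows, 1]` tensor is modeled as a simple vector. Both orderings produce the same result, but adding before broadcasting performs `rows` additions instead of `rows * cols`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy model: a Column stands in for a [rows, 1] tensor and a
// Matrix for the [rows, cols] tensor obtained by broadcasting it.
using Column = std::vector<float>;
using Matrix = std::vector<std::vector<float>>;

// Replicate each element of `col` across `cols` columns.
Matrix broadcast(const Column &col, std::size_t cols) {
  Matrix out;
  out.reserve(col.size());
  for (float v : col)
    out.emplace_back(cols, v);
  return out;
}

// Original order: broadcast both operands, then add elementwise.
// `adds` is incremented once per scalar addition (rows * cols in total).
Matrix broadcastThenAdd(const Column &a, const Column &b, std::size_t cols,
                        std::size_t &adds) {
  assert(a.size() == b.size());
  Matrix ma = broadcast(a, cols);
  Matrix mb = broadcast(b, cols);
  for (std::size_t i = 0; i < ma.size(); ++i) {
    for (std::size_t j = 0; j < cols; ++j) {
      ma[i][j] += mb[i][j];
      ++adds;
    }
  }
  return ma;
}

// Reordered: add on the small operands, then broadcast the result once.
// `adds` is incremented once per scalar addition (rows in total).
Matrix addThenBroadcast(const Column &a, const Column &b, std::size_t cols,
                        std::size_t &adds) {
  assert(a.size() == b.size());
  Column sum(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    sum[i] = a[i] + b[i];
    ++adds;
  }
  return broadcast(sum, cols);
}
```

Calling both functions on the same inputs and comparing the returned matrices and the two counters shows the equivalence and the difference in work. The rewrite is only valid under the assumption made here, namely that every operand is broadcast from the same source shape.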

      seenBroadcast = true;
      srcType = broadcastOp.getSrc().getType();
    } else if (srcType != broadcastOp.getSrc().getType()) {
      // If the broadcast have different types we cannot re-order.
Contributor

This is a nice improvement. I think you can generalize it slightly further by having a common srcShape and allowing the scalar types to differ. This would allow the pattern to work for AddPtrOp for example.
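
The suggested generalization can be sketched as follows. This is a simplified model in plain C++, not the actual pass: `TensorType`, `sameSourceType`, and `sameSourceShape` are hypothetical names, and element types are represented as strings purely for illustration. The strict check mirrors the full type comparison in the diff above, while the relaxed check compares only shapes, so operands such as a pointer tensor and an integer offset tensor (as an op like AddPtrOp would take) can still qualify.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a ranked tensor type.
struct TensorType {
  std::vector<std::int64_t> shape;
  std::string elementType;  // e.g. "i32" or "ptr<f32>", illustrative only

  bool operator==(const TensorType &other) const {
    return shape == other.shape && elementType == other.elementType;
  }
};

// Strict check: all broadcast sources must have the same shape AND the same
// element type.
bool sameSourceType(const std::vector<TensorType> &srcs) {
  assert(!srcs.empty());
  for (const TensorType &t : srcs)
    if (!(t == srcs.front()))
      return false;
  return true;
}

// Relaxed check: only the source shapes must agree; element types may
// differ between operands.
bool sameSourceShape(const std::vector<TensorType> &srcs) {
  assert(!srcs.empty());
  for (const TensorType &t : srcs)
    if (t.shape != srcs.front().shape)
      return false;
  return true;
}
```

With a pointer operand of shape `[128, 1]` and an `i32` offset of the same shape, the strict check rejects the pair while the relaxed check accepts it. Operands with different source shapes are rejected by both.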

Collaborator Author

Sorry for the delay, and thanks for the advice. I'm sending another PR to handle this.

      return isa<arith::ExtSIOp, arith::ExtUIOp, arith::ExtFOp>(op);
    auto isExtOrBroadcastOp = [](Operation *op) {
      return isa<arith::ExtSIOp, arith::ExtUIOp, arith::ExtFOp,
                 triton::BroadcastOp>(op);
Contributor

Could this apply to ExpandDims and Splat as well?

Collaborator Author

Splat takes a scalar source, so it is already handled. I'll extend this to ExpandDims.

@ptillet merged commit cf7f8c5 into triton-lang:main on Sep 13, 2023
ThomasRaoux added a commit that referenced this pull request Sep 18, 2023
Improve patterns that sink broadcast to reduce the arithmetic density
and also hoist convert on top of expand_dims to do less work.

This addresses comments in #2274
ptillet pushed a commit that referenced this pull request Sep 18, 2023

Improve patterns that sink broadcast to reduce the arithmetic density
and also hoist convert on top of expand_dims to do less work.

This addresses comments in #2274
alexander-zinoviev pushed a commit to alexander-zinoviev/triton that referenced this pull request Sep 21, 2023
Try to move broadcast ops after arithmetic and convert ops in order to
reduce the amount of work needed.
alexander-zinoviev pushed a commit to alexander-zinoviev/triton that referenced this pull request Sep 21, 2023
…iton-lang#2331)

Improve patterns that sink broadcast to reduce the arithmetic density
and also hoist convert on top of expand_dims to do less work.

This addresses comments in triton-lang#2274
pingzhuu pushed a commit to siliconflow/triton that referenced this pull request Apr 2, 2024
Try to move broadcast ops after arithmetic and convert ops in order to
reduce the amount of work needed.
pingzhuu pushed a commit to siliconflow/triton that referenced this pull request Apr 2, 2024
…iton-lang#2331)

Improve patterns that sink broadcast to reduce the arithmetic density
and also hoist convert on top of expand_dims to do less work.

This addresses comments in triton-lang#2274
3 participants