Add optimizer to convert min_by/max_by to row number function #25190
Maybe I am missing something but in
Correct, here is the definition of
And
import static com.facebook.presto.sql.relational.Expressions.comparisonExpression;
import static com.google.common.collect.ImmutableMap.toImmutableMap;

public class MinMaxByToWindowFunction
Can you add a small comment explaining the plan changes?
Sure, will add in a separate PR.
private int eagerPlanValidationThreadPoolSize = 20;
private boolean innerJoinPushdownEnabled;
private boolean inEqualityJoinPushdownEnabled;
private boolean rewriteMinMaxByToTopNEnabled;
Can this be on by default?
I guess row number adds sorting, so it might not always be efficient, but if your performance numbers show otherwise, can we make it on by default?
I want to be conservative for now. I will consider setting it to true after getting more stats for this optimizer.
Description
This optimization converts queries like
to
Here feature1 and feature2 are maps. The rewrite avoids the expensive aggregations over feature1 and feature2. This pattern is commonly used to fetch the latest features in machine learning workloads.
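The idea behind the rewrite can be sketched outside the planner. The following is a minimal illustration, not the optimizer's implementation: it computes "latest feature1 per key" once in the aggregation style (keep the value whose ordering column is largest) and once in the row-number style (sort each partition descending, keep row 1). The class `MinMaxByRewriteSketch`, the `Row` type, and the columns `id` and `ts` are hypothetical names chosen for this example; only `feature1` appears in the description. It assumes `ts` is non-null and distinct within each group.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: none of these names come from the Presto code base.
final class MinMaxByRewriteSketch {
    // Hypothetical input row: grouping key, ordering column, map-typed feature.
    static final class Row {
        final String id;
        final long ts;
        final Map<String, Double> feature1;

        Row(String id, long ts, Map<String, Double> feature1) {
            this.id = id;
            this.ts = ts;
            this.feature1 = feature1;
        }
    }

    // Aggregation form, in the spirit of max_by(feature1, ts) grouped by id:
    // per group, keep the row whose ts is largest.
    static Map<String, Map<String, Double>> viaAggregation(List<Row> rows) {
        Map<String, Row> best = new HashMap<>();
        for (Row r : rows) {
            best.merge(r.id, r, (a, b) -> b.ts > a.ts ? b : a);
        }
        Map<String, Map<String, Double>> out = new TreeMap<>();
        best.forEach((id, r) -> out.put(id, r.feature1));
        return out;
    }

    // Row-number form: partition by id, order by ts descending,
    // number the rows, and keep only row number 1.
    static Map<String, Map<String, Double>> viaRowNumber(List<Row> rows) {
        Map<String, List<Row>> partitions = new HashMap<>();
        for (Row r : rows) {
            partitions.computeIfAbsent(r.id, k -> new ArrayList<>()).add(r);
        }
        Map<String, Map<String, Double>> out = new TreeMap<>();
        for (Map.Entry<String, List<Row>> e : partitions.entrySet()) {
            List<Row> sorted = new ArrayList<>(e.getValue());
            sorted.sort((a, b) -> Long.compare(b.ts, a.ts));
            int rowNumber = 0;
            for (Row r : sorted) {
                rowNumber++;
                if (rowNumber == 1) {
                    out.put(e.getKey(), r.feature1);
                }
            }
        }
        return out;
    }

    // Small made-up data set with two keys.
    static List<Row> sampleRows() {
        return List.of(
                new Row("u1", 1L, Map.of("a", 1.0)),
                new Row("u1", 3L, Map.of("a", 3.0)),
                new Row("u2", 2L, Map.of("b", 2.0)),
                new Row("u1", 2L, Map.of("a", 2.0)));
    }

    public static void main(String[] args) {
        List<Row> rows = sampleRows();
        System.out.println("aggregation: " + viaAggregation(rows));
        System.out.println("row number:  " + viaRowNumber(rows));
    }
}
```

Both methods return the same result on this data. The row-number form sorts within each partition, which is the cost raised in review and one reason the rewrite is kept behind a flag for now.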
Motivation and Context
Query optimization to reduce cost.
Impact
Query optimization to reduce cost.
Test Plan
Unit tests