-
Notifications
You must be signed in to change notification settings - Fork 180
Remove redundant push-down-filters derived for bucket-non-null agg #4843
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ation Signed-off-by: Heng Qian <[email protected]>
| CalciteLogicalIndexScan(table=[[OpenSearch, hits]]) | ||
| physical: | | ||
| CalciteEnumerableIndexScan(table=[[OpenSearch, hits]], PushDownContext=[[AGGREGATION->rel#:LogicalAggregate.NONE.[](input=RelSubset#,group={0, 2},c=COUNT(),sum(IsRefresh)=SUM($1),avg(ResolutionWidth)=AVG($3)), SORT_AGG_METRICS->[2 DESC LAST], PROJECT->[c, sum(IsRefresh), avg(ResolutionWidth), WatchID, ClientIP], LIMIT->10, LIMIT->10000], OpenSearchRequestBuilder(sourceBuilder={"from":0,"size":0,"timeout":"1m","aggregations":{"WatchID|ClientIP":{"multi_terms":{"terms":[{"field":"WatchID"},{"field":"ClientIP"}],"size":10,"min_doc_count":1,"shard_min_doc_count":0,"show_term_doc_count_error":false,"order":[{"c":"desc"},{"_key":"asc"}]},"aggregations":{"sum(IsRefresh)":{"sum":{"field":"IsRefresh"}},"avg(ResolutionWidth)":{"avg":{"field":"ResolutionWidth"}},"c":{"value_count":{"field":"_index"}}}}}}, requestedTotalSize=2147483647, pageSize=null, startFrom=0)]) | ||
| CalciteEnumerableIndexScan(table=[[OpenSearch, hits]], PushDownContext=[[AGGREGATION->rel#:LogicalAggregate.NONE.[](input=RelSubset#,group={0, 1},c=COUNT(),sum(IsRefresh)=SUM($2),avg(ResolutionWidth)=AVG($3)), PROJECT->[c, sum(IsRefresh), avg(ResolutionWidth), WatchID, ClientIP], SORT_AGG_METRICS->[0 DESC LAST], LIMIT->10, LIMIT->10000], OpenSearchRequestBuilder(sourceBuilder={"from":0,"size":0,"timeout":"1m","aggregations":{"WatchID|ClientIP":{"multi_terms":{"terms":[{"field":"WatchID"},{"field":"ClientIP"}],"size":10,"min_doc_count":1,"shard_min_doc_count":0,"show_term_doc_count_error":false,"order":[{"c":"desc"},{"_key":"asc"}]},"aggregations":{"sum(IsRefresh)":{"sum":{"field":"IsRefresh"}},"avg(ResolutionWidth)":{"avg":{"field":"ResolutionWidth"}},"c":{"value_count":{"field":"_index"}}}}}}, requestedTotalSize=2147483647, pageSize=null, startFrom=0)]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why the groupset and SORT_AGG_METRICS digest changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before this PR, the plan can only match the rule agg-filter-project-scan after being transformed by some calcite's transpose rule and project merge rule. So the index in group set is different from the original plan(i.e. logical plan).
After adding another new rule agg-project-filter-project-scan, the original plan can match this new one after trimming(it will add another project before scan), so the new digest has exactly the same ref index in the group set as logical plan.
But overall, though the aggregation digest is different, their generated request builders are exactly the same.
| return rex instanceof RexCall rexCall | ||
| && rexCall.getOperator().equals(SqlStdOperatorTable.IS_NOT_NULL) | ||
| && rexCall.getOperands().get(0) instanceof RexInputRef; | ||
| && rexCall.getOperator().equals(SqlStdOperatorTable.IS_NOT_NULL); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could be simplified to rex.isA(SqlKind.IS_NOT_NULL)
Signed-off-by: Heng Qian <[email protected]>
| final LogicalFilter filter = call.rel(2); | ||
| final LogicalProject bottomProject = call.rel(3); | ||
| final CalciteLogicalIndexScan scan = call.rel(4); | ||
| boolean isBucketNullable = Config.isBucketNullableAgg.test(aggregate); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I feel it's a little confusing because it's semantics seem to be buket not nullable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right. That's a typo. Will correct it.
Signed-off-by: Heng Qian <[email protected]>
|
The backport to To backport manually, run these commands in your terminal: # Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/sql/backport-2.19-dev 2.19-dev
# Navigate to the new working tree
pushd ../.worktrees/sql/backport-2.19-dev
# Create a new branch
git switch --create backport/backport-4843-to-2.19-dev
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 ede63cb7b0c4976c818d3dada2447ee1985d43b5
# Push it to GitHub
git push --set-upstream origin backport/backport-4843-to-2.19-dev
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/sql/backport-2.19-devThen, create a pull request where the |
|
The backport to To backport manually, run these commands in your terminal: # Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/sql/backport-2.19-dev 2.19-dev
# Navigate to the new working tree
pushd ../.worktrees/sql/backport-2.19-dev
# Create a new branch
git switch --create backport/backport-4843-to-2.19-dev
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 ede63cb7b0c4976c818d3dada2447ee1985d43b5
# Push it to GitHub
git push --set-upstream origin backport/backport-4843-to-2.19-dev
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/sql/backport-2.19-devThen, create a pull request where the |
…pensearch-project#4843) * Remove redundant push-down-filters derived for bucket-non-null aggregation Signed-off-by: Heng Qian <[email protected]> * Address comment Signed-off-by: Heng Qian <[email protected]> * Fix name typo Signed-off-by: Heng Qian <[email protected]> --------- Signed-off-by: Heng Qian <[email protected]> (cherry picked from commit ede63cb) Signed-off-by: Heng Qian <[email protected]>
…ket-non-null agg (#4843) (#4851) * Remove redundant push-down-filters derived for bucket-non-null agg (#4843) * Remove redundant push-down-filters derived for bucket-non-null aggregation Signed-off-by: Heng Qian <[email protected]> * Address comment Signed-off-by: Heng Qian <[email protected]> * Fix name typo Signed-off-by: Heng Qian <[email protected]> --------- Signed-off-by: Heng Qian <[email protected]> (cherry picked from commit ede63cb) Signed-off-by: Heng Qian <[email protected]> * Fix compiling Signed-off-by: Heng Qian <[email protected]> * Fix IT Signed-off-by: Heng Qian <[email protected]> --------- Signed-off-by: Heng Qian <[email protected]>
…pensearch-project#4843) * Remove redundant push-down-filters derived for bucket-non-null aggregation Signed-off-by: Heng Qian <[email protected]> * Address comment Signed-off-by: Heng Qian <[email protected]> * Fix name typo Signed-off-by: Heng Qian <[email protected]> --------- Signed-off-by: Heng Qian <[email protected]>
Description
In our current implementation/final plan, we always push down a redundant filter of ISNOTNULL for group fields if params
BUCKET_NULLABLE=falseor the group field is time span. However, since we've setMISSING_BUCKETto false in the aggregation builder, we actually don't need this filter which will introduce more overhead especially the filter is pushed down to be a script.This PR enhance the above case by adding a new rule, in which we detect the derived filter from bucket-non-null agg and then ignore them in the push down process.
Related Issues
Resolves #4811
Check List
--signoffor-s.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.