-
Notifications
You must be signed in to change notification settings - Fork 180
Pushdown sort aggregate metrics #4603
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Lantao Jin <[email protected]>
Signed-off-by: Lantao Jin <[email protected]>
Signed-off-by: Lantao Jin <[email protected]>
Signed-off-by: Lantao Jin <[email protected]>
Signed-off-by: Lantao Jin <[email protected]>
Signed-off-by: Lantao Jin <[email protected]>
...src/main/java/org/opensearch/sql/opensearch/planner/physical/SortAggregationMetricsRule.java
Outdated
Show resolved
Hide resolved
...arch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/AggPushDownAction.java
Outdated
Show resolved
Hide resolved
| return newTraitSet; | ||
| } | ||
|
|
||
| public CalciteLogicalIndexScan pushDownSortAggregateMetrics(Sort sort) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can only do this operation when bucket_nullable=false, right? Any place to check this parameter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AggPushDownAction.L97 and L126
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nit] Will it be better to put these checks in one place if possible? Now the prerequisites are split both here and in aggAction.pushDownSortAggMetrics, although they are both aggregationBuilders and some checking seems overlapped.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cannot get the bucket_nullable information here. Aggregation has already pushed down. So AggPushDownAction.L97 and L126 would be better IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I see you've already get the pushed aggregation builder here.
List<AggregationBuilder> aggregationBuilders =
pushDownContext.getAggPushDownAction().getAggregationBuilder().getLeft();
Or otherwise L290-310 could all be moved into aggAction.pushDownSortAggMetrics.
It's also fine to me if not making change since the logic is correct anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, but I don't want always to check CompositeValuesSourceBuilder.missingBucket() here since not all ValuesSourceBuilders are impacted by bucket_nullable, for example date histogram. So moving to here has to check types, sounds not worth to check types twice.
Signed-off-by: Lantao Jin <[email protected]>
Signed-off-by: Lantao Jin <[email protected]>
Signed-off-by: Lantao Jin <[email protected]>
| return newTraitSet; | ||
| } | ||
|
|
||
| public CalciteLogicalIndexScan pushDownSortAggregateMetrics(Sort sort) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nit] Will it be better to put these checks in one place if possible? Now the prerequisites are split both here and in aggAction.pushDownSortAggMetrics, although they are both aggregationBuilders and some checking seems overlapped.
...arch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/AggPushDownAction.java
Show resolved
Hide resolved
… buckets Signed-off-by: Lantao Jin <[email protected]>
| // count agg optimized, get the path name from field names | ||
| path = fieldNames.get(collations.get(0).getFieldIndex()); | ||
| } else if (metric instanceof ValuesSourceAggregationBuilder.LeafOnly) { | ||
| path = metric.getName(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nit, non-blocking] Could this be combined with above branch? i.e. Does path = fieldNames.get(collations.get(0).getFieldIndex()); also works for metric instanceof ValuesSourceAggregationBuilder.LeafOnly?
| List<String> fieldNames, | ||
| CompositeAggregationBuilder composite) { | ||
| String path; | ||
| AggregationBuilder metric = composite.getSubAggregations().stream().findFirst().orElse(null); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we only support only 1 agg metric for this enhancement? And anywhere to check this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we only support only 1 agg metric for this enhancement? And anywhere to check this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catching. added restricted checking.
Signed-off-by: Lantao Jin <[email protected]>
|
The backport to To backport manually, run these commands in your terminal: # Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/sql/backport-2.19-dev 2.19-dev
# Navigate to the new working tree
pushd ../.worktrees/sql/backport-2.19-dev
# Create a new branch
git switch --create backport/backport-4603-to-2.19-dev
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 5ebed8465dc9d498602a25b802207d63ee5030c8
# Push it to GitHub
git push --set-upstream origin backport/backport-4603-to-2.19-dev
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/sql/backport-2.19-devThen, create a pull request where the |
* convert sort aggregate metrics to term sort Signed-off-by: Lantao Jin <[email protected]> * refactor Signed-off-by: Lantao Jin <[email protected]> * fix conflicts Signed-off-by: Lantao Jin <[email protected]> * fix conflicts2 Signed-off-by: Lantao Jin <[email protected]> * Add more javadoc Signed-off-by: Lantao Jin <[email protected]> * address comments Signed-off-by: Lantao Jin <[email protected]> * delete incorrect comments Signed-off-by: Lantao Jin <[email protected]> * convert composite agg to multi-terms agg for sort metrics on multiple buckets Signed-off-by: Lantao Jin <[email protected]> * avoid the case of 'stat count, sum ... | sort count' Signed-off-by: Lantao Jin <[email protected]> --------- Signed-off-by: Lantao Jin <[email protected]> (cherry picked from commit 5ebed84)
* Pushdown sort aggregate metrics (#4603) * convert sort aggregate metrics to term sort Signed-off-by: Lantao Jin <[email protected]> * refactor Signed-off-by: Lantao Jin <[email protected]> * fix conflicts Signed-off-by: Lantao Jin <[email protected]> * fix conflicts2 Signed-off-by: Lantao Jin <[email protected]> * Add more javadoc Signed-off-by: Lantao Jin <[email protected]> * address comments Signed-off-by: Lantao Jin <[email protected]> * delete incorrect comments Signed-off-by: Lantao Jin <[email protected]> * convert composite agg to multi-terms agg for sort metrics on multiple buckets Signed-off-by: Lantao Jin <[email protected]> * avoid the case of 'stat count, sum ... | sort count' Signed-off-by: Lantao Jin <[email protected]> --------- Signed-off-by: Lantao Jin <[email protected]> (cherry picked from commit 5ebed84) * fix conflicts Signed-off-by: Lantao Jin <[email protected]> --------- Signed-off-by: Lantao Jin <[email protected]>
* default-main: (34 commits) Enhance dynamic source clause to support only metadata filters (opensearch-project#4554) Make nested alias type support referring to outer context (opensearch-project#4673) Update big5 ppl queries and check plans (opensearch-project#4668) Support push down sort after limit (opensearch-project#4657) Use table scan rowType in filter pushdown could fix rename issue (opensearch-project#4670) Fix: Support Alias Fields in MIN, MAX, FIRST, LAST, and TAKE Aggregations (opensearch-project#4621) Fix bin nested fields issue (opensearch-project#4606) Add `per_minute`, `per_hour`, `per_day` function support (opensearch-project#4531) Pushdown sort aggregate metrics (opensearch-project#4603) Followup: Change ComparableLinkedHashMap to compare Key than Value (opensearch-project#4648) Mitigate the CI failure caused by 500 Internal Server Error (opensearch-project#4646) Allow renaming group-by fields to existing field names (opensearch-project#4586) Publish internal modules separately for downstream reuse (opensearch-project#4484) Revert "Update grammar files and developer guide (opensearch-project#4301)" (opensearch-project#4643) Support Automatic Type Conversion for REX/SPATH/PARSE Command Extractions (opensearch-project#4599) Replace all dots in fields of table scan's PhysType (opensearch-project#4633) Return comparable LinkedHashMap in `valueForCalcite()` of ExprTupleValue (opensearch-project#4629) Refactor JsonExtractAllFunctionIT and MapConcatFunctionIT (opensearch-project#4623) Pushdown case function in aggregations as range queries (opensearch-project#4400) Update GEOIP function to support IP types as input (opensearch-project#4613) ... # Conflicts: # docs/user/ppl/functions/conversion.rst
* default-main: (34 commits) Enhance dynamic source clause to support only metadata filters (opensearch-project#4554) Make nested alias type support referring to outer context (opensearch-project#4673) Update big5 ppl queries and check plans (opensearch-project#4668) Support push down sort after limit (opensearch-project#4657) Use table scan rowType in filter pushdown could fix rename issue (opensearch-project#4670) Fix: Support Alias Fields in MIN, MAX, FIRST, LAST, and TAKE Aggregations (opensearch-project#4621) Fix bin nested fields issue (opensearch-project#4606) Add `per_minute`, `per_hour`, `per_day` function support (opensearch-project#4531) Pushdown sort aggregate metrics (opensearch-project#4603) Followup: Change ComparableLinkedHashMap to compare Key than Value (opensearch-project#4648) Mitigate the CI failure caused by 500 Internal Server Error (opensearch-project#4646) Allow renaming group-by fields to existing field names (opensearch-project#4586) Publish internal modules separately for downstream reuse (opensearch-project#4484) Revert "Update grammar files and developer guide (opensearch-project#4301)" (opensearch-project#4643) Support Automatic Type Conversion for REX/SPATH/PARSE Command Extractions (opensearch-project#4599) Replace all dots in fields of table scan's PhysType (opensearch-project#4633) Return comparable LinkedHashMap in `valueForCalcite()` of ExprTupleValue (opensearch-project#4629) Refactor JsonExtractAllFunctionIT and MapConcatFunctionIT (opensearch-project#4623) Pushdown case function in aggregations as range queries (opensearch-project#4400) Update GEOIP function to support IP types as input (opensearch-project#4613) ... Signed-off-by: Asif Bashar <[email protected]>
* convert sort aggregate metrics to term sort Signed-off-by: Lantao Jin <[email protected]> * refactor Signed-off-by: Lantao Jin <[email protected]> * fix conflicts Signed-off-by: Lantao Jin <[email protected]> * fix conflicts2 Signed-off-by: Lantao Jin <[email protected]> * Add more javadoc Signed-off-by: Lantao Jin <[email protected]> * address comments Signed-off-by: Lantao Jin <[email protected]> * delete incorrect comments Signed-off-by: Lantao Jin <[email protected]> * convert composite agg to multi-terms agg for sort metrics on multiple buckets Signed-off-by: Lantao Jin <[email protected]> * avoid the case of 'stat count, sum ... | sort count' Signed-off-by: Lantao Jin <[email protected]> --------- Signed-off-by: Lantao Jin <[email protected]>
Description
Pushdown sort aggregate metrics:
SORT_AGG_METRICSafterAGGREGATIONin pushdown contextcompositebucket agg toterms/histo/date-histoagg if sort one metric onsinglebucketcompositebucket agg (all terms source) tomulti-termsagg if sort one metric onmultiplebucketsRelated Issues
Resolves #4282
Check List
--signoffor-s.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.