Open
Labels
:Analytics/Aggregations, >enhancement, Team:Analytics
Description
Queries sorted by field have improved a lot over the years when it comes to dynamic pruning:
- Ancient versions of Elasticsearch would always collect all the matches to return a single page of data. This performed terribly when paging through all hits, since the total cost was essentially quadratic in the number of documents in a shard.
- Then we introduced index sorting and queries whose sort order is congruent with the index sort could skip irrelevant data.
- Then we introduced dynamic pruning when the sort field is indexed with points, by leveraging the index to skip hits that cannot possibly make it to the page that we are retrieving. This yielded major speedups when paginating through all hits. This is the current state.
- In the future, we should look into supporting dynamic pruning when sorting on keyword fields too.
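To make the third stage concrete, here is a minimal sketch of the idea, not Elasticsearch or Lucene code. It assumes the sort field's values are grouped into blocks that each carry a precomputed minimum, loosely standing in for cells of a points index. `TopKPruningSketch`, `Block`, `Result` and `smallestK` are hypothetical names introduced only for this illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical illustration of dynamic pruning for an ascending sort. */
final class TopKPruningSketch {

    /** A group of values with a precomputed minimum (assumed stand-in for an index cell). */
    static final class Block {
        final long[] values;
        final long min;

        Block(long... values) {
            this.values = values;
            long m = Long.MAX_VALUE;
            for (long v : values) {
                m = Math.min(m, v);
            }
            this.min = m;
        }
    }

    /** The k smallest values in ascending order, plus how many values were looked at. */
    static final class Result {
        final List<Long> topK;
        final int visited;

        Result(List<Long> topK, int visited) {
            this.topK = topK;
            this.visited = visited;
        }
    }

    static Result smallestK(List<Block> blocks, int k, boolean prune) {
        if (k <= 0) {
            throw new IllegalArgumentException("k must be positive");
        }
        // Max-heap: the head is the worst value currently on the page.
        PriorityQueue<Long> heap = new PriorityQueue<>(Collections.reverseOrder());
        int visited = 0;
        for (Block block : blocks) {
            // Once the page is full, a block whose minimum cannot beat the
            // current worst value contains no competitive hit and is skipped.
            if (prune && heap.size() == k && block.min >= heap.peek()) {
                continue;
            }
            for (long v : block.values) {
                visited++;
                if (heap.size() < k) {
                    heap.add(v);
                } else if (v < heap.peek()) {
                    heap.poll();
                    heap.add(v);
                }
            }
        }
        List<Long> top = new ArrayList<>(heap);
        Collections.sort(top);
        return new Result(top, visited);
    }
}
```

Calling `smallestK(blocks, k, false)` models the oldest behaviour, where every match is visited. Passing `true` returns the same page while visiting fewer values whenever some block's minimum is not competitive.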
The composite aggregation is very similar to sorted queries, yet with regard to dynamic pruning it is currently at the second stage of that evolution (index sorting). Unless you are aggregating on the primary index sort field, computing a single page of data requires collecting every document that matches the query.
Can we add dynamic pruning support to the composite aggregation so that computing a single page of results wouldn't need to look at all matches? Ideally it would reuse the same logic that we are using for sorting queries via the LeafFieldComparator#competitiveIterator and LeafCollector#competitiveIterator APIs.
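As a rough sketch of what this could look like, the class below computes one page of a single-source, composite-like aggregation while consulting a competitive iterator. It is not Elasticsearch or Lucene code and only mirrors the idea behind the `competitiveIterator` APIs. It assumes a per-block min/max summary of the bucket key over fixed-size doc ID ranges. `CompositePageSketch`, `nextCompetitive`, `collect` and `run` are hypothetical names.

```java
import java.util.TreeMap;

/** Hypothetical illustration of a competitive iterator for one composite page. */
final class CompositePageSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    final long[] keys;          // keys[doc] is the bucket key of that document
    final int blockSize;        // number of consecutive doc IDs summarized together
    final long[] blockMin;
    final long[] blockMax;
    final long afterKey;        // only keys strictly greater than this belong to the page
    final int pageSize;
    final TreeMap<Long, Integer> page = new TreeMap<>(); // bucket key -> doc count
    int collected = 0;          // number of documents handed to collect()

    CompositePageSketch(long[] keys, int blockSize, long afterKey, int pageSize) {
        if (blockSize <= 0 || pageSize <= 0) {
            throw new IllegalArgumentException("blockSize and pageSize must be positive");
        }
        this.keys = keys;
        this.blockSize = blockSize;
        this.afterKey = afterKey;
        this.pageSize = pageSize;
        int blocks = (keys.length + blockSize - 1) / blockSize;
        blockMin = new long[blocks];
        blockMax = new long[blocks];
        for (int b = 0; b < blocks; b++) {
            long min = Long.MAX_VALUE;
            long max = Long.MIN_VALUE;
            int end = Math.min(keys.length, (b + 1) * blockSize);
            for (int d = b * blockSize; d < end; d++) {
                min = Math.min(min, keys[d]);
                max = Math.max(max, keys[d]);
            }
            blockMin[b] = min;
            blockMax[b] = max;
        }
    }

    /** Can a key in [min, max] still land on the page? The upper bound only ever shrinks. */
    private boolean competitive(long min, long max) {
        if (max <= afterKey) {
            return false;
        }
        return page.size() < pageSize || min <= page.lastKey();
    }

    /** Smallest doc ID >= target that lies in a block that may still be competitive. */
    int nextCompetitive(int target) {
        for (int b = target / blockSize; b < blockMin.length; b++) {
            if (competitive(blockMin[b], blockMax[b])) {
                return Math.max(target, b * blockSize);
            }
        }
        return NO_MORE_DOCS;
    }

    void collect(int doc) {
        collected++;
        long key = keys[doc];
        if (!competitive(key, key)) {
            return;
        }
        page.merge(key, 1, Integer::sum);
        if (page.size() > pageSize) {
            page.pollLastEntry(); // evict the largest key, tightening the bound
        }
    }

    /** Leapfrogs between the query matches (sorted doc IDs) and the competitive iterator. */
    static CompositePageSketch run(long[] keys, int[] matches, int blockSize, long afterKey, int pageSize) {
        CompositePageSketch c = new CompositePageSketch(keys, blockSize, afterKey, pageSize);
        int i = 0;
        while (i < matches.length) {
            int doc = matches[i];
            int next = c.nextCompetitive(doc);
            if (next == NO_MORE_DOCS) {
                break;
            }
            if (next == doc) {
                c.collect(doc);
                i++;
            } else {
                while (i < matches.length && matches[i] < next) {
                    i++;
                }
            }
        }
        return c;
    }
}
```

In this sketch the doc counts stay exact for every key on the final page. A block is skipped only when all of its keys are strictly greater than the current largest key on the page, and that bound never grows back.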
Relates to #85759.