-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Save memory when histogram agg is not on top #57277
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This saves some memory when the `histogram` aggregation is not a top level aggregation by dropping `asMultiBucketAggregator` in favor of natively implementing multi-bucket storage in the aggregator. For the most part this just uses the `LongKeyedBucketOrds` that we built the first time we did this.
|
Pinging @elastic/es-analytics-geo (:Analytics/Aggregations) |
|
I'm going to add a test for the new debug information. |
| * Base class for functionality shared between aggregators for this | ||
| * {@code histogram} aggregation. | ||
| */ | ||
| public abstract class AbstractHistogramAggregator extends BucketsAggregator { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a TODO that the range and numeric version of the aggregator shared a ton of code. Now they share a superclass that provides all that code.
| ValuesSource valuesSource, DocValueFormat formatter, SearchContext context, | ||
| Aggregator parent, Map<String, Object> metadata) throws IOException { | ||
| ValuesSource.Range rangeValueSource = (ValuesSource.Range) valuesSource; | ||
| if (rangeValueSource.rangeType().isNumeric() == false) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved this check into the ctor so I can use the ctor reference that we've been doing elsewhere.
| Aggregator parent, | ||
| boolean collectsFromSingleBucket, | ||
| Map<String, Object> metadata) throws IOException { | ||
| if (collectsFromSingleBucket == false) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is the important line!
| fieldType.setName("field"); | ||
| try (IndexReader reader = w.getReader()) { | ||
| IndexSearcher searcher = new IndexSearcher(reader); | ||
| InternalHistogram histogram = search(searcher, new MatchAllDocsQuery(), aggBuilder, fieldType); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do this kind of thing all over the place so I figured I'd make a utility method for it.
| Releasables.close(releasables); | ||
| releasables.clear(); | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice touch. likely usable across many future tests
|
I'm pulling some performance numbers for this. I'll likely merge before they get done and update with them once they come in. I'm fairly confident in it though. |
|
About a 38% performance gain in the test that I ran: Before: After: |
This saves some memory when the `histogram` aggregation is not a top level aggregation by dropping `asMultiBucketAggregator` in favor of natively implementing multi-bucket storage in the aggregator. For the most part this just uses the `LongKeyedBucketOrds` that we built the first time we did this.
…7377) This saves some memory when the `histogram` aggregation is not a top level aggregation by dropping `asMultiBucketAggregator` in favor of natively implementing multi-bucket storage in the aggregator. For the most part this just uses the `LongKeyedBucketOrds` that we built the first time we did this.
This saves some memory when the
histogramaggregation is not a toplevel aggregation by dropping
asMultiBucketAggregatorin favor ofnatively implementing multi-bucket storage in the aggregator. For the
most part this just uses the
LongKeyedBucketOrdsthat we built thefirst time we did this.