-
Notifications
You must be signed in to change notification settings - Fork 2.4k
Combining filter rewrite and skip list to optimize sub aggregation #19573
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ing sub aggregation Signed-off-by: Ankit Jain <[email protected]>
Signed-off-by: Ankit Jain <[email protected]>
Signed-off-by: Ankit Jain <[email protected]>
|
❌ Gradle check result for 82bc95d: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Ankit Jain <[email protected]>
|
❌ Gradle check result for aff3dc6: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Ankit Jain <[email protected]>
|
The Jenkins job url is https://build.ci.opensearch.org/job/benchmark-pull-request/5188/ . Final results will be published once the job is completed. |
Benchmark ResultsBenchmark Results for Job: https://build.ci.opensearch.org/job/benchmark-pull-request/5188/
|
Benchmark Baseline Comparison ResultsBenchmark Results for Job: https://build.ci.opensearch.org/job/benchmark-compare/212/
|
Benchmark ResultsBenchmark Results for Job: https://build.ci.opensearch.org/job/benchmark-pull-request/5187/
|
Benchmark Baseline Comparison ResultsBenchmark Results for Job: https://build.ci.opensearch.org/job/benchmark-compare/213/
|
Signed-off-by: Asim Mahmood <[email protected]>
WalkthroughThis pull request combines filter rewrite and skip list optimization strategies for histogram sub-aggregations. It introduces a LeafCollectionMode enum to AggregatorBase to track collection strategies, implements HistogramSkiplistLeafCollector for efficient skip-list-based bucketing, refactors DateHistogramAggregator's skiplist logic, and modifies RangeAggregator to toggle collection modes during pre-aggregation optimization. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Key areas requiring attention:
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (6)
server/src/main/java/org/opensearch/search/aggregations/bucket/histogram/DateHistogramAggregator.java (1)
236-239: Consider adding debug logging for skiplist collector selection.Given the performance-sensitive nature of this optimization and the benchmark debugging discussed in PR comments, consider adding trace-level logging when the skiplist path is selected. This would help with future performance debugging.
if (canUseSkiplist(skipper, singleton)) { skipListCollectorsUsed++; + logger.trace("Using skiplist collector for field: {}", fieldName); return new HistogramSkiplistLeafCollector(singleton, skipper, preparedRounding, bucketOrds, sub, this); }server/src/main/java/org/opensearch/search/aggregations/bucket/HistogramSkiplistLeafCollector.java (3)
110-129: Consider caching and validatingowningBucketOrdfor correctness.Based on prior discussion, this collector only works correctly when
owningBucketOrdremains constant across calls. Consider caching the firstowningBucketOrdand either asserting or falling back to non-skiplist behavior if a different value is received:+ private long cachedOwningBucketOrd = -1; + @Override public void collect(int doc, long owningBucketOrd) throws IOException { + if (cachedOwningBucketOrd == -1) { + cachedOwningBucketOrd = owningBucketOrd; + } else { + assert cachedOwningBucketOrd == owningBucketOrd : "HistogramSkiplistLeafCollector requires constant owningBucketOrd"; + } + if (doc > upToInclusive) { advanceSkipper(doc, owningBucketOrd); }Based on learnings from past review comments discussing this exact scenario.
146-158: Performance optimization opportunity when sub-collector is NO_OP.The fast path using
stream.count(upToExclusive)is a good optimization. However, consider extracting the sub-collector check outside the loop to avoid repeatedNO_OP_COLLECTORcomparisons:@Override public void collect(DocIdStream stream, long owningBucketOrd) throws IOException { // This will only be called if its the sub aggregation + final boolean isNoOpSub = (sub == NO_OP_COLLECTOR); for (;;) { int upToExclusive = upToInclusive + 1; if (upToExclusive < 0) { // overflow upToExclusive = Integer.MAX_VALUE; } if (upToSameBucket) { - if (sub == NO_OP_COLLECTOR) { + if (isNoOpSub) { // stream.count maybe faster when we don't need to handle sub-aggs long count = stream.count(upToExclusive);
21-25: Add more detailed class-level Javadoc.The current documentation is minimal. Consider expanding to describe the skip-list optimization strategy, when this collector should be used, and any invariants (e.g., constant owningBucketOrd requirement).
server/src/test/java/org/opensearch/search/aggregations/bucket/filterrewrite/FilterRewriteSubAggTests.java (2)
292-356: Consider adding verification that the skiplist path is actually exercised.The test correctly verifies functional correctness of the range + date histogram aggregation with skiplist-enabled index configuration. However, it doesn't confirm that the skiplist optimization path was actually taken during execution. Consider adding assertions or instrumentation to verify the optimization triggered (e.g., checking profiling output, using a test hook, or verifying collection mode).
767-836: Consider extracting a helper to reduce duplication.The method contains three similar blocks for indexing document batches. Each block follows the same pattern of creating documents with date and metric fields, then committing. While the explicit structure aids readability, extracting a helper method like
addDocumentBatch(IndexWriter writer, int count, long timestamp, long metric)would reduce repetition and improve maintainability.Example refactor:
private void addDocumentBatch(IndexWriter writer, int count, long timestamp, long metricValue) throws IOException { for (int i = 0; i < count; i++) { org.apache.lucene.document.Document doc = new org.apache.lucene.document.Document(); doc.add(SortedNumericDocValuesField.indexedField(dateFieldName, timestamp)); doc.add(new LongPoint(dateFieldName, timestamp)); doc.add(new NumericDocValuesField(longFieldName, metricValue)); doc.add(new LongPoint(longFieldName, metricValue)); writer.addDocument(doc); } }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
CHANGELOG.md(1 hunks)server/src/main/java/org/opensearch/search/aggregations/AggregatorBase.java(3 hunks)server/src/main/java/org/opensearch/search/aggregations/bucket/HistogramSkiplistLeafCollector.java(1 hunks)server/src/main/java/org/opensearch/search/aggregations/bucket/histogram/DateHistogramAggregator.java(4 hunks)server/src/main/java/org/opensearch/search/aggregations/bucket/range/RangeAggregator.java(1 hunks)server/src/test/java/org/opensearch/search/aggregations/bucket/filterrewrite/FilterRewriteSubAggTests.java(5 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
server/src/main/java/org/opensearch/search/aggregations/bucket/HistogramSkiplistLeafCollector.java (1)
server/src/main/java/org/opensearch/search/aggregations/LeafBucketCollector.java (1)
LeafBucketCollector(51-179)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (21)
- GitHub Check: gradle-check
- GitHub Check: precommit (25, ubuntu-24.04-arm)
- GitHub Check: precommit (21, ubuntu-latest)
- GitHub Check: precommit (21, windows-2025, true)
- GitHub Check: precommit (25, macos-15)
- GitHub Check: precommit (25, macos-15-intel)
- GitHub Check: precommit (25, ubuntu-latest)
- GitHub Check: precommit (21, ubuntu-24.04-arm)
- GitHub Check: precommit (25, windows-latest)
- GitHub Check: precommit (21, macos-15)
- GitHub Check: precommit (21, windows-latest)
- GitHub Check: precommit (21, macos-15-intel)
- GitHub Check: Analyze (java)
- GitHub Check: assemble (21, ubuntu-24.04-arm)
- GitHub Check: assemble (25, ubuntu-24.04-arm)
- GitHub Check: detect-breaking-change
- GitHub Check: assemble (25, ubuntu-latest)
- GitHub Check: assemble (25, windows-latest)
- GitHub Check: assemble (21, windows-latest)
- GitHub Check: assemble (21, ubuntu-latest)
- GitHub Check: Mend Security Check
🔇 Additional comments (7)
server/src/main/java/org/opensearch/search/aggregations/AggregatorBase.java (1)
76-76: LGTM on the new leaf collection mode API.The introduction of
LeafCollectionModeenum and the protected field provides a clean mechanism for sub-aggregators to detect when their parent is using the filter rewrite optimization path. The mutable protected field is acceptable since aggregators are processed single-threaded per segment context.Also applies to: 240-255
server/src/main/java/org/opensearch/search/aggregations/bucket/histogram/DateHistogramAggregator.java (1)
298-313: Skiplist eligibility check is well-structured.The
canUseSkiplisthelper correctly gates skiplist usage by:
- Requiring non-null skipper and singleton
- Disabling for hardBounds (with appropriate TODO)
- Enabling for top-level aggregations
- Checking only the direct parent's mode (addressing prior reviewer feedback)
The instanceof pattern matching on line 309-310 is clean and safe.
server/src/main/java/org/opensearch/search/aggregations/bucket/range/RangeAggregator.java (1)
338-344: LGTM on the mode toggle implementation.The try-finally pattern correctly sets
leafCollectorModetoFILTER_REWRITEbefore attempting optimization and ensures reset toNORMALafterward, regardless of success or exception. This aligns with the reviewer feedback to scope the mode change tightly aroundtryPrecomputeAggregationForLeaf.The mode must be set before
tryOptimizeexecutes because sub-aggregators querygetLeafCollectorMode()during their leaf collector instantiation within the optimization path.server/src/test/java/org/opensearch/search/aggregations/bucket/filterrewrite/FilterRewriteSubAggTests.java (4)
15-16: LGTM!The new imports are appropriately added to support skiplist-based testing infrastructure, including index sort configuration, multi-segment management, and field data handling.
Also applies to: 20-20, 26-27, 31-33, 36-37, 42-42
628-645: LGTM!The overload pattern cleanly extends the test infrastructure to support custom IndexSettings while maintaining backward compatibility for existing tests.
739-750: LGTM!Replacing the hardcoded
"timestamp"string with thedateFieldNameclass field improves consistency and maintainability.
838-843: LGTM!Simple, focused helper method that improves readability when parsing date strings to timestamps.
...t/java/org/opensearch/search/aggregations/bucket/filterrewrite/FilterRewriteSubAggTests.java
Show resolved
Hide resolved
|
❌ Gradle check result for 5348994: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
Failed with, retrying |
…pensearch-project#19573) Signed-off-by: Ankit Jain <[email protected]> Signed-off-by: Asim Mahmood <[email protected]> Co-authored-by: Asim Mahmood <[email protected]>
…pensearch-project#19573) Signed-off-by: Ankit Jain <[email protected]> Signed-off-by: Asim Mahmood <[email protected]> Co-authored-by: Asim Mahmood <[email protected]>
Description
This change aims at optimizing the sub aggregation by leveraging multi range traversal for top level aggregation, and skip list for the sub aggregation. This optimization helps improve the performance for such aggregation requests upto 10x (needs sorting on field for which skip list is utilized). Queries
range-auto-date-histoandrange-auto-date-histo-metricsfrom thebig5benchmark workload should directly benefit from this optimization once change is extended for auto histogram aggregations as well - #20057I have copied over theCopying overBitSetDocIdStreamclass in OpenSearch from Lucene for now as it is not public, but should look at eventually getting rid of it.BitSetDocIdStreamwas resulting in 3 implementations ofDocIdStreammaking some of the calls polymorphic. Hence, instead added helper class to initializeBitSetDocIdStreamin the lucene namespace. We have followed the same pattern for few other extensions in LuceneRelated Issues
Related to #17447, #19384
Check List
[ ] API changes companion pull request created, if applicable.[ ] Public documentation issue/PR created, if applicable.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.
Summary by CodeRabbit
New Features
Tests
✏️ Tip: You can customize this high-level summary in your review settings.