-
Notifications
You must be signed in to change notification settings - Fork 2.3k
Streaming Aggregation #18874
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Streaming Aggregation #18874
Conversation
|
❌ Gradle check result for cfce243: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
cfce243 to
064e610
Compare
|
❌ Gradle check result for 064e610: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
064e610 to
d936286
Compare
|
❌ Gradle check result for d936286: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
d936286 to
bc86e80
Compare
|
❌ Gradle check result for bc86e80: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
bc86e80 to
31a6479
Compare
|
❌ Gradle check result for 31a6479: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
31a6479 to
3215bc1
Compare
|
❌ Gradle check result for 3215bc1: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
3dd2dce to
e79f0fc
Compare
|
❌ Gradle check result for e79f0fc: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
e79f0fc to
11c8299
Compare
|
❌ Gradle check result for 11c8299: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
11c8299 to
ef194ae
Compare
|
❌ Gradle check result for ef194ae: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
ef194ae to
44f645f
Compare
|
❌ Gradle check result for 44f645f: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
server/src/main/java/org/opensearch/search/DefaultSearchContext.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/opensearch/search/internal/ContextIndexSearcher.java
Outdated
Show resolved
Hide resolved
44f645f to
2496fb0
Compare
|
❌ Gradle check result for 2496fb0: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
2496fb0 to
4875792
Compare
|
❌ Gradle check result for 4875792: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
...ava/org/opensearch/search/aggregations/bucket/terms/GlobalOrdinalsStringTermsAggregator.java
Outdated
Show resolved
Hide resolved
...ava/org/opensearch/search/aggregations/bucket/terms/GlobalOrdinalsStringTermsAggregator.java
Show resolved
Hide resolved
Signed-off-by: bowenlan-amzn <[email protected]>
Signed-off-by: bowenlan-amzn <[email protected]>
server/src/main/java/org/opensearch/search/internal/ContextIndexSearcher.java
Outdated
Show resolved
Hide resolved
Signed-off-by: bowenlan-amzn <[email protected]>
Signed-off-by: bowenlan-amzn <[email protected]>
|
good work isolating streaming aggs flow with regular search flow @bowenlan-amzn @harshavamsi. |
|
❌ Gradle check result for 49ebf96: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
❌ Gradle check result for 49ebf96: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
* Streaming Aggregation - query param 'stream' in rest search action and stored in search request - On coordinator, we uses stream search transport action and search async action uses the new stream callback - On data node, stream transport action pass stream search flag to search context for shard search, aggregation - Reduce context has stream flag from search request - Sync onPhaseDone for both result consumption callbacks, shard and stream - Result consumer separation between stream and shard - Data node aggregation stream segment aggregation results back, and complete stream by shard result. - Optimize memory usage on coordinator - `reduce size = shard_number * ((1.5 * size) + 10)` (needs improve, big accuracy problem) - Remove the unnecessary memory allocation for handling sub aggregation when no sub aggregation exists - Only allocate doc counts per segment in Terms Bucket Aggregator - Remove the priority queue from Terms Bucket Aggregator, return all buckets in build aggregation - Stream search callback - Stream channel listener - Enable c2 compiler for local gradlew run - Disable filter optimization * Add mock stream transport for testing Signed-off-by: bowenlan-amzn <[email protected]> * innerOnResponse delegate to innerOnCompleteResponse for compatibility Signed-off-by: bowenlan-amzn <[email protected]> * Refactor the streaming interface for streaming search Signed-off-by: bowenlan-amzn <[email protected]> * Separating out stream from regular Signed-off-by: Harsha Vamsi Kalluri <[email protected]> * Fix aggregator and split sendBatch Signed-off-by: Harsha Vamsi Kalluri <[email protected]> * refactor and fix some bugs Signed-off-by: bowenlan-amzn <[email protected]> * buildAggBatch return list of internal aggregations Signed-off-by: bowenlan-amzn <[email protected]> * batch reduce size for stream search Signed-off-by: bowenlan-amzn <[email protected]> * Remove stream execution hint Signed-off-by: bowenlan-amzn <[email protected]> * Clean up InternalTerms Signed-off-by: bowenlan-amzn <[email protected]> * Clean up Signed-off-by: bowenlan-amzn <[email protected]> * Refactor duplication in search service Signed-off-by: bowenlan-amzn <[email protected]> * Update change log Signed-off-by: bowenlan-amzn <[email protected]> * clean up Signed-off-by: bowenlan-amzn <[email protected]> * Add tests for StreamingStringTermsAggregator and SendBatch Signed-off-by: Harsha Vamsi Kalluri <[email protected]> Signed-off-by: bowenlan-amzn <[email protected]> * Clean up and address comments Signed-off-by: bowenlan-amzn <[email protected]> * experimental api annotation Signed-off-by: bowenlan-amzn <[email protected]> * change sendBatch to package private Signed-off-by: bowenlan-amzn <[email protected]> * add type Signed-off-by: bowenlan-amzn <[email protected]> --------- Signed-off-by: Rishabh Maurya <[email protected]> Signed-off-by: bowenlan-amzn <[email protected]> Signed-off-by: Harsha Vamsi Kalluri <[email protected]> Co-authored-by: Rishabh Maurya <[email protected]> Co-authored-by: Harsha Vamsi Kalluri <[email protected]>
Description
RFC: #16774
Change based on #18424 Streaming Transport.
High Level Idea
Instead of return one response per shard search, we can return one partial aggregation response per segment and use streaming transport to send it back immediately.
For the existing shard search response, we can still put in all the other results apart from aggregation and send that back as the last response of the stream and complete it.
This way moves the reduce logic and aggregation result memory consumption from data node to coordinator node, making coordinator node the single place to scale up to handle high cardinality aggregation scenarios.
Run the Code
Please see
SubAggregationITunderorg.opensearch.streaming.aggregationAdd these to JUnit run configuration in Intellij
Change List
Entry
streamquery parameter when calling search rest endpoint. User needs to opt in by setstream=trueto use the streaming aggregation featureTransport Layer
StreamTransportSearchActionextendsTransportSearchActionand override async action provider to provide a stream async action.StreamSearchTransportServiceto register request handler and response handler.StreamSearchChannelListenerto send results back from data node.StreamTransportResponsewhich provides stream of results from data node.Coordinator
StreamSearchQueryThenFetchAsyncAction, provide a streaming action listenerStreamSearchActionListenerstream resultsbefore the last one go through a new consumption path.StreamQueryPhaseResultConsumerwhich extends the existingQueryPhaseResultConsumer. It adds a newconsumeStreamResultto pushstream resultsinto the same bufferpendingMerges.pendingMerges. We can sync these 2 callbacks inStreamSearchQueryThenFetchAsyncActionand trigger the onPhaseDone without race condition.Data Node
StreamingStringTermsAggregatorto do the per segment aggregation and able to reset after building aggregation. Comparing to current defaultGlobalOrdinalsStringTermsAggregator, the key changes areBucketCollectorProcessorbuild aggregation batchContextIndexSearchertrigger the aggregation batch build, reset, and send back the batch result usingStreamSearchChannelListenerCoverage
Things Left
reduce_sizeparameter. And research what's the rightreduce_sizeto keep similar accuracy as the current terms aggregationRelated Issues
Resolves #[Issue number to be closed when this PR is merged]
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.