
Conversation

@rishabhmaurya
Contributor

@rishabhmaurya rishabhmaurya commented Oct 1, 2025

Description

Add a streaming aggregation planning layer. It introduces smart fallback logic for streaming aggregations to prevent coordinator overhead and performance regressions. After #19373, a cluster-settings-based flag controls streaming for terms aggregations, so streaming is enabled for all terms aggregations or for none, with no way to control it per request.
The high-level idea is to estimate the cost of running the complete aggregation tree in streaming mode. If the estimated overhead on the coordinator is too high, or streaming may perform poorly compared to the traditional approach of flushing per shard, fall back and recreate the aggregation tree without using Streamable aggregators.
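
As a rough illustration of the idea above, the following sketch walks a toy aggregation tree, checks that every node supports streaming, and compares an estimated cost against a limit. Everything here is an assumption made for illustration: `Node`, `allStreamable`, `estimatedCost`, `useStreaming`, and the multiplicative cost model are hypothetical and are not the classes or the cost model used in this PR.

```java
import java.util.ArrayList;
import java.util.List;

final class StreamingPlanSketch {

    /** Hypothetical aggregation node; not an OpenSearch class. */
    static final class Node {
        final String name;
        final boolean streamable;     // does this aggregator support streaming?
        final long estimatedBuckets;  // buckets this node emits per parent bucket
        final List<Node> children = new ArrayList<>();

        Node(String name, boolean streamable, long estimatedBuckets) {
            this.name = name;
            this.streamable = streamable;
            this.estimatedBuckets = estimatedBuckets;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    /** True only if every aggregator in the tree supports streaming. */
    static boolean allStreamable(Node n) {
        if (!n.streamable) {
            return false;
        }
        for (Node c : n.children) {
            if (!allStreamable(c)) {
                return false;
            }
        }
        return true;
    }

    /**
     * Assumed cost model: total bucket count of the subtree, where each
     * bucket of a node carries one copy of every child subtree.
     */
    static long estimatedCost(Node n) {
        long perBucket = 1;
        for (Node c : n.children) {
            perBucket += estimatedCost(c);
        }
        return n.estimatedBuckets * perBucket;
    }

    /** false means: rebuild the tree with traditional (per-shard) aggregators. */
    static boolean useStreaming(Node root, long maxCoordinatorBuckets) {
        return allStreamable(root) && estimatedCost(root) <= maxCoordinatorBuckets;
    }

    public static void main(String[] args) {
        Node tree = new Node("by_host", true, 1_000).add(new Node("by_status", true, 5));
        System.out.println("estimated cost: " + estimatedCost(tree));
        System.out.println("use streaming: " + useStreaming(tree, 100_000));
    }
}
```

The decision is made once for the whole tree rather than per node, which mirrors the description's "recreate the aggregation tree" wording; a single non-streamable node or an over-limit estimate sends the entire request down the traditional path.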

Changes

Streamable interface for aggregators (and, in the future, other collectors) that support streaming and report cost metrics

FlushModeResolver - Analyzes aggregation cost metrics and decides whether to use streaming or traditional processing

StreamingCostMetrics - Captures bucket count, cardinality, and document estimates for cost analysis

AggregatorTreeEvaluator - Evaluates whether streaming is feasible for the entire aggregation tree and falls back to traditional aggregators when needed.

Enhanced the existing StreamStringTermAggregator to implement the Streamable interface and report cost metrics

Added configurable thresholds for streaming decisions (max buckets, min cardinality ratio, min bucket count)

Automatic fallback to per-shard processing when streaming would overload the coordinator or regress performance in low-cardinality cases.
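
To make the threshold-based decision more concrete, here is a minimal sketch of how the three thresholds listed above might be combined. It is not the PR's implementation: `CostMetrics`, `Mode`, `resolve`, the field names, and in particular the definition of the cardinality ratio (field cardinality divided by estimated documents) are assumptions chosen for illustration.

```java
final class FlushModeSketch {

    /** Hypothetical modes; the PR's actual enum and names may differ. */
    enum Mode { STREAMING, PER_SHARD }

    /** Hypothetical stand-in for the metrics a streamable aggregator reports. */
    static final class CostMetrics {
        final long estimatedBuckets;
        final long fieldCardinality;
        final long estimatedDocs;

        CostMetrics(long estimatedBuckets, long fieldCardinality, long estimatedDocs) {
            this.estimatedBuckets = estimatedBuckets;
            this.fieldCardinality = fieldCardinality;
            this.estimatedDocs = estimatedDocs;
        }
    }

    /** Applies the three thresholds in turn; any failed check means fallback. */
    static Mode resolve(CostMetrics m, long maxBuckets, double minCardinalityRatio, long minBuckets) {
        if (m.estimatedBuckets > maxBuckets) {
            return Mode.PER_SHARD; // too many buckets would reach the coordinator
        }
        if (m.estimatedBuckets < minBuckets) {
            return Mode.PER_SHARD; // too few buckets for streaming to pay off
        }
        // Assumed definition of the ratio: distinct values per document.
        double ratio = m.estimatedDocs == 0 ? 0.0 : (double) m.fieldCardinality / m.estimatedDocs;
        if (ratio < minCardinalityRatio) {
            return Mode.PER_SHARD; // low-cardinality field
        }
        return Mode.STREAMING;
    }

    public static void main(String[] args) {
        CostMetrics highCardinality = new CostMetrics(5_000, 40_000, 100_000);
        CostMetrics lowCardinality = new CostMetrics(5_000, 50, 100_000);
        System.out.println(resolve(highCardinality, 100_000, 0.01, 1_000));
        System.out.println(resolve(lowCardinality, 100_000, 0.01, 1_000));
    }
}
```

The threshold values passed in `main` are arbitrary examples, not the defaults introduced by this PR.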

Related Issues

Resolves #[Issue number to be closed when this PR is merged]

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@rishabhmaurya
Contributor Author

rishabhmaurya commented Oct 1, 2025

@bowenlan-amzn @harshavamsi @mch2 please take a look. I'm still working on testing it and fixing checks

@github-actions
Contributor

github-actions bot commented Oct 1, 2025

❌ Gradle check result for c618a71: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Contributor

github-actions bot commented Oct 1, 2025

❌ Gradle check result for 6e5888c: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@rishabhmaurya rishabhmaurya force-pushed the rishma-stream-planning branch from aa03fc7 to 4c608dd Compare October 2, 2025 19:40
@rishabhmaurya rishabhmaurya force-pushed the rishma-stream-planning branch from 833b301 to 79fa652 Compare October 2, 2025 21:06
@github-project-automation github-project-automation bot moved this from In-Review to In Progress in Performance Roadmap Oct 2, 2025
@rishabhmaurya rishabhmaurya added the backport 3.3 Backport to 3.3 branch label Oct 2, 2025
@github-actions
Contributor

github-actions bot commented Oct 2, 2025

❕ Gradle check result for 79fa652: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

@github-actions
Contributor

github-actions bot commented Oct 3, 2025

✅ Gradle check result for bbb5979: SUCCESS

@rishabhmaurya rishabhmaurya merged commit c851fdf into opensearch-project:main Oct 3, 2025
33 checks passed
@github-project-automation github-project-automation bot moved this from In Progress to Done in Performance Roadmap Oct 3, 2025
opensearch-trigger-bot bot pushed a commit that referenced this pull request Oct 3, 2025
#19488)

* Planning for flush mode for streaming aggs

Signed-off-by: Rishabh Maurya <[email protected]>

* Address PR comments

Signed-off-by: Rishabh Maurya <[email protected]>

* Fix for nested aggs and more unit tests

Signed-off-by: Rishabh Maurya <[email protected]>

* Integ test to validate stream agg used using profile output

Signed-off-by: Rishabh Maurya <[email protected]>

* Make StreamNumericTermsAggregator streamable

Signed-off-by: Rishabh Maurya <[email protected]>

* Integ test for StreamNumericTermsAggregator

Signed-off-by: Rishabh Maurya <[email protected]>

* Improve coverage and PR comments

Signed-off-by: Rishabh Maurya <[email protected]>

* Minor refactor and address PR comments

Signed-off-by: Rishabh Maurya <[email protected]>

---------

Signed-off-by: Rishabh Maurya <[email protected]>
(cherry picked from commit c851fdf)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
rishabhmaurya pushed a commit that referenced this pull request Oct 3, 2025
#19488) (#19512)

* Planning for flush mode for streaming aggs

* Address PR comments

* Fix for nested aggs and more unit tests

* Integ test to validate stream agg used using profile output

* Make StreamNumericTermsAggregator streamable

* Integ test for StreamNumericTermsAggregator

* Improve coverage and PR comments

* Minor refactor and address PR comments

---------

(cherry picked from commit c851fdf)

Signed-off-by: Rishabh Maurya <[email protected]>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
peteralfonsi pushed a commit to peteralfonsi/OpenSearch that referenced this pull request Oct 15, 2025
opensearch-project#19488)


Labels

backport 3.3 Backport to 3.3 branch v3.3.0

Projects

Status: Done


4 participants