Skip to content

Conversation

@Swiddis
Copy link
Collaborator

@Swiddis Swiddis commented Jan 8, 2026

Description

Straightforward update to use the setting in V2 as we do in Calcite. Lots of test changes though :/

Related Issues

N/A

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • New PPL command checklist all confirmed.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff or -s.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Simeon Widdis <[email protected]>
Signed-off-by: Simeon Widdis <[email protected]>
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 8, 2026

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Increased composite aggregation bucket capacity from 1,000 to 10,000 entries
    • Aggregations now honor cluster-level bucket limits for safer queries
  • Documentation

    • Updated example request/response payloads to show the larger aggregation size

✏️ Tip: You can customize this high-level summary in your review settings.

Walkthrough

This PR increases the default composite aggregation bucket size from 1000 to 10000 in docs and test expectations, and adds configurable bucket-size support via new methods in OpenSearchRequestBuilder and AggregationQueryBuilder, with related test adjustments.

Changes

Cohort / File(s) Summary
Documentation
docs/user/optimization/optimization.rst
Example payloads updated to show composite aggregation size 10000
Test expected outputs (PPL big5)
integ-test/src/test/resources/expectedOutput/ppl/big5/*
Multiple YAML expectations updated: composite.size from 1000 → 10000
Test expected outputs (clickbench & misc)
integ-test/src/test/resources/expectedOutput/ppl/clickbench/*, explain_*.{yaml,json}, explain_*_agg_push.*
Many expected outputs updated to reflect composite.size = 10000 and minor formatting changes in sources
Request builder
opensearch/src/main/java/org/opensearch/sql/opensearch/request/OpenSearchRequestBuilder.java
Added getQueryBucketSize() and getMaxBuckets() to provide bucket size with cluster-limit enforcement
Index scan aggregation integration
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/OpenSearchIndexScanAggregationBuilder.java
Passes requestBuilder.getQueryBucketSize() into AggregationQueryBuilder constructor
AggregationQueryBuilder logic
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/aggregation/AggregationQueryBuilder.java
Added queryBucketSize field, new constructor AggregationQueryBuilder(ExpressionSerializer,int), and use of queryBucketSize instead of hard-coded bucket size
Unit tests
opensearch/src/test/java/.../OpenSearchIndexScanOptimizationTest.java, .../AggregationQueryBuilderTest.java
Added lenient stub for getQueryBucketSize() and new test validating custom bucket-size constructor
YAML assertion utility
integ-test/src/test/java/org/opensearch/sql/util/MatcherUtils.java
Reworked assertYamlEquals to use YAML parsing/semantic comparison with YAMLMapper and formatted diffs

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • Support composite aggregation paginating #4884 — Modifies composite-aggregation construction and request-building code to control composite bucket sizing/pagination semantics (closely related changes to AggregationQueryBuilder/OpenSearchRequestBuilder).

Suggested labels

PPL, pushdown, calcite

Suggested reviewers

  • LantaoJin
  • qianheng-aws
  • penghuo
  • ps48
  • kavithacm
  • derek-ho
  • joshuali925
  • GumpacG
  • anirudha
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 38.89% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title directly describes the main change: updating V2 to reference the plugins.query.buckets setting, which aligns with the core objective and code changes throughout the PR.
Description check ✅ Passed The description clearly relates to the changeset by explaining that it updates V2 to use the plugins.query.buckets setting (matching Calcite behavior) and acknowledges the extensive test updates required.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2711d23 and 6f302ce.

📒 Files selected for processing (1)
  • integ-test/src/test/java/org/opensearch/sql/util/MatcherUtils.java
🧰 Additional context used
📓 Path-based instructions (2)
**/*.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*.java: Use PascalCase for class names (e.g., QueryExecutor)
Use camelCase for method and variable names (e.g., executeQuery)
Use UPPER_SNAKE_CASE for constants (e.g., MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
Prefer Optional<T> for nullable returns in Java
Avoid unnecessary object creation in loops
Use StringBuilder for string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code

Files:

  • integ-test/src/test/java/org/opensearch/sql/util/MatcherUtils.java

⚙️ CodeRabbit configuration file

**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring

  • Flag classes >500 lines as needing organization review
  • Check for dead code, unused imports, and unused variables
  • Identify code reuse opportunities across similar implementations
  • Assess holistic maintainability - is code easy to understand and modify?
  • Flag code that appears AI-generated without sufficient human review
  • Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
  • Check for proper JavaDoc on public classes and methods
  • Flag redundant comments that restate obvious code
  • Ensure proper error handling with specific exception types
  • Check for Optional usage instead of null returns
  • Validate proper use of try-with-resources for resource management

Files:

  • integ-test/src/test/java/org/opensearch/sql/util/MatcherUtils.java
**/test/**/*.java

⚙️ CodeRabbit configuration file

**/test/**/*.java: - Verify NULL input tests for all new functions

  • Check boundary condition tests (min/max values, empty inputs)
  • Validate error condition tests (invalid inputs, exceptions)
  • Ensure multi-document tests for per-document operations
  • Flag smoke tests without meaningful assertions
  • Check test naming follows pattern: test
  • Verify test data is realistic and covers edge cases
  • Verify test coverage for new business logic
  • Ensure tests are independent and don't rely on execution order
  • Validate meaningful test data that reflects real-world scenarios
  • Check for proper cleanup of test resources

Files:

  • integ-test/src/test/java/org/opensearch/sql/util/MatcherUtils.java
🧬 Code graph analysis (1)
integ-test/src/test/java/org/opensearch/sql/util/MatcherUtils.java (2)
core/src/main/java/org/opensearch/sql/utils/YamlFormatter.java (1)
  • YamlFormatter (19-51)
core/src/test/java/org/opensearch/sql/utils/YamlFormatter.java (1)
  • YamlFormatter (18-43)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
  • GitHub Check: security-it-linux (21)
  • GitHub Check: security-it-linux (25)
  • GitHub Check: build-linux (25, unit)
  • GitHub Check: build-linux (21, integration)
  • GitHub Check: bwc-tests-rolling-upgrade (21)
  • GitHub Check: build-linux (21, unit)
  • GitHub Check: build-linux (25, integration)
  • GitHub Check: bwc-tests-full-restart (25)
  • GitHub Check: build-linux (21, doc)
  • GitHub Check: bwc-tests-rolling-upgrade (25)
  • GitHub Check: build-linux (25, doc)
  • GitHub Check: bwc-tests-full-restart (21)
  • GitHub Check: build-windows-macos (macos-14, 25, unit)
  • GitHub Check: build-windows-macos (macos-14, 21, doc)
  • GitHub Check: build-windows-macos (macos-14, 25, doc)
  • GitHub Check: build-windows-macos (macos-14, 25, integration)
  • GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
  • GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
  • GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
  • GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
  • GitHub Check: build-windows-macos (macos-14, 21, unit)
  • GitHub Check: build-windows-macos (macos-14, 21, integration)
  • GitHub Check: security-it-windows-macos (macos-14, 21)
  • GitHub Check: security-it-windows-macos (windows-latest, 21)
  • GitHub Check: security-it-windows-macos (macos-14, 25)
  • GitHub Check: security-it-windows-macos (windows-latest, 25)
  • GitHub Check: CodeQL-Scan (java)
  • GitHub Check: test-sql-cli-integration (21)
🔇 Additional comments (2)
integ-test/src/test/java/org/opensearch/sql/util/MatcherUtils.java (2)

22-22: LGTM!

The addition of YAMLMapper for parsing YAML strings into objects is appropriate. Since this mapper is only used for deserialization (parsing), the default configuration is sufficient. The actual formatting for diff output correctly delegates to YamlFormatter.formatToYaml() which has the proper serialization settings configured.

Also applies to: 48-48


472-491: Well-structured YAML-aware comparison with appropriate fallback.

The implementation correctly:

  1. Parses YAML into objects for semantic comparison (ignoring formatting differences)
  2. Uses YamlFormatter.formatToYaml() for consistent diff output when assertions fail
  3. Falls back to string comparison if parsing fails

The intentional pattern on line 480 where assertEquals is called after confirming inequality is a good technique for generating readable assertion error messages.

One minor observation: the catch block on line 482 uses a broad Exception type. While JsonProcessingException would be more specific, the defensive approach is reasonable for a test utility fallback path.


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

🤖 Fix all issues with AI agents
In @integ-test/src/test/resources/expectedOutput/ppl/clickbench/q15.yaml:
- Line 21: Update the aggregation bucket size to 10000 everywhere: change the
AGGREGATION_BUCKET_SIZE constant in AggregationQueryBuilder.java from 1000 to
10000, and update the Calcite expected-output test files that still use
"size":1000 (e.g., explain_agg_counts_by2.yaml, explain_dedup_complex2.yaml,
explain_stats_by_span.json and other files under the calcite expectedOutput set)
to use "size":10000 so they match the PPL test files; after making these edits
regenerate or re-run the affected integration tests to refresh expectations.

In @integ-test/src/test/resources/expectedOutput/ppl/clickbench/q17.yaml:
- Line 19: The expected output contains a hardcoded composite aggregation size
of 10000; update the composite aggregation size value to 1000 to match the
default plugins.query.buckets setting (default defined in SQLIntegTestCase.java
/ UnifiedQueryOpenSearchIT.java / AggregationQueryBuilder.java) so the test
reflects the default behavior (change the "size":10000 entry under the
"composite" aggregation to "size":1000).

In @integ-test/src/test/resources/expectedOutput/ppl/clickbench/q32.yaml:
- Line 21: Update the hardcoded composite bucket size in the expected YAML back
to the configured default of 1000: find the occurrence of
"aggregations\":{\"composite_buckets\":{\"composite\":{\"size\":10000" in
integ-test/src/test/resources/expectedOutput/ppl/clickbench/q32.yaml and change
the size value from 10000 to 1000 so the expected output matches the configured
plugins.query.buckets default used by UnifiedQueryOpenSearchIT,
SQLIntegTestCase, and StandaloneIT.

In @integ-test/src/test/resources/expectedOutput/ppl/clickbench/q41.yaml:
- Line 33: The PPL test expected output for q41.yaml should use a composite
aggregation bucket size of 10000; update the composite aggregation JSON in
integ-test/src/test/resources/expectedOutput/ppl/clickbench/q41.yaml so the
"composite":{"size":10000,...} appears for the PPL/non-calcite variant while
leaving calcite-enabled test files at size 1000 (they are driven by
plugins.query.buckets default of 1000); ensure any other PPL expectedOutput
files that mirror this change are updated consistently and do not change
calcite-enabled fixtures.

In @integ-test/src/test/resources/expectedOutput/ppl/clickbench/q8.yaml:
- Line 19: The expected output contains a hardcoded composite bucket size of
10000; update the value to 1000 to match the default
plugins.query.buckets/DEFAULT_QUERY_BUCKET_SIZE used in the codebase. Locate the
string "aggregations\":{\"composite_buckets\":{\"composite\":{\"size\":10000" in
the expected YAML (ppl/clickbench/q8.yaml) and replace 10000 with 1000 so test
fixtures align with SQLIntegTestCase.DEFAULT_QUERY_BUCKET_SIZE.

In @integ-test/src/test/resources/expectedOutput/ppl/clickbench/q9.yaml:
- Line 19: The composite aggregation bucket size is hardcoded as "size":10000 in
the test expected output but should match the codebase default
AGGREGATION_BUCKET_SIZE and the integration test setting plugins.query.buckets
(1000); update the YAML entry that contains the composite aggregation size (the
fragment with "composite":{"size":10000,...) to use "size":1000 so the expected
output aligns with AggregationQueryBuilder.AGGREGATION_BUCKET_SIZE and the
plugins.query.buckets default.

In
@opensearch/src/main/java/org/opensearch/sql/opensearch/request/OpenSearchRequestBuilder.java:
- Around line 357-365: The getQueryBucketSize method can NPE when
settings.getSettingValue(Settings.Key.QUERY_BUCKET_SIZE) returns null and also
may return non-positive values; update getQueryBucketSize to read the configured
value into a local Integer, treat null as a safe default (e.g., 1 or a
configured fallback), then compute the bucket size as the min between that safe
value and getMaxBuckets(), and finally ensure the returned value is at least 1
(or another validated minimum) before returning; reference
Settings.Key.QUERY_BUCKET_SIZE, settings.getSettingValue(...),
getQueryBucketSize(), and getMaxBuckets() when making the change.
- Around line 367-378: getMaxBuckets currently swallows all Exceptions; change
it to catch only the expected setting-related exceptions (e.g.,
IllegalArgumentException, NoSuchElementException, or the specific
Settings/Config exception thrown by settings.getSettingValue) instead of a bare
Exception, log a warning or info with the exception and that DEFAULT_MAX_BUCKETS
will be used, and return DEFAULT_MAX_BUCKETS; ensure a logger (e.g., private
static final Logger logger) is available in OpenSearchRequestBuilder and use
logger.warn("Using default max buckets due to error reading setting", e) (or
similar) to record the fallback.
🧹 Nitpick comments (2)
opensearch/src/test/java/org/opensearch/sql/opensearch/storage/scan/OpenSearchIndexScanOptimizationTest.java (1)

107-109: Consider whether lenient stubbing is necessary.

The lenient() stub for getQueryBucketSize() allows the method to be called without strict verification. If all test scenarios are expected to call this method (which appears likely given the aggregation query building), consider using regular when() stubbing instead for stricter test assertions.

Alternative with strict stubbing
-    lenient()
-        .when(requestBuilder.getQueryBucketSize())
-        .thenReturn(AggregationQueryBuilder.AGGREGATION_BUCKET_SIZE);
+    when(requestBuilder.getQueryBucketSize())
+        .thenReturn(AggregationQueryBuilder.AGGREGATION_BUCKET_SIZE);
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/aggregation/AggregationQueryBuilder.java (1)

56-69: Consider adding validation for the bucket size parameter.

The new constructor and field correctly implement configurable bucket sizing with good backward compatibility via constructor delegation. However, there's no validation to ensure queryBucketSize is within reasonable bounds (e.g., positive and not excessively large).

✨ Proposed validation
  /** Aggregation Query Builder Constructor with custom bucket size. */
  public AggregationQueryBuilder(ExpressionSerializer serializer, int queryBucketSize) {
+   if (queryBucketSize <= 0) {
+     throw new IllegalArgumentException(
+       String.format("Query bucket size must be positive, got: %d", queryBucketSize));
+   }
    this.compositeBuilder = new CompositeAggregationBuilder(serializer);
    this.metricBuilder = new MetricAggregationBuilder(serializer);
    this.queryBucketSize = queryBucketSize;
  }

Alternatively, consider documenting in JavaDoc the expected valid range for the queryBucketSize parameter.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 46302b7 and 2711d23.

📒 Files selected for processing (48)
  • docs/user/optimization/optimization.rst
  • integ-test/src/test/resources/expectedOutput/ppl/big5/composite_date_histogram_daily.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/big5/composite_terms.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/big5/composite_terms_keyword.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/big5/date_histogram_hourly_agg.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/big5/date_histogram_minute_agg.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/big5/keyword_terms.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/big5/keyword_terms_low_cardinality.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/big5/multi_terms_keyword.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/big5/terms_significant_1.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/big5/terms_significant_2.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q10.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q11.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q12.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q13.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q14.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q15.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q16.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q17.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q18.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q22.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q23.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q28.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q31.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q32.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q33.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q34.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q37.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q38.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q41.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q42.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q8.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q9.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/explain_filter_agg_push.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/explain_output.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/explain_patterns_agg_push.json
  • integ-test/src/test/resources/expectedOutput/ppl/explain_patterns_simple_pattern_agg_push.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/explain_sort_agg_push.json
  • integ-test/src/test/resources/expectedOutput/ppl/explain_sort_then_agg_push.json
  • integ-test/src/test/resources/expectedOutput/ppl/explain_stats_by_span.json
  • integ-test/src/test/resources/expectedOutput/ppl/explain_stats_by_span_non_bucket_nullable.json
  • integ-test/src/test/resources/expectedOutput/ppl/explain_stats_by_timespan.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/explain_stats_by_timespan2.yaml
  • opensearch/src/main/java/org/opensearch/sql/opensearch/request/OpenSearchRequestBuilder.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/OpenSearchIndexScanAggregationBuilder.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/aggregation/AggregationQueryBuilder.java
  • opensearch/src/test/java/org/opensearch/sql/opensearch/storage/scan/OpenSearchIndexScanOptimizationTest.java
  • opensearch/src/test/java/org/opensearch/sql/opensearch/storage/script/aggregation/AggregationQueryBuilderTest.java
🧰 Additional context used
📓 Path-based instructions (5)
integ-test/src/test/resources/**/*

⚙️ CodeRabbit configuration file

integ-test/src/test/resources/**/*: - Verify test data is realistic and representative

  • Check data format matches expected schema
  • Ensure test data covers edge cases and boundary conditions

Files:

  • integ-test/src/test/resources/expectedOutput/ppl/big5/composite_terms.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q33.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q10.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q42.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q23.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q14.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/explain_filter_agg_push.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q41.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/explain_sort_then_agg_push.json
  • integ-test/src/test/resources/expectedOutput/ppl/explain_stats_by_span_non_bucket_nullable.json
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q9.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/big5/keyword_terms.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q8.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/big5/terms_significant_2.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/big5/multi_terms_keyword.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q31.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q32.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q37.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q38.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/big5/composite_terms_keyword.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q13.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q22.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/explain_patterns_agg_push.json
  • integ-test/src/test/resources/expectedOutput/ppl/big5/terms_significant_1.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q12.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/explain_stats_by_timespan.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/big5/date_histogram_hourly_agg.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q28.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q17.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/big5/composite_date_histogram_daily.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q11.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/explain_sort_agg_push.json
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q15.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/big5/keyword_terms_low_cardinality.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/explain_output.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q18.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/explain_stats_by_span.json
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q16.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/big5/date_histogram_minute_agg.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/explain_patterns_simple_pattern_agg_push.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/explain_stats_by_timespan2.yaml
  • integ-test/src/test/resources/expectedOutput/ppl/clickbench/q34.yaml
**/*.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*.java: Use PascalCase for class names (e.g., QueryExecutor)
Use camelCase for method and variable names (e.g., executeQuery)
Use UPPER_SNAKE_CASE for constants (e.g., MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
Prefer Optional<T> for nullable returns in Java
Avoid unnecessary object creation in loops
Use StringBuilder for string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code

Files:

  • opensearch/src/main/java/org/opensearch/sql/opensearch/request/OpenSearchRequestBuilder.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/OpenSearchIndexScanAggregationBuilder.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/aggregation/AggregationQueryBuilder.java
  • opensearch/src/test/java/org/opensearch/sql/opensearch/storage/script/aggregation/AggregationQueryBuilderTest.java
  • opensearch/src/test/java/org/opensearch/sql/opensearch/storage/scan/OpenSearchIndexScanOptimizationTest.java

⚙️ CodeRabbit configuration file

**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring

  • Flag classes >500 lines as needing organization review
  • Check for dead code, unused imports, and unused variables
  • Identify code reuse opportunities across similar implementations
  • Assess holistic maintainability - is code easy to understand and modify?
  • Flag code that appears AI-generated without sufficient human review
  • Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
  • Check for proper JavaDoc on public classes and methods
  • Flag redundant comments that restate obvious code
  • Ensure proper error handling with specific exception types
  • Check for Optional usage instead of null returns
  • Validate proper use of try-with-resources for resource management

Files:

  • opensearch/src/main/java/org/opensearch/sql/opensearch/request/OpenSearchRequestBuilder.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/OpenSearchIndexScanAggregationBuilder.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/aggregation/AggregationQueryBuilder.java
  • opensearch/src/test/java/org/opensearch/sql/opensearch/storage/script/aggregation/AggregationQueryBuilderTest.java
  • opensearch/src/test/java/org/opensearch/sql/opensearch/storage/scan/OpenSearchIndexScanOptimizationTest.java
docs/**/*.rst

⚙️ CodeRabbit configuration file

docs/**/*.rst: - Verify examples use valid test data and indices

  • Check documentation follows existing patterns and structure
  • Validate syntax examples are complete and correct
  • Ensure code examples are tested and working
  • Check for consistent formatting and style

Files:

  • docs/user/optimization/optimization.rst
**/*Test.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*Test.java: All new business logic requires unit tests
Name unit tests with *Test.java suffix in OpenSearch SQL

Files:

  • opensearch/src/test/java/org/opensearch/sql/opensearch/storage/script/aggregation/AggregationQueryBuilderTest.java
  • opensearch/src/test/java/org/opensearch/sql/opensearch/storage/scan/OpenSearchIndexScanOptimizationTest.java
**/test/**/*.java

⚙️ CodeRabbit configuration file

**/test/**/*.java: - Verify NULL input tests for all new functions

  • Check boundary condition tests (min/max values, empty inputs)
  • Validate error condition tests (invalid inputs, exceptions)
  • Ensure multi-document tests for per-document operations
  • Flag smoke tests without meaningful assertions
  • Check test naming follows pattern: test
  • Verify test data is realistic and covers edge cases
  • Verify test coverage for new business logic
  • Ensure tests are independent and don't rely on execution order
  • Validate meaningful test data that reflects real-world scenarios
  • Check for proper cleanup of test resources

Files:

  • opensearch/src/test/java/org/opensearch/sql/opensearch/storage/script/aggregation/AggregationQueryBuilderTest.java
  • opensearch/src/test/java/org/opensearch/sql/opensearch/storage/scan/OpenSearchIndexScanOptimizationTest.java
🧠 Learnings (2)
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes

Applied to files:

  • integ-test/src/test/resources/expectedOutput/ppl/explain_stats_by_span_non_bucket_nullable.json
  • opensearch/src/test/java/org/opensearch/sql/opensearch/storage/scan/OpenSearchIndexScanOptimizationTest.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL

Applied to files:

  • opensearch/src/test/java/org/opensearch/sql/opensearch/storage/scan/OpenSearchIndexScanOptimizationTest.java
🧬 Code graph analysis (1)
opensearch/src/main/java/org/opensearch/sql/opensearch/request/OpenSearchRequestBuilder.java (1)
common/src/main/java/org/opensearch/sql/common/setting/Settings.java (1)
  • Settings (17-105)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
  • GitHub Check: security-it-linux (25)
  • GitHub Check: security-it-linux (21)
  • GitHub Check: build-linux (21, integration)
  • GitHub Check: build-linux (25, integration)
  • GitHub Check: build-linux (25, unit)
  • GitHub Check: build-linux (21, unit)
  • GitHub Check: build-linux (21, doc)
  • GitHub Check: build-linux (25, doc)
  • GitHub Check: bwc-tests-full-restart (25)
  • GitHub Check: bwc-tests-rolling-upgrade (21)
  • GitHub Check: bwc-tests-rolling-upgrade (25)
  • GitHub Check: bwc-tests-full-restart (21)
  • GitHub Check: build-windows-macos (macos-14, 21, integration)
  • GitHub Check: build-windows-macos (macos-14, 21, doc)
  • GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
  • GitHub Check: security-it-windows-macos (windows-latest, 21)
  • GitHub Check: build-windows-macos (macos-14, 25, doc)
  • GitHub Check: build-windows-macos (macos-14, 25, integration)
  • GitHub Check: security-it-windows-macos (macos-14, 21)
  • GitHub Check: build-windows-macos (macos-14, 25, unit)
  • GitHub Check: security-it-windows-macos (macos-14, 25)
  • GitHub Check: security-it-windows-macos (windows-latest, 25)
  • GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
  • GitHub Check: build-windows-macos (macos-14, 21, unit)
  • GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
  • GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
  • GitHub Check: test-sql-cli-integration (21)
  • GitHub Check: CodeQL-Scan (java)
🔇 Additional comments (33)
integ-test/src/test/resources/expectedOutput/ppl/explain_sort_then_agg_push.json (1)

10-10: Test expectation files showing composite aggregation size of 10000 are new additions, but claims about "new default bucket size behavior" are unverified.

This commit adds new test expectation files including explain_sort_then_agg_push.json with a composite aggregation size of 10000. The test data itself is realistic and valid for the sort-then-aggregation pushdown scenario with state grouping and average balance calculation.

However, the original review's claim that this reflects a "new default bucket size behavior" cannot be verified:

  • AGGREGATION_BUCKET_SIZE constant remains 1000 (unchanged)
  • Test settings configure QUERY_BUCKET_SIZE as 1000
  • The source of the 10000 value in the test expectations is unclear from this commit alone

Verify that:

  1. The 10000 composite size matches the actual behavior produced by the query execution
  2. This value originates from the proper source (test configuration, queryBucketSize parameter, or bucket size capping logic)
  3. The test expectations align with the implementation changes in preceding commits
integ-test/src/test/resources/expectedOutput/ppl/big5/terms_significant_2.yaml (1)

19-19: Update test data size parameter to 10000 in composite aggregation fixtures.

Line 19 of this file contains expected output for a composite aggregation query with the bucket size set to 10000. This aligns with the PR objective to reference the plugins.query.buckets configuration setting.

The test data structure and format are correct. However, before approving this change, verify:

  1. Edge case test coverage: Ensure test scenarios include cases where result sets approach or exceed the 10000 bucket limit to validate behavior at boundaries.
  2. Bounds and validation: Confirm that the plugins.query.buckets setting has appropriate validation to prevent excessively large or invalid values.
  3. Performance testing: Conduct testing with the new 10000 bucket default to assess memory and performance impact on queries returning large result sets.

Per the test data guidelines for this path, confirm that the test data remains realistic and representative of actual query patterns, and that boundary conditions are adequately covered.

integ-test/src/test/resources/expectedOutput/ppl/explain_stats_by_timespan.yaml (1)

10-14: Clarify bucket size value in expected output and test coverage gap.

The expected output shows "size":10000 for composite aggregations, but the test does not explicitly configure bucket sizes—it uses the default plugins.query.buckets value of 1000. The origin of the 10000 value in the output needs clarification, as it does not align with the configured default in ExplainIT.

Additionally, the review comment claims the files represent an "update from 1000 to 10000," but git show indicates these are newly created files in commit 2711d23, not updates from previous values.

Regarding test coverage: Custom and edge case bucket size testing exists in CalciteExplainIT (using setQueryBucketSize()), but the specific test using this file (ExplainIT.testStatsByTimeSpan) does not vary bucket sizes. If the PR objective is to test the plugins.query.buckets setting behavior, consider adding parameterized tests with different bucket sizes to ExplainIT, or verify that SysLimit defaults are intentionally providing the 10000 value independently of QUERY_BUCKET_SIZE.

integ-test/src/test/resources/expectedOutput/ppl/clickbench/q18.yaml (1)

15-15: LGTM! Test expectation correctly updated.

The composite bucket size update to 10000 matches the new default and is consistent with related test files.

integ-test/src/test/resources/expectedOutput/ppl/clickbench/q34.yaml (1)

19-19: LGTM! Test expectation correctly updated.

The composite bucket size update to 10000 matches the new default and is consistent with related test files.

integ-test/src/test/resources/expectedOutput/ppl/explain_output.yaml (1)

28-28: LGTM! Test expectation correctly updated.

The composite bucket size update to 10000 matches the new default and is consistent with related test files.

integ-test/src/test/resources/expectedOutput/ppl/big5/composite_date_histogram_daily.yaml (1)

19-19: LGTM! Test expectation correctly updated.

The composite bucket size update to 10000 matches the new default and is consistent with related test files.

integ-test/src/test/resources/expectedOutput/ppl/clickbench/q10.yaml (1)

19-19: Correct the bucket size assumption: 10000 does not match the plugins.query.buckets default.

The plugins.query.buckets setting defaults to 1000, not 10000. The expected output files show composite aggregation size of 10000, which differs from the actual setting default documented in settings.md and configured in UnifiedQueryOpenSearchIT.java. Verify whether 10000 is intentional for the composite aggregation (which may be a different limit than plugins.query.buckets) or if it should be aligned with the actual default of 1000.

Likely an incorrect or invalid review comment.

integ-test/src/test/resources/expectedOutput/ppl/explain_sort_agg_push.json (1)

22-22: Expected output correctly updated to reflect new default bucket size.

The composite aggregation size change from 1000 to 10000 aligns with the PR's introduction of configurable bucket sizes and the updated default value.

integ-test/src/test/resources/expectedOutput/ppl/clickbench/q14.yaml (1)

21-21: Expected output correctly updated.

The change consistently reflects the new default composite aggregation bucket size of 10000.

integ-test/src/test/resources/expectedOutput/ppl/clickbench/q13.yaml (1)

21-21: Expected output correctly updated.

The composite bucket size change is consistent with the PR's objectives and matches updates in other test expectation files.

integ-test/src/test/resources/expectedOutput/ppl/clickbench/q42.yaml (1)

31-31: Expected output correctly updated.

The change appropriately reflects the new default composite aggregation bucket size.

integ-test/src/test/resources/expectedOutput/ppl/clickbench/q12.yaml (1)

21-21: Expected output correctly updated.

The composite bucket size change is consistent across all reviewed test expectation files and correctly reflects the new configurable bucket size behavior introduced in this PR.

integ-test/src/test/resources/expectedOutput/ppl/clickbench/q38.yaml (1)

33-33: LGTM! Composite aggregation size updated correctly.

The bucket size increase from 1000 to 10000 aligns with the PR objective to reference the plugins.query.buckets setting in V2. The test expectation accurately reflects the new default behavior.

integ-test/src/test/resources/expectedOutput/ppl/clickbench/q11.yaml (1)

21-21: LGTM! Bucket size updated consistently.

The composite aggregation size change to 10000 is consistent with the configurable bucket size functionality introduced in this PR.

integ-test/src/test/resources/expectedOutput/ppl/big5/composite_terms.yaml (1)

19-19: LGTM! Multi-source composite aggregation updated correctly.

The bucket size increase appropriately covers the multi-field composite aggregation scenario (process.name and cloud.region), ensuring comprehensive test coverage for the new configurable bucket size feature.

integ-test/src/test/resources/expectedOutput/ppl/big5/keyword_terms_low_cardinality.yaml (1)

19-19: LGTM! Low cardinality test updated appropriately.

The bucket size update to 10000 is appropriate for the low cardinality keyword terms scenario. This test case provides good coverage for edge cases where the actual result size may be significantly lower than the bucket limit.

integ-test/src/test/resources/expectedOutput/ppl/explain_stats_by_span.json (1)

11-11: LGTM! Histogram-based composite aggregation updated correctly.

The bucket size update to 10000 extends to histogram-based composite aggregations (span operation), demonstrating comprehensive test coverage across different aggregation types (terms and histogram).

integ-test/src/test/resources/expectedOutput/ppl/clickbench/q16.yaml (1)

19-19: LGTM: Composite bucket size updated to reflect new default.

The change from 1000 to 10000 correctly reflects the new default composite aggregation bucket size introduced by this PR's configurable bucket size feature.

integ-test/src/test/resources/expectedOutput/ppl/clickbench/q22.yaml (1)

24-24: LGTM: Composite bucket size updated to reflect new default.

The change from 1000 to 10000 correctly reflects the new default composite aggregation bucket size.

integ-test/src/test/resources/expectedOutput/ppl/clickbench/q28.yaml (1)

26-26: LGTM: Composite bucket size updated to reflect new default.

The change from 1000 to 10000 correctly reflects the new default composite aggregation bucket size.

integ-test/src/test/resources/expectedOutput/ppl/clickbench/q37.yaml (1)

33-33: LGTM: Composite bucket size updated to reflect new default.

The change from 1000 to 10000 correctly reflects the new default composite aggregation bucket size.

integ-test/src/test/resources/expectedOutput/ppl/explain_stats_by_span_non_bucket_nullable.json (1)

10-10: Remove misleading claim about "new default" bucket size.

The composite aggregation size of 10000 in this test file is the configured value for PPL integration tests (via DEFAULT_SUBSEARCH_MAXOUT), not a change to the default constant (AGGREGATION_BUCKET_SIZE = 1000). The AggregationQueryBuilder does support configurable bucket sizes, but this test expectation reflects PPL's specific test configuration, not a change to the codebase default. The test data itself is valid, realistic, and appropriately covers the non-bucket-nullable edge case.

Likely an incorrect or invalid review comment.

integ-test/src/test/resources/expectedOutput/ppl/clickbench/q33.yaml (1)

19-19: LGTM! Test expectation correctly updated.

The composite aggregation size has been updated from 1000 to 10000, correctly reflecting the new default bucket size behavior introduced in this PR.

opensearch/src/test/java/org/opensearch/sql/opensearch/storage/script/aggregation/AggregationQueryBuilderTest.java (2)

716-750: LGTM! Comprehensive test coverage for custom bucket size.

The test properly validates that the new AggregationQueryBuilder(serializer, int bucketSize) constructor correctly applies the custom bucket size (500) to the generated composite aggregation. The test follows existing patterns and naming conventions.


783-797: LGTM! Helper method follows existing patterns.

The buildQueryWithCustomBuilder() helper appropriately mirrors the existing buildQuery() pattern, enabling test coverage for the custom builder constructor.

docs/user/optimization/optimization.rst (3)

283-283: LGTM! Documentation updated consistently.

The composite aggregation size has been updated from 1000 to 10000 across all documentation examples, correctly reflecting the new default bucket size behavior.


309-309: LGTM! Consistent with other documentation updates.

The composite size update is consistent across all examples in the documentation.


344-344: LGTM! Final documentation example updated consistently.

All three examples in the documentation now consistently show the new composite size of 10000.

opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/OpenSearchIndexScanAggregationBuilder.java (1)

57-58: LGTM! Correct implementation of configurable bucket size.

The AggregationQueryBuilder instantiation now properly passes the bucket size from requestBuilder.getQueryBucketSize(), enabling dynamic configuration based on cluster settings instead of hard-coded values. This aligns with the PR objective to reference plugins.query.buckets in V2.

The underlying implementation correctly bounds the configured bucket size using Math.min() against the cluster's maximum buckets limit, with a fallback to DEFAULT_MAX_BUCKETS if the cluster setting is unavailable. Edge cases are properly handled.

opensearch/src/main/java/org/opensearch/sql/opensearch/request/OpenSearchRequestBuilder.java (1)

12-12: LGTM!

The static import for DEFAULT_MAX_BUCKETS is appropriately used as a fallback value in the getMaxBuckets() method.

opensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/aggregation/AggregationQueryBuilder.java (2)

47-48: LGTM!

The comment update correctly reflects that AGGREGATION_BUCKET_SIZE now serves as the default value rather than a fixed constant.


106-106: LGTM!

The update correctly uses the configurable queryBucketSize field instead of the hard-coded constant, achieving the PR's objective of making bucket size configurable in V2.

Comment on lines +357 to +365
/**
* Get the configured query bucket size, which is the minimum of the plugin setting and the
* cluster's max buckets limit.
*
* @return the query bucket size to use for aggregations
*/
public Integer getQueryBucketSize() {
return Math.min(settings.getSettingValue(Settings.Key.QUERY_BUCKET_SIZE), getMaxBuckets());
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Add null safety and validate the returned bucket size.

The method has potential robustness issues:

  1. If settings.getSettingValue(Settings.Key.QUERY_BUCKET_SIZE) returns null, Math.min() will throw a NullPointerException during auto-unboxing.
  2. No validation ensures the result is positive.

Consider adding defensive checks to handle edge cases gracefully.

🛡️ Proposed fix with null safety and validation
  public Integer getQueryBucketSize() {
-   return Math.min(settings.getSettingValue(Settings.Key.QUERY_BUCKET_SIZE), getMaxBuckets());
+   Integer configuredSize = settings.getSettingValue(Settings.Key.QUERY_BUCKET_SIZE);
+   if (configuredSize == null || configuredSize <= 0) {
+     return getMaxBuckets();
+   }
+   return Math.min(configuredSize, getMaxBuckets());
  }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
/**
* Get the configured query bucket size, which is the minimum of the plugin setting and the
* cluster's max buckets limit.
*
* @return the query bucket size to use for aggregations
*/
public Integer getQueryBucketSize() {
return Math.min(settings.getSettingValue(Settings.Key.QUERY_BUCKET_SIZE), getMaxBuckets());
}
/**
* Get the configured query bucket size, which is the minimum of the plugin setting and the
* cluster's max buckets limit.
*
* @return the query bucket size to use for aggregations
*/
public Integer getQueryBucketSize() {
Integer configuredSize = settings.getSettingValue(Settings.Key.QUERY_BUCKET_SIZE);
if (configuredSize == null || configuredSize <= 0) {
return getMaxBuckets();
}
return Math.min(configuredSize, getMaxBuckets());
}
🤖 Prompt for AI Agents
In
@opensearch/src/main/java/org/opensearch/sql/opensearch/request/OpenSearchRequestBuilder.java
around lines 357 - 365, The getQueryBucketSize method can NPE when
settings.getSettingValue(Settings.Key.QUERY_BUCKET_SIZE) returns null and also
may return non-positive values; update getQueryBucketSize to read the configured
value into a local Integer, treat null as a safe default (e.g., 1 or a
configured fallback), then compute the bucket size as the min between that safe
value and getMaxBuckets(), and finally ensure the returned value is at least 1
(or another validated minimum) before returning; reference
Settings.Key.QUERY_BUCKET_SIZE, settings.getSettingValue(...),
getQueryBucketSize(), and getMaxBuckets() when making the change.

Comment on lines +367 to +378
/**
* Get the maximum number of buckets allowed by the cluster.
*
* @return the max buckets setting value, or default if not available
*/
private Integer getMaxBuckets() {
try {
return settings.getSettingValue(Settings.Key.SEARCH_MAX_BUCKETS);
} catch (Exception e) {
return DEFAULT_MAX_BUCKETS;
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Refine exception handling and add logging.

The current implementation has two issues:

  1. Catching bare Exception is overly broad and may hide unexpected errors.
  2. The exception is silently swallowed, which could obscure configuration problems during troubleshooting.

Consider catching specific exception types and logging the fallback behavior.

🔧 Proposed improvement
  private Integer getMaxBuckets() {
    try {
      return settings.getSettingValue(Settings.Key.SEARCH_MAX_BUCKETS);
-   } catch (Exception e) {
+   } catch (IllegalArgumentException | NullPointerException e) {
+     // Log at debug level since fallback to default is acceptable
+     // logger.debug("Unable to retrieve SEARCH_MAX_BUCKETS setting, using default: {}", DEFAULT_MAX_BUCKETS, e);
      return DEFAULT_MAX_BUCKETS;
    }
  }

Note: Add the appropriate logger if not already present.

🤖 Prompt for AI Agents
In
@opensearch/src/main/java/org/opensearch/sql/opensearch/request/OpenSearchRequestBuilder.java
around lines 367 - 378, getMaxBuckets currently swallows all Exceptions; change
it to catch only the expected setting-related exceptions (e.g.,
IllegalArgumentException, NoSuchElementException, or the specific
Settings/Config exception thrown by settings.getSettingValue) instead of a bare
Exception, log a warning or info with the exception and that DEFAULT_MAX_BUCKETS
will be used, and return DEFAULT_MAX_BUCKETS; ensure a logger (e.g., private
static final Logger logger) is available in OpenSearchRequestBuilder and use
logger.warn("Using default max buckets due to error reading setting", e) (or
similar) to record the fallback.

@Swiddis Swiddis force-pushed the fix/bucket-count-v2 branch from 56ab7e0 to 526a1fb Compare January 8, 2026 00:56
@Swiddis Swiddis force-pushed the fix/bucket-count-v2 branch from 526a1fb to 2711d23 Compare January 8, 2026 01:00
@Swiddis Swiddis marked this pull request as draft January 8, 2026 01:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant