-
Notifications
You must be signed in to change notification settings - Fork 181
Make V2 also reference plugins.query.buckets #5029
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Simeon Widdis <[email protected]>
Signed-off-by: Simeon Widdis <[email protected]>
Signed-off-by: Simeon Widdis <[email protected]>
📝 WalkthroughSummary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings. WalkthroughThis PR increases the default composite aggregation bucket size from 1000 to 10000 in docs and test expectations, and adds configurable bucket-size support via new methods in OpenSearchRequestBuilder and AggregationQueryBuilder, with related test adjustments. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
📜 Recent review detailsConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🧰 Additional context used📓 Path-based instructions (2)**/*.java📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
Files:
⚙️ CodeRabbit configuration file
Files:
**/test/**/*.java⚙️ CodeRabbit configuration file
Files:
🧬 Code graph analysis (1)integ-test/src/test/java/org/opensearch/sql/util/MatcherUtils.java (2)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
🔇 Additional comments (2)
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 8
🤖 Fix all issues with AI agents
In @integ-test/src/test/resources/expectedOutput/ppl/clickbench/q15.yaml:
- Line 21: Update the aggregation bucket size to 10000 everywhere: change the
AGGREGATION_BUCKET_SIZE constant in AggregationQueryBuilder.java from 1000 to
10000, and update the Calcite expected-output test files that still use
"size":1000 (e.g., explain_agg_counts_by2.yaml, explain_dedup_complex2.yaml,
explain_stats_by_span.json and other files under the calcite expectedOutput set)
to use "size":10000 so they match the PPL test files; after making these edits
regenerate or re-run the affected integration tests to refresh expectations.
In @integ-test/src/test/resources/expectedOutput/ppl/clickbench/q17.yaml:
- Line 19: The expected output contains a hardcoded composite aggregation size
of 10000; update the composite aggregation size value to 1000 to match the
default plugins.query.buckets setting (default defined in SQLIntegTestCase.java
/ UnifiedQueryOpenSearchIT.java / AggregationQueryBuilder.java) so the test
reflects the default behavior (change the "size":10000 entry under the
"composite" aggregation to "size":1000).
In @integ-test/src/test/resources/expectedOutput/ppl/clickbench/q32.yaml:
- Line 21: Update the hardcoded composite bucket size in the expected YAML back
to the configured default of 1000: find the occurrence of
"aggregations\":{\"composite_buckets\":{\"composite\":{\"size\":10000" in
integ-test/src/test/resources/expectedOutput/ppl/clickbench/q32.yaml and change
the size value from 10000 to 1000 so the expected output matches the configured
plugins.query.buckets default used by UnifiedQueryOpenSearchIT,
SQLIntegTestCase, and StandaloneIT.
In @integ-test/src/test/resources/expectedOutput/ppl/clickbench/q41.yaml:
- Line 33: The PPL test expected output for q41.yaml should use a composite
aggregation bucket size of 10000; update the composite aggregation JSON in
integ-test/src/test/resources/expectedOutput/ppl/clickbench/q41.yaml so the
"composite":{"size":10000,...} appears for the PPL/non-calcite variant while
leaving calcite-enabled test files at size 1000 (they are driven by
plugins.query.buckets default of 1000); ensure any other PPL expectedOutput
files that mirror this change are updated consistently and do not change
calcite-enabled fixtures.
In @integ-test/src/test/resources/expectedOutput/ppl/clickbench/q8.yaml:
- Line 19: The expected output contains a hardcoded composite bucket size of
10000; update the value to 1000 to match the default
plugins.query.buckets/DEFAULT_QUERY_BUCKET_SIZE used in the codebase. Locate the
string "aggregations\":{\"composite_buckets\":{\"composite\":{\"size\":10000" in
the expected YAML (ppl/clickbench/q8.yaml) and replace 10000 with 1000 so test
fixtures align with SQLIntegTestCase.DEFAULT_QUERY_BUCKET_SIZE.
In @integ-test/src/test/resources/expectedOutput/ppl/clickbench/q9.yaml:
- Line 19: The composite aggregation bucket size is hardcoded as "size":10000 in
the test expected output but should match the codebase default
AGGREGATION_BUCKET_SIZE and the integration test setting plugins.query.buckets
(1000); update the YAML entry that contains the composite aggregation size (the
fragment with "composite":{"size":10000,...) to use "size":1000 so the expected
output aligns with AggregationQueryBuilder.AGGREGATION_BUCKET_SIZE and the
plugins.query.buckets default.
In
@opensearch/src/main/java/org/opensearch/sql/opensearch/request/OpenSearchRequestBuilder.java:
- Around line 357-365: The getQueryBucketSize method can NPE when
settings.getSettingValue(Settings.Key.QUERY_BUCKET_SIZE) returns null and also
may return non-positive values; update getQueryBucketSize to read the configured
value into a local Integer, treat null as a safe default (e.g., 1 or a
configured fallback), then compute the bucket size as the min between that safe
value and getMaxBuckets(), and finally ensure the returned value is at least 1
(or another validated minimum) before returning; reference
Settings.Key.QUERY_BUCKET_SIZE, settings.getSettingValue(...),
getQueryBucketSize(), and getMaxBuckets() when making the change.
- Around line 367-378: getMaxBuckets currently swallows all Exceptions; change
it to catch only the expected setting-related exceptions (e.g.,
IllegalArgumentException, NoSuchElementException, or the specific
Settings/Config exception thrown by settings.getSettingValue) instead of a bare
Exception, log a warning or info with the exception and that DEFAULT_MAX_BUCKETS
will be used, and return DEFAULT_MAX_BUCKETS; ensure a logger (e.g., private
static final Logger logger) is available in OpenSearchRequestBuilder and use
logger.warn("Using default max buckets due to error reading setting", e) (or
similar) to record the fallback.
🧹 Nitpick comments (2)
opensearch/src/test/java/org/opensearch/sql/opensearch/storage/scan/OpenSearchIndexScanOptimizationTest.java (1)
107-109: Consider whether lenient stubbing is necessary.The
lenient()stub forgetQueryBucketSize()allows the method to be called without strict verification. If all test scenarios are expected to call this method (which appears likely given the aggregation query building), consider using regularwhen()stubbing instead for stricter test assertions.Alternative with strict stubbing
- lenient() - .when(requestBuilder.getQueryBucketSize()) - .thenReturn(AggregationQueryBuilder.AGGREGATION_BUCKET_SIZE); + when(requestBuilder.getQueryBucketSize()) + .thenReturn(AggregationQueryBuilder.AGGREGATION_BUCKET_SIZE);opensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/aggregation/AggregationQueryBuilder.java (1)
56-69: Consider adding validation for the bucket size parameter.The new constructor and field correctly implement configurable bucket sizing with good backward compatibility via constructor delegation. However, there's no validation to ensure
queryBucketSizeis within reasonable bounds (e.g., positive and not excessively large).✨ Proposed validation
/** Aggregation Query Builder Constructor with custom bucket size. */ public AggregationQueryBuilder(ExpressionSerializer serializer, int queryBucketSize) { + if (queryBucketSize <= 0) { + throw new IllegalArgumentException( + String.format("Query bucket size must be positive, got: %d", queryBucketSize)); + } this.compositeBuilder = new CompositeAggregationBuilder(serializer); this.metricBuilder = new MetricAggregationBuilder(serializer); this.queryBucketSize = queryBucketSize; }Alternatively, consider documenting in JavaDoc the expected valid range for the
queryBucketSizeparameter.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (48)
docs/user/optimization/optimization.rstinteg-test/src/test/resources/expectedOutput/ppl/big5/composite_date_histogram_daily.yamlinteg-test/src/test/resources/expectedOutput/ppl/big5/composite_terms.yamlinteg-test/src/test/resources/expectedOutput/ppl/big5/composite_terms_keyword.yamlinteg-test/src/test/resources/expectedOutput/ppl/big5/date_histogram_hourly_agg.yamlinteg-test/src/test/resources/expectedOutput/ppl/big5/date_histogram_minute_agg.yamlinteg-test/src/test/resources/expectedOutput/ppl/big5/keyword_terms.yamlinteg-test/src/test/resources/expectedOutput/ppl/big5/keyword_terms_low_cardinality.yamlinteg-test/src/test/resources/expectedOutput/ppl/big5/multi_terms_keyword.yamlinteg-test/src/test/resources/expectedOutput/ppl/big5/terms_significant_1.yamlinteg-test/src/test/resources/expectedOutput/ppl/big5/terms_significant_2.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q10.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q11.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q12.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q13.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q14.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q15.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q16.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q17.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q18.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q22.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q23.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q28.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q31.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q32.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q33.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q34.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q37.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q38.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q41.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q42.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q8.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q9.yamlinteg-test/src/test/resources/expectedOutput/ppl/explain_filter_agg_push.yamlinteg-test/src/test/resources/expectedOutput/ppl/explain_output.yamlinteg-test/src/test/resources/expectedOutput/ppl/explain_patterns_agg_push.jsoninteg-test/src/test/resources/expectedOutput/ppl/explain_patterns_simple_pattern_agg_push.yamlinteg-test/src/test/resources/expectedOutput/ppl/explain_sort_agg_push.jsoninteg-test/src/test/resources/expectedOutput/ppl/explain_sort_then_agg_push.jsoninteg-test/src/test/resources/expectedOutput/ppl/explain_stats_by_span.jsoninteg-test/src/test/resources/expectedOutput/ppl/explain_stats_by_span_non_bucket_nullable.jsoninteg-test/src/test/resources/expectedOutput/ppl/explain_stats_by_timespan.yamlinteg-test/src/test/resources/expectedOutput/ppl/explain_stats_by_timespan2.yamlopensearch/src/main/java/org/opensearch/sql/opensearch/request/OpenSearchRequestBuilder.javaopensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/OpenSearchIndexScanAggregationBuilder.javaopensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/aggregation/AggregationQueryBuilder.javaopensearch/src/test/java/org/opensearch/sql/opensearch/storage/scan/OpenSearchIndexScanOptimizationTest.javaopensearch/src/test/java/org/opensearch/sql/opensearch/storage/script/aggregation/AggregationQueryBuilderTest.java
🧰 Additional context used
📓 Path-based instructions (5)
integ-test/src/test/resources/**/*
⚙️ CodeRabbit configuration file
integ-test/src/test/resources/**/*: - Verify test data is realistic and representative
- Check data format matches expected schema
- Ensure test data covers edge cases and boundary conditions
Files:
integ-test/src/test/resources/expectedOutput/ppl/big5/composite_terms.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q33.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q10.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q42.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q23.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q14.yamlinteg-test/src/test/resources/expectedOutput/ppl/explain_filter_agg_push.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q41.yamlinteg-test/src/test/resources/expectedOutput/ppl/explain_sort_then_agg_push.jsoninteg-test/src/test/resources/expectedOutput/ppl/explain_stats_by_span_non_bucket_nullable.jsoninteg-test/src/test/resources/expectedOutput/ppl/clickbench/q9.yamlinteg-test/src/test/resources/expectedOutput/ppl/big5/keyword_terms.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q8.yamlinteg-test/src/test/resources/expectedOutput/ppl/big5/terms_significant_2.yamlinteg-test/src/test/resources/expectedOutput/ppl/big5/multi_terms_keyword.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q31.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q32.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q37.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q38.yamlinteg-test/src/test/resources/expectedOutput/ppl/big5/composite_terms_keyword.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q13.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q22.yamlinteg-test/src/test/resources/expectedOutput/ppl/explain_patterns_agg_push.jsoninteg-test/src/test/resources/expectedOutput/ppl/big5/terms_significant_1.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q12.yamlinteg-test/src/test/resources/expectedOutput/ppl/explain_stats_by_timespan.yamlinteg-test/src/test/resources/expectedOutput/ppl/big5/date_histogram_hourly_agg.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q28.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q17.yamlinteg-test/src/test/resources/expectedOutput/ppl/big5/composite_date_histogram_daily.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q11.yamlinteg-test/src/test/resources/expectedOutput/ppl/explain_sort_agg_push.jsoninteg-test/src/test/resources/expectedOutput/ppl/clickbench/q15.yamlinteg-test/src/test/resources/expectedOutput/ppl/big5/keyword_terms_low_cardinality.yamlinteg-test/src/test/resources/expectedOutput/ppl/explain_output.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q18.yamlinteg-test/src/test/resources/expectedOutput/ppl/explain_stats_by_span.jsoninteg-test/src/test/resources/expectedOutput/ppl/clickbench/q16.yamlinteg-test/src/test/resources/expectedOutput/ppl/big5/date_histogram_minute_agg.yamlinteg-test/src/test/resources/expectedOutput/ppl/explain_patterns_simple_pattern_agg_push.yamlinteg-test/src/test/resources/expectedOutput/ppl/explain_stats_by_timespan2.yamlinteg-test/src/test/resources/expectedOutput/ppl/clickbench/q34.yaml
**/*.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*.java: UsePascalCasefor class names (e.g.,QueryExecutor)
UsecamelCasefor method and variable names (e.g.,executeQuery)
UseUPPER_SNAKE_CASEfor constants (e.g.,MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
PreferOptional<T>for nullable returns in Java
Avoid unnecessary object creation in loops
UseStringBuilderfor string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code
Files:
opensearch/src/main/java/org/opensearch/sql/opensearch/request/OpenSearchRequestBuilder.javaopensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/OpenSearchIndexScanAggregationBuilder.javaopensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/aggregation/AggregationQueryBuilder.javaopensearch/src/test/java/org/opensearch/sql/opensearch/storage/script/aggregation/AggregationQueryBuilderTest.javaopensearch/src/test/java/org/opensearch/sql/opensearch/storage/scan/OpenSearchIndexScanOptimizationTest.java
⚙️ CodeRabbit configuration file
**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring
- Flag classes >500 lines as needing organization review
- Check for dead code, unused imports, and unused variables
- Identify code reuse opportunities across similar implementations
- Assess holistic maintainability - is code easy to understand and modify?
- Flag code that appears AI-generated without sufficient human review
- Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
- Check for proper JavaDoc on public classes and methods
- Flag redundant comments that restate obvious code
- Ensure proper error handling with specific exception types
- Check for Optional usage instead of null returns
- Validate proper use of try-with-resources for resource management
Files:
opensearch/src/main/java/org/opensearch/sql/opensearch/request/OpenSearchRequestBuilder.javaopensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/OpenSearchIndexScanAggregationBuilder.javaopensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/aggregation/AggregationQueryBuilder.javaopensearch/src/test/java/org/opensearch/sql/opensearch/storage/script/aggregation/AggregationQueryBuilderTest.javaopensearch/src/test/java/org/opensearch/sql/opensearch/storage/scan/OpenSearchIndexScanOptimizationTest.java
docs/**/*.rst
⚙️ CodeRabbit configuration file
docs/**/*.rst: - Verify examples use valid test data and indices
- Check documentation follows existing patterns and structure
- Validate syntax examples are complete and correct
- Ensure code examples are tested and working
- Check for consistent formatting and style
Files:
docs/user/optimization/optimization.rst
**/*Test.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*Test.java: All new business logic requires unit tests
Name unit tests with*Test.javasuffix in OpenSearch SQL
Files:
opensearch/src/test/java/org/opensearch/sql/opensearch/storage/script/aggregation/AggregationQueryBuilderTest.javaopensearch/src/test/java/org/opensearch/sql/opensearch/storage/scan/OpenSearchIndexScanOptimizationTest.java
**/test/**/*.java
⚙️ CodeRabbit configuration file
**/test/**/*.java: - Verify NULL input tests for all new functions
- Check boundary condition tests (min/max values, empty inputs)
- Validate error condition tests (invalid inputs, exceptions)
- Ensure multi-document tests for per-document operations
- Flag smoke tests without meaningful assertions
- Check test naming follows pattern: test
- Verify test data is realistic and covers edge cases
- Verify test coverage for new business logic
- Ensure tests are independent and don't rely on execution order
- Validate meaningful test data that reflects real-world scenarios
- Check for proper cleanup of test resources
Files:
opensearch/src/test/java/org/opensearch/sql/opensearch/storage/script/aggregation/AggregationQueryBuilderTest.javaopensearch/src/test/java/org/opensearch/sql/opensearch/storage/scan/OpenSearchIndexScanOptimizationTest.java
🧠 Learnings (2)
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Applied to files:
integ-test/src/test/resources/expectedOutput/ppl/explain_stats_by_span_non_bucket_nullable.jsonopensearch/src/test/java/org/opensearch/sql/opensearch/storage/scan/OpenSearchIndexScanOptimizationTest.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL
Applied to files:
opensearch/src/test/java/org/opensearch/sql/opensearch/storage/scan/OpenSearchIndexScanOptimizationTest.java
🧬 Code graph analysis (1)
opensearch/src/main/java/org/opensearch/sql/opensearch/request/OpenSearchRequestBuilder.java (1)
common/src/main/java/org/opensearch/sql/common/setting/Settings.java (1)
Settings(17-105)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
- GitHub Check: security-it-linux (25)
- GitHub Check: security-it-linux (21)
- GitHub Check: build-linux (21, integration)
- GitHub Check: build-linux (25, integration)
- GitHub Check: build-linux (25, unit)
- GitHub Check: build-linux (21, unit)
- GitHub Check: build-linux (21, doc)
- GitHub Check: build-linux (25, doc)
- GitHub Check: bwc-tests-full-restart (25)
- GitHub Check: bwc-tests-rolling-upgrade (21)
- GitHub Check: bwc-tests-rolling-upgrade (25)
- GitHub Check: bwc-tests-full-restart (21)
- GitHub Check: build-windows-macos (macos-14, 21, integration)
- GitHub Check: build-windows-macos (macos-14, 21, doc)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
- GitHub Check: security-it-windows-macos (windows-latest, 21)
- GitHub Check: build-windows-macos (macos-14, 25, doc)
- GitHub Check: build-windows-macos (macos-14, 25, integration)
- GitHub Check: security-it-windows-macos (macos-14, 21)
- GitHub Check: build-windows-macos (macos-14, 25, unit)
- GitHub Check: security-it-windows-macos (macos-14, 25)
- GitHub Check: security-it-windows-macos (windows-latest, 25)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (macos-14, 21, unit)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
- GitHub Check: test-sql-cli-integration (21)
- GitHub Check: CodeQL-Scan (java)
🔇 Additional comments (33)
integ-test/src/test/resources/expectedOutput/ppl/explain_sort_then_agg_push.json (1)
10-10: Test expectation files showing composite aggregation size of 10000 are new additions, but claims about "new default bucket size behavior" are unverified.This commit adds new test expectation files including
explain_sort_then_agg_push.jsonwith a composite aggregation size of 10000. The test data itself is realistic and valid for the sort-then-aggregation pushdown scenario with state grouping and average balance calculation.However, the original review's claim that this reflects a "new default bucket size behavior" cannot be verified:
AGGREGATION_BUCKET_SIZEconstant remains 1000 (unchanged)- Test settings configure
QUERY_BUCKET_SIZEas 1000- The source of the 10000 value in the test expectations is unclear from this commit alone
Verify that:
- The 10000 composite size matches the actual behavior produced by the query execution
- This value originates from the proper source (test configuration,
queryBucketSizeparameter, or bucket size capping logic)- The test expectations align with the implementation changes in preceding commits
integ-test/src/test/resources/expectedOutput/ppl/big5/terms_significant_2.yaml (1)
19-19: Update test data size parameter to 10000 in composite aggregation fixtures.Line 19 of this file contains expected output for a composite aggregation query with the bucket size set to 10000. This aligns with the PR objective to reference the
plugins.query.bucketsconfiguration setting.The test data structure and format are correct. However, before approving this change, verify:
- Edge case test coverage: Ensure test scenarios include cases where result sets approach or exceed the 10000 bucket limit to validate behavior at boundaries.
- Bounds and validation: Confirm that the
plugins.query.bucketssetting has appropriate validation to prevent excessively large or invalid values.- Performance testing: Conduct testing with the new 10000 bucket default to assess memory and performance impact on queries returning large result sets.
Per the test data guidelines for this path, confirm that the test data remains realistic and representative of actual query patterns, and that boundary conditions are adequately covered.
integ-test/src/test/resources/expectedOutput/ppl/explain_stats_by_timespan.yaml (1)
10-14: Clarify bucket size value in expected output and test coverage gap.The expected output shows
"size":10000for composite aggregations, but the test does not explicitly configure bucket sizes—it uses the defaultplugins.query.bucketsvalue of 1000. The origin of the 10000 value in the output needs clarification, as it does not align with the configured default in ExplainIT.Additionally, the review comment claims the files represent an "update from 1000 to 10000," but
git showindicates these are newly created files in commit 2711d23, not updates from previous values.Regarding test coverage: Custom and edge case bucket size testing exists in CalciteExplainIT (using
setQueryBucketSize()), but the specific test using this file (ExplainIT.testStatsByTimeSpan) does not vary bucket sizes. If the PR objective is to test theplugins.query.bucketssetting behavior, consider adding parameterized tests with different bucket sizes to ExplainIT, or verify that SysLimit defaults are intentionally providing the 10000 value independently of QUERY_BUCKET_SIZE.integ-test/src/test/resources/expectedOutput/ppl/clickbench/q18.yaml (1)
15-15: LGTM! Test expectation correctly updated.The composite bucket size update to 10000 matches the new default and is consistent with related test files.
integ-test/src/test/resources/expectedOutput/ppl/clickbench/q34.yaml (1)
19-19: LGTM! Test expectation correctly updated.The composite bucket size update to 10000 matches the new default and is consistent with related test files.
integ-test/src/test/resources/expectedOutput/ppl/explain_output.yaml (1)
28-28: LGTM! Test expectation correctly updated.The composite bucket size update to 10000 matches the new default and is consistent with related test files.
integ-test/src/test/resources/expectedOutput/ppl/big5/composite_date_histogram_daily.yaml (1)
19-19: LGTM! Test expectation correctly updated.The composite bucket size update to 10000 matches the new default and is consistent with related test files.
integ-test/src/test/resources/expectedOutput/ppl/clickbench/q10.yaml (1)
19-19: Correct the bucket size assumption: 10000 does not match theplugins.query.bucketsdefault.The
plugins.query.bucketssetting defaults to1000, not10000. The expected output files show composite aggregation size of10000, which differs from the actual setting default documented in settings.md and configured in UnifiedQueryOpenSearchIT.java. Verify whether10000is intentional for the composite aggregation (which may be a different limit thanplugins.query.buckets) or if it should be aligned with the actual default of1000.Likely an incorrect or invalid review comment.
integ-test/src/test/resources/expectedOutput/ppl/explain_sort_agg_push.json (1)
22-22: Expected output correctly updated to reflect new default bucket size.The composite aggregation size change from 1000 to 10000 aligns with the PR's introduction of configurable bucket sizes and the updated default value.
integ-test/src/test/resources/expectedOutput/ppl/clickbench/q14.yaml (1)
21-21: Expected output correctly updated.The change consistently reflects the new default composite aggregation bucket size of 10000.
integ-test/src/test/resources/expectedOutput/ppl/clickbench/q13.yaml (1)
21-21: Expected output correctly updated.The composite bucket size change is consistent with the PR's objectives and matches updates in other test expectation files.
integ-test/src/test/resources/expectedOutput/ppl/clickbench/q42.yaml (1)
31-31: Expected output correctly updated.The change appropriately reflects the new default composite aggregation bucket size.
integ-test/src/test/resources/expectedOutput/ppl/clickbench/q12.yaml (1)
21-21: Expected output correctly updated.The composite bucket size change is consistent across all reviewed test expectation files and correctly reflects the new configurable bucket size behavior introduced in this PR.
integ-test/src/test/resources/expectedOutput/ppl/clickbench/q38.yaml (1)
33-33: LGTM! Composite aggregation size updated correctly.The bucket size increase from 1000 to 10000 aligns with the PR objective to reference the
plugins.query.bucketssetting in V2. The test expectation accurately reflects the new default behavior.integ-test/src/test/resources/expectedOutput/ppl/clickbench/q11.yaml (1)
21-21: LGTM! Bucket size updated consistently.The composite aggregation size change to 10000 is consistent with the configurable bucket size functionality introduced in this PR.
integ-test/src/test/resources/expectedOutput/ppl/big5/composite_terms.yaml (1)
19-19: LGTM! Multi-source composite aggregation updated correctly.The bucket size increase appropriately covers the multi-field composite aggregation scenario (process.name and cloud.region), ensuring comprehensive test coverage for the new configurable bucket size feature.
integ-test/src/test/resources/expectedOutput/ppl/big5/keyword_terms_low_cardinality.yaml (1)
19-19: LGTM! Low cardinality test updated appropriately.The bucket size update to 10000 is appropriate for the low cardinality keyword terms scenario. This test case provides good coverage for edge cases where the actual result size may be significantly lower than the bucket limit.
integ-test/src/test/resources/expectedOutput/ppl/explain_stats_by_span.json (1)
11-11: LGTM! Histogram-based composite aggregation updated correctly.The bucket size update to 10000 extends to histogram-based composite aggregations (span operation), demonstrating comprehensive test coverage across different aggregation types (terms and histogram).
integ-test/src/test/resources/expectedOutput/ppl/clickbench/q16.yaml (1)
19-19: LGTM: Composite bucket size updated to reflect new default.The change from 1000 to 10000 correctly reflects the new default composite aggregation bucket size introduced by this PR's configurable bucket size feature.
integ-test/src/test/resources/expectedOutput/ppl/clickbench/q22.yaml (1)
24-24: LGTM: Composite bucket size updated to reflect new default.The change from 1000 to 10000 correctly reflects the new default composite aggregation bucket size.
integ-test/src/test/resources/expectedOutput/ppl/clickbench/q28.yaml (1)
26-26: LGTM: Composite bucket size updated to reflect new default.The change from 1000 to 10000 correctly reflects the new default composite aggregation bucket size.
integ-test/src/test/resources/expectedOutput/ppl/clickbench/q37.yaml (1)
33-33: LGTM: Composite bucket size updated to reflect new default.The change from 1000 to 10000 correctly reflects the new default composite aggregation bucket size.
integ-test/src/test/resources/expectedOutput/ppl/explain_stats_by_span_non_bucket_nullable.json (1)
10-10: Remove misleading claim about "new default" bucket size.The composite aggregation size of 10000 in this test file is the configured value for PPL integration tests (via
DEFAULT_SUBSEARCH_MAXOUT), not a change to the default constant (AGGREGATION_BUCKET_SIZE = 1000). TheAggregationQueryBuilderdoes support configurable bucket sizes, but this test expectation reflects PPL's specific test configuration, not a change to the codebase default. The test data itself is valid, realistic, and appropriately covers the non-bucket-nullable edge case.Likely an incorrect or invalid review comment.
integ-test/src/test/resources/expectedOutput/ppl/clickbench/q33.yaml (1)
19-19: LGTM! Test expectation correctly updated.The composite aggregation size has been updated from 1000 to 10000, correctly reflecting the new default bucket size behavior introduced in this PR.
opensearch/src/test/java/org/opensearch/sql/opensearch/storage/script/aggregation/AggregationQueryBuilderTest.java (2)
716-750: LGTM! Comprehensive test coverage for custom bucket size.The test properly validates that the new
AggregationQueryBuilder(serializer, int bucketSize)constructor correctly applies the custom bucket size (500) to the generated composite aggregation. The test follows existing patterns and naming conventions.
783-797: LGTM! Helper method follows existing patterns.The
buildQueryWithCustomBuilder()helper appropriately mirrors the existingbuildQuery()pattern, enabling test coverage for the custom builder constructor.docs/user/optimization/optimization.rst (3)
283-283: LGTM! Documentation updated consistently.The composite aggregation size has been updated from 1000 to 10000 across all documentation examples, correctly reflecting the new default bucket size behavior.
309-309: LGTM! Consistent with other documentation updates.The composite size update is consistent across all examples in the documentation.
344-344: LGTM! Final documentation example updated consistently.All three examples in the documentation now consistently show the new composite size of 10000.
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/OpenSearchIndexScanAggregationBuilder.java (1)
57-58: LGTM! Correct implementation of configurable bucket size.The
AggregationQueryBuilderinstantiation now properly passes the bucket size fromrequestBuilder.getQueryBucketSize(), enabling dynamic configuration based on cluster settings instead of hard-coded values. This aligns with the PR objective to referenceplugins.query.bucketsin V2.The underlying implementation correctly bounds the configured bucket size using
Math.min()against the cluster's maximum buckets limit, with a fallback toDEFAULT_MAX_BUCKETSif the cluster setting is unavailable. Edge cases are properly handled.opensearch/src/main/java/org/opensearch/sql/opensearch/request/OpenSearchRequestBuilder.java (1)
12-12: LGTM!The static import for
DEFAULT_MAX_BUCKETSis appropriately used as a fallback value in thegetMaxBuckets()method.opensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/aggregation/AggregationQueryBuilder.java (2)
47-48: LGTM!The comment update correctly reflects that
AGGREGATION_BUCKET_SIZEnow serves as the default value rather than a fixed constant.
106-106: LGTM!The update correctly uses the configurable
queryBucketSizefield instead of the hard-coded constant, achieving the PR's objective of making bucket size configurable in V2.
| /** | ||
| * Get the configured query bucket size, which is the minimum of the plugin setting and the | ||
| * cluster's max buckets limit. | ||
| * | ||
| * @return the query bucket size to use for aggregations | ||
| */ | ||
| public Integer getQueryBucketSize() { | ||
| return Math.min(settings.getSettingValue(Settings.Key.QUERY_BUCKET_SIZE), getMaxBuckets()); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add null safety and validate the returned bucket size.
The method has potential robustness issues:
- If
settings.getSettingValue(Settings.Key.QUERY_BUCKET_SIZE)returnsnull,Math.min()will throw aNullPointerExceptionduring auto-unboxing. - No validation ensures the result is positive.
Consider adding defensive checks to handle edge cases gracefully.
🛡️ Proposed fix with null safety and validation
public Integer getQueryBucketSize() {
- return Math.min(settings.getSettingValue(Settings.Key.QUERY_BUCKET_SIZE), getMaxBuckets());
+ Integer configuredSize = settings.getSettingValue(Settings.Key.QUERY_BUCKET_SIZE);
+ if (configuredSize == null || configuredSize <= 0) {
+ return getMaxBuckets();
+ }
+ return Math.min(configuredSize, getMaxBuckets());
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| /** | |
| * Get the configured query bucket size, which is the minimum of the plugin setting and the | |
| * cluster's max buckets limit. | |
| * | |
| * @return the query bucket size to use for aggregations | |
| */ | |
| public Integer getQueryBucketSize() { | |
| return Math.min(settings.getSettingValue(Settings.Key.QUERY_BUCKET_SIZE), getMaxBuckets()); | |
| } | |
| /** | |
| * Get the configured query bucket size, which is the minimum of the plugin setting and the | |
| * cluster's max buckets limit. | |
| * | |
| * @return the query bucket size to use for aggregations | |
| */ | |
| public Integer getQueryBucketSize() { | |
| Integer configuredSize = settings.getSettingValue(Settings.Key.QUERY_BUCKET_SIZE); | |
| if (configuredSize == null || configuredSize <= 0) { | |
| return getMaxBuckets(); | |
| } | |
| return Math.min(configuredSize, getMaxBuckets()); | |
| } |
🤖 Prompt for AI Agents
In
@opensearch/src/main/java/org/opensearch/sql/opensearch/request/OpenSearchRequestBuilder.java
around lines 357 - 365, The getQueryBucketSize method can NPE when
settings.getSettingValue(Settings.Key.QUERY_BUCKET_SIZE) returns null and also
may return non-positive values; update getQueryBucketSize to read the configured
value into a local Integer, treat null as a safe default (e.g., 1 or a
configured fallback), then compute the bucket size as the min between that safe
value and getMaxBuckets(), and finally ensure the returned value is at least 1
(or another validated minimum) before returning; reference
Settings.Key.QUERY_BUCKET_SIZE, settings.getSettingValue(...),
getQueryBucketSize(), and getMaxBuckets() when making the change.
| /** | ||
| * Get the maximum number of buckets allowed by the cluster. | ||
| * | ||
| * @return the max buckets setting value, or default if not available | ||
| */ | ||
| private Integer getMaxBuckets() { | ||
| try { | ||
| return settings.getSettingValue(Settings.Key.SEARCH_MAX_BUCKETS); | ||
| } catch (Exception e) { | ||
| return DEFAULT_MAX_BUCKETS; | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refine exception handling and add logging.
The current implementation has two issues:
- Catching bare
Exceptionis overly broad and may hide unexpected errors. - The exception is silently swallowed, which could obscure configuration problems during troubleshooting.
Consider catching specific exception types and logging the fallback behavior.
🔧 Proposed improvement
private Integer getMaxBuckets() {
try {
return settings.getSettingValue(Settings.Key.SEARCH_MAX_BUCKETS);
- } catch (Exception e) {
+ } catch (IllegalArgumentException | NullPointerException e) {
+ // Log at debug level since fallback to default is acceptable
+ // logger.debug("Unable to retrieve SEARCH_MAX_BUCKETS setting, using default: {}", DEFAULT_MAX_BUCKETS, e);
return DEFAULT_MAX_BUCKETS;
}
}Note: Add the appropriate logger if not already present.
🤖 Prompt for AI Agents
In
@opensearch/src/main/java/org/opensearch/sql/opensearch/request/OpenSearchRequestBuilder.java
around lines 367 - 378, getMaxBuckets currently swallows all Exceptions; change
it to catch only the expected setting-related exceptions (e.g.,
IllegalArgumentException, NoSuchElementException, or the specific
Settings/Config exception thrown by settings.getSettingValue) instead of a bare
Exception, log a warning or info with the exception and that DEFAULT_MAX_BUCKETS
will be used, and return DEFAULT_MAX_BUCKETS; ensure a logger (e.g., private
static final Logger logger) is available in OpenSearchRequestBuilder and use
logger.warn("Using default max buckets due to error reading setting", e) (or
similar) to record the fallback.
56ab7e0 to
526a1fb
Compare
526a1fb to
2711d23
Compare
Description
Straightforward update to use the setting in V2 as we do in Calcite. Lots of test changes though :/
Related Issues
N/A
Check List
--signoffor-s.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.