-
Notifications
You must be signed in to change notification settings - Fork 180
Prune old in operator push down rules #4992
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Heng Qian <[email protected]>
Signed-off-by: Heng Qian <[email protected]>
Signed-off-by: Heng Qian <[email protected]>
Signed-off-by: Heng Qian <[email protected]>
# Conflicts: # opensearch/src/main/java/org/opensearch/sql/opensearch/request/AggregateAnalyzer.java
Signed-off-by: Heng Qian <[email protected]>
Signed-off-by: Heng Qian <[email protected]>
Signed-off-by: Heng Qian <[email protected]>
Signed-off-by: Heng Qian <[email protected]>
📝 WalkthroughSummary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings. WalkthroughAdds PlanUtils helpers to detect aggregate-derived predicates and prune redundant RelNodes; updates many push-down rules to implement SubstitutionRule and call PlanUtils.tryPruneRelNodes after transforms; adapts pushdown APIs (nullable Project, ProjectDigest, PushDownContext.cloneForAggregate) and updates aggregate/scan analysis and tests to new pushdown ordering and bindings. Changes
Sequence Diagram(s)sequenceDiagram
participant Planner as CalcitePlanner
participant Rule as PushDownRule
participant PU as PlanUtils
participant VP as VolcanoPlanner
participant Rel as RelSet
Planner->>Rule: onMatch(call, rels...)
activate Rule
Rule->>Rule: compute transformedRel (push-down)
alt transform produced newRel
Rule->>Planner: transformTo(newRel)
Note right of Rule `#e6f7ff`: after successful transform call pruning helper
Rule->>PU: tryPruneRelNodes(call)
activate PU
PU->>VP: check planner type & auto-prune
alt eligible
PU->>Rel: identify top-down redundant/equivalent RelNodes
Rel-->>VP: remove/prune old RelNodes
VP-->>PU: pruning applied
else not eligible
PU-->>Rule: no-op
end
deactivate PU
end
Rule-->>Planner: onMatch complete
deactivate Rule
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing touches🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🧰 Additional context used📓 Path-based instructions (1)integ-test/src/test/resources/**/*⚙️ CodeRabbit configuration file
Files:
🧠 Learnings (3)📓 Common learnings📚 Learning: 2025-12-02T17:27:55.938ZApplied to files:
📚 Learning: 2025-12-11T05:27:39.856ZApplied to files:
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
🔇 Additional comments (1)
Comment |
Signed-off-by: Heng Qian <[email protected]>
Signed-off-by: Heng Qian <[email protected]>
Signed-off-by: Heng Qian <[email protected]>
Signed-off-by: Heng Qian <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/AggregateIndexScanRule.java (1)
190-233: Add a comment explaining the predicate difference between BUCKET_NON_NULL_AGG configs.
BUCKET_NON_NULL_AGG(line 153) uses.predicate(aggIgnoreNullBucket), whileBUCKET_NON_NULL_AGG_WITH_UDF(line 190) uses.predicate(aggIgnoreNullBucket.or(maybeTimeSpanAgg)). Both configs are actively registered and used in different contexts (different pattern structures), but the reason for this design difference is not documented. Add a brief comment explaining why the WITH_UDF variant needs to additionally matchmaybeTimeSpanAggpredicates.
🧹 Nitpick comments (5)
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/ProjectDigest.java (1)
10-15: Add JavaDoc for public record.Per coding guidelines, public classes require proper JavaDoc. This record would benefit from documentation explaining its purpose and the relationship between
namesandselectedColumns.Suggested JavaDoc
+/** + * Represents a digest of a projection containing field names and their corresponding + * column indices in the underlying relation. + * + * @param names the list of projected field names + * @param selectedColumns the list of column indices corresponding to each projected field + */ public record ProjectDigest(List<String> names, List<Integer> selectedColumns) { @Override public String toString() { return names.toString(); } }opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/RareTopPushdownRule.java (1)
68-70: Consider logging or narrowing the catch-all exception handler.The catch block silently swallows all exceptions, which could mask bugs or unexpected conditions. Consider either:
- Logging the exception at DEBUG level for troubleshooting
- Narrowing to specific expected exceptions (e.g.,
ClassCastException,IndexOutOfBoundsException)🔎 Proposed logging improvement
} catch (Exception e) { + // Log at debug level for troubleshooting pushdown failures + // e.g., java.util.logging.Logger or slf4j return; }opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java (1)
402-424: Consider simplifying the limit reduction check.The logic for
canReduceEstimatedRowsCountis correct but could be simplified. The stream pipeline to find the previous limit digest is somewhat verbose.🔎 Proposed simplification
- boolean canReduceEstimatedRowsCount = true; - if (pushDownContext.isLimitPushed()) { - Optional<Integer> previousRowCount = - pushDownContext.getQueue().reversed().stream() - .takeWhile(operation -> operation.type() != PushDownType.AGGREGATION) - .filter(operation -> operation.type() == PushDownType.LIMIT) - .findFirst() - .map(operation -> (LimitDigest) operation.digest()) - .map(limitDigest -> limitDigest.offset() + limitDigest.limit()); - if (previousRowCount.isPresent()) { - canReduceEstimatedRowsCount = totalSize < previousRowCount.get(); - } - } + boolean canReduceEstimatedRowsCount = + !pushDownContext.isLimitPushed() + || pushDownContext.getQueue().reversed().stream() + .takeWhile(op -> op.type() != PushDownType.AGGREGATION) + .filter(op -> op.type() == PushDownType.LIMIT) + .findFirst() + .map(op -> (LimitDigest) op.digest()) + .map(d -> totalSize < d.offset() + d.limit()) + .orElse(true);core/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.java (1)
658-672: Interface field predicates are idiomatic but consider static method alternatives.The
aggIgnoreNullBucketandmaybeTimeSpanAggpredicates are defined as interface fields (implicitlypublic static final). This works but differs from the static method pattern used elsewhere in this interface. For consistency with other helpers likeisTimeSpan, consider converting these to static methods.🔎 Alternative as static methods
- Predicate<Aggregate> aggIgnoreNullBucket = - agg -> - agg.getHints().stream() - .anyMatch( - hint -> - hint.hintName.equals("stats_args") - && hint.kvOptions.get(Argument.BUCKET_NULLABLE).equals("false")); - - Predicate<Aggregate> maybeTimeSpanAgg = - agg -> - agg.getGroupSet().stream() - .allMatch( - group -> - isTimeBasedType( - agg.getInput().getRowType().getFieldList().get(group).getType())); + static boolean aggIgnoreNullBucket(Aggregate agg) { + return agg.getHints().stream() + .anyMatch( + hint -> + hint.hintName.equals("stats_args") + && "false".equals(hint.kvOptions.get(Argument.BUCKET_NULLABLE))); + } + + static boolean maybeTimeSpanAgg(Aggregate agg) { + return agg.getGroupSet().stream() + .allMatch( + group -> + isTimeBasedType( + agg.getInput().getRowType().getFieldList().get(group).getType())); + }Note: The current
hint.kvOptions.get(Argument.BUCKET_NULLABLE).equals("false")could throw NPE if the key is missing. Consider using"false".equals(...)for null-safety.opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.java (1)
128-131: Commented-out assertion may indicate incomplete validation.The commented assertion on lines 128-130 suggests there was intent to validate that the input field count matches the group-by list size. The remaining assertion on line 131 only checks that the group set equals
newGroupByList, not the field count alignment.Consider either removing the commented code or implementing the intended validation.
🔎 Either remove or uncomment the assertion
- // assert aggregate.getInput().getRowType().getFieldCount() == groupByList.size() : - // String.format("The input's field size should be trimmed to equal to group list size %d, but - // got %d", groupByList.size(), aggregate.getInput().getRowType().getFieldCount()); assert aggregate.getGroupSet().asList().equals(newGroupByList);
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (109)
core/src/main/java/org/opensearch/sql/calcite/plan/PPLAggGroupMergeRule.javacore/src/main/java/org/opensearch/sql/calcite/plan/PPLAggregateConvertRule.javacore/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.javainteg-test/src/test/resources/expectedOutput/calcite/agg_composite_date_range_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/cardinality_agg_high.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/cardinality_agg_high_2.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/cardinality_agg_low.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/composite_date_histogram_daily.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/composite_terms.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/composite_terms_keyword.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/date_histogram_minute_agg.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/keyword_terms.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/keyword_terms_low_cardinality.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/multi_terms_keyword.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/terms_significant_1.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/terms_significant_2.yamlinteg-test/src/test/resources/expectedOutput/calcite/chart_multiple_group_keys.yamlinteg-test/src/test/resources/expectedOutput/calcite/chart_single_group_key.yamlinteg-test/src/test/resources/expectedOutput/calcite/chart_timestamp_span_and_category.yamlinteg-test/src/test/resources/expectedOutput/calcite/chart_use_other.yamlinteg-test/src/test/resources/expectedOutput/calcite/chart_with_integer_span.yamlinteg-test/src/test/resources/expectedOutput/calcite/chart_with_limit.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q10.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q11.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q12.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q13.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q14.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q15.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q16.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q17.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q18.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q19.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q2.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q21.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q22.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q23.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q28.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q29.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q31.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q32.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q33.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q34.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q36.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q37.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q38.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q39.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q40.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q41.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q42.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q43.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q5.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q6.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q8.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q9.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_agg_paginating_head_from.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_agg_paginating_join4.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_agg_sort_on_measure2.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_agg_sort_on_measure_complex1.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_agg_with_script.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_expr_no_expr_output_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_expr_pushdown_for_smj_w_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_count_agg_push2.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_count_agg_push3.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_count_agg_push5.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_count_agg_push6.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex1.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex2.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex3.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex4.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr4.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr4_alternative.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_keepempty_false_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr3.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr3_alternative.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr4.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr4_alternative.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_filter_agg_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_filter_with_search.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_join_with_criteria_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_join_with_fields_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_limit_agg_pushdown3.jsoninteg-test/src/test/resources/expectedOutput/calcite/explain_limit_agg_pushdown_bucket_nullable1.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_multiple_agg_with_sort_on_one_measure_not_push1.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_multiple_agg_with_sort_on_one_measure_not_push2.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_output.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_scalar_correlated_subquery_in_select.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_scalar_correlated_subquery_in_where.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_script_push_on_text.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_sort_then_agg_push.jsoninteg-test/src/test/resources/expectedOutput/calcite/explain_timechart.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_timechart_count.yamlopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/AggregateIndexScanRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/ExpandCollationOnProjectExprRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/FilterIndexScanRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/LimitIndexScanRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/OpenSearchIndexRules.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/ProjectIndexScanRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/RareTopPushdownRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/RelevanceFunctionPushdownRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/SortAggregateMeasureRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/SortExprIndexScanRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/SortIndexScanRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/request/AggregateAnalyzer.javaopensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.javaopensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/ProjectDigest.javaopensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/PushDownContext.javaopensearch/src/test/java/org/opensearch/sql/opensearch/request/AggregateAnalyzerTest.java
🧰 Additional context used
📓 Path-based instructions (6)
integ-test/src/test/resources/**/*
⚙️ CodeRabbit configuration file
integ-test/src/test/resources/**/*: - Verify test data is realistic and representative
- Check data format matches expected schema
- Ensure test data covers edge cases and boundary conditions
Files:
integ-test/src/test/resources/expectedOutput/calcite/clickbench/q13.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q14.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/cardinality_agg_high.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr3.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_keepempty_false_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/agg_composite_date_range_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q37.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q8.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex2.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_agg_paginating_head_from.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex3.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/keyword_terms.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_count_agg_push5.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/keyword_terms_low_cardinality.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q5.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_count_agg_push3.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_timechart_count.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_script_push_on_text.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_agg_sort_on_measure2.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/date_histogram_minute_agg.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q11.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_agg_sort_on_measure_complex1.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_join_with_fields_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_expr_no_expr_output_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q39.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/cardinality_agg_low.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr4.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/composite_date_histogram_daily.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q29.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_count_agg_push2.yamlinteg-test/src/test/resources/expectedOutput/calcite/chart_with_limit.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/multi_terms_keyword.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/composite_terms_keyword.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_multiple_agg_with_sort_on_one_measure_not_push1.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_scalar_correlated_subquery_in_select.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_sort_then_agg_push.jsoninteg-test/src/test/resources/expectedOutput/calcite/clickbench/q34.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr4_alternative.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q10.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_agg_paginating_join4.yamlinteg-test/src/test/resources/expectedOutput/calcite/chart_timestamp_span_and_category.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q33.yamlinteg-test/src/test/resources/expectedOutput/calcite/chart_with_integer_span.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q36.yamlinteg-test/src/test/resources/expectedOutput/calcite/chart_single_group_key.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_limit_agg_pushdown_bucket_nullable1.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_expr_pushdown_for_smj_w_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q40.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_scalar_correlated_subquery_in_where.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q17.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q41.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q31.yamlinteg-test/src/test/resources/expectedOutput/calcite/chart_use_other.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr3_alternative.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q22.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q19.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q2.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q16.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_count_agg_push6.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q9.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_filter_agg_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex1.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q43.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q15.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex4.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q32.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_multiple_agg_with_sort_on_one_measure_not_push2.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr4.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q21.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_output.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/composite_terms.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_filter_with_search.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr4_alternative.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/terms_significant_1.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/cardinality_agg_high_2.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_limit_agg_pushdown3.jsoninteg-test/src/test/resources/expectedOutput/calcite/explain_join_with_criteria_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_agg_with_script.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_timechart.yamlinteg-test/src/test/resources/expectedOutput/calcite/chart_multiple_group_keys.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/terms_significant_2.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q6.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q23.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q42.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q18.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q38.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q28.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q12.yaml
**/*.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*.java: UsePascalCasefor class names (e.g.,QueryExecutor)
UsecamelCasefor method and variable names (e.g.,executeQuery)
UseUPPER_SNAKE_CASEfor constants (e.g.,MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
PreferOptional<T>for nullable returns in Java
Avoid unnecessary object creation in loops
UseStringBuilderfor string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code
Files:
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/SortExprIndexScanRule.javacore/src/main/java/org/opensearch/sql/calcite/plan/PPLAggGroupMergeRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/ProjectIndexScanRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/SortIndexScanRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/ExpandCollationOnProjectExprRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/RareTopPushdownRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/PushDownContext.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/SortAggregateMeasureRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/OpenSearchIndexRules.javacore/src/main/java/org/opensearch/sql/calcite/plan/PPLAggregateConvertRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.javacore/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/FilterIndexScanRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/ProjectDigest.javaopensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.javaopensearch/src/main/java/org/opensearch/sql/opensearch/request/AggregateAnalyzer.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/AggregateIndexScanRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/LimitIndexScanRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/RelevanceFunctionPushdownRule.javaopensearch/src/test/java/org/opensearch/sql/opensearch/request/AggregateAnalyzerTest.java
⚙️ CodeRabbit configuration file
**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring
- Flag classes >500 lines as needing organization review
- Check for dead code, unused imports, and unused variables
- Identify code reuse opportunities across similar implementations
- Assess holistic maintainability - is code easy to understand and modify?
- Flag code that appears AI-generated without sufficient human review
- Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
- Check for proper JavaDoc on public classes and methods
- Flag redundant comments that restate obvious code
- Ensure proper error handling with specific exception types
- Check for Optional usage instead of null returns
- Validate proper use of try-with-resources for resource management
Files:
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/SortExprIndexScanRule.javacore/src/main/java/org/opensearch/sql/calcite/plan/PPLAggGroupMergeRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/ProjectIndexScanRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/SortIndexScanRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/ExpandCollationOnProjectExprRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/RareTopPushdownRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/PushDownContext.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/SortAggregateMeasureRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/OpenSearchIndexRules.javacore/src/main/java/org/opensearch/sql/calcite/plan/PPLAggregateConvertRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.javacore/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/FilterIndexScanRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/ProjectDigest.javaopensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.javaopensearch/src/main/java/org/opensearch/sql/opensearch/request/AggregateAnalyzer.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/AggregateIndexScanRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/LimitIndexScanRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/RelevanceFunctionPushdownRule.javaopensearch/src/test/java/org/opensearch/sql/opensearch/request/AggregateAnalyzerTest.java
core/src/main/java/**/*.java
⚙️ CodeRabbit configuration file
core/src/main/java/**/*.java: - New functions MUST have unit tests in the same commit
Files:
core/src/main/java/org/opensearch/sql/calcite/plan/PPLAggGroupMergeRule.javacore/src/main/java/org/opensearch/sql/calcite/plan/PPLAggregateConvertRule.javacore/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.java
**/calcite/**/*.java
⚙️ CodeRabbit configuration file
**/calcite/**/*.java: - Follow existing Calcite integration patterns
- Verify RelNode visitor implementations are complete
- Check RexNode handling follows project conventions
- Validate SQL generation is correct and optimized
- Ensure Calcite version compatibility
- Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
- Document any Calcite-specific workarounds
- Test compatibility with Calcite version constraints
Files:
core/src/main/java/org/opensearch/sql/calcite/plan/PPLAggGroupMergeRule.javacore/src/main/java/org/opensearch/sql/calcite/plan/PPLAggregateConvertRule.javacore/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.java
**/*Test.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*Test.java: All new business logic requires unit tests
Name unit tests with*Test.javasuffix in OpenSearch SQL
Files:
opensearch/src/test/java/org/opensearch/sql/opensearch/request/AggregateAnalyzerTest.java
**/test/**/*.java
⚙️ CodeRabbit configuration file
**/test/**/*.java: - Verify NULL input tests for all new functions
- Check boundary condition tests (min/max values, empty inputs)
- Validate error condition tests (invalid inputs, exceptions)
- Ensure multi-document tests for per-document operations
- Flag smoke tests without meaningful assertions
- Check test naming follows pattern: test
- Verify test data is realistic and covers edge cases
- Verify test coverage for new business logic
- Ensure tests are independent and don't rely on execution order
- Validate meaningful test data that reflects real-world scenarios
- Check for proper cleanup of test resources
Files:
opensearch/src/test/java/org/opensearch/sql/opensearch/request/AggregateAnalyzerTest.java
🧠 Learnings (6)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Applied to files:
integ-test/src/test/resources/expectedOutput/calcite/clickbench/q13.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q14.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/cardinality_agg_high.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr3.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_keepempty_false_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/agg_composite_date_range_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q37.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q8.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex2.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_agg_paginating_head_from.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex3.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/keyword_terms.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_count_agg_push5.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/keyword_terms_low_cardinality.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q5.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_count_agg_push3.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_timechart_count.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_script_push_on_text.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_agg_sort_on_measure2.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/date_histogram_minute_agg.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q11.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_agg_sort_on_measure_complex1.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_join_with_fields_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_expr_no_expr_output_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q39.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/cardinality_agg_low.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr4.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/composite_date_histogram_daily.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q29.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_count_agg_push2.yamlinteg-test/src/test/resources/expectedOutput/calcite/chart_with_limit.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/multi_terms_keyword.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/composite_terms_keyword.yamlopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/ExpandCollationOnProjectExprRule.javainteg-test/src/test/resources/expectedOutput/calcite/explain_multiple_agg_with_sort_on_one_measure_not_push1.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_scalar_correlated_subquery_in_select.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_sort_then_agg_push.jsoninteg-test/src/test/resources/expectedOutput/calcite/clickbench/q34.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr4_alternative.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q10.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_agg_paginating_join4.yamlopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/RareTopPushdownRule.javainteg-test/src/test/resources/expectedOutput/calcite/chart_timestamp_span_and_category.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q33.yamlinteg-test/src/test/resources/expectedOutput/calcite/chart_with_integer_span.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q36.yamlinteg-test/src/test/resources/expectedOutput/calcite/chart_single_group_key.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_limit_agg_pushdown_bucket_nullable1.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_expr_pushdown_for_smj_w_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q40.yamlopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/SortAggregateMeasureRule.javainteg-test/src/test/resources/expectedOutput/calcite/explain_scalar_correlated_subquery_in_where.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q17.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q41.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q31.yamlinteg-test/src/test/resources/expectedOutput/calcite/chart_use_other.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr3_alternative.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q22.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q19.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q2.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q16.yamlcore/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.javainteg-test/src/test/resources/expectedOutput/calcite/explain_count_agg_push6.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q9.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_filter_agg_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex1.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q43.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q15.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex4.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q32.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_multiple_agg_with_sort_on_one_measure_not_push2.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr4.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q21.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_output.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/composite_terms.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_filter_with_search.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr4_alternative.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/terms_significant_1.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/cardinality_agg_high_2.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_limit_agg_pushdown3.jsoninteg-test/src/test/resources/expectedOutput/calcite/explain_join_with_criteria_max_option.yamlopensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.javainteg-test/src/test/resources/expectedOutput/calcite/explain_agg_with_script.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_timechart.yamlinteg-test/src/test/resources/expectedOutput/calcite/chart_multiple_group_keys.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/terms_significant_2.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q6.yamlopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/AggregateIndexScanRule.javainteg-test/src/test/resources/expectedOutput/calcite/clickbench/q23.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q42.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q18.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q38.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q28.yamlopensearch/src/test/java/org/opensearch/sql/opensearch/request/AggregateAnalyzerTest.javainteg-test/src/test/resources/expectedOutput/calcite/clickbench/q12.yaml
📚 Learning: 2025-12-11T05:27:39.856Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 0
File: :0-0
Timestamp: 2025-12-11T05:27:39.856Z
Learning: In opensearch-project/sql, for SEMI and ANTI join types in CalciteRelNodeVisitor.java, the `max` option has no effect because these join types only use the left side to filter records based on the existence of matches in the right side. The join results are identical regardless of max value (max=1, max=2, or max=∞). The early return for SEMI/ANTI joins before processing the `max` option is intentional and correct behavior.
Applied to files:
integ-test/src/test/resources/expectedOutput/calcite/big5/cardinality_agg_high.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_agg_paginating_head_from.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/keyword_terms.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_count_agg_push5.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_join_with_fields_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q39.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q29.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_count_agg_push2.yamlopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/ExpandCollationOnProjectExprRule.javainteg-test/src/test/resources/expectedOutput/calcite/explain_multiple_agg_with_sort_on_one_measure_not_push1.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_scalar_correlated_subquery_in_select.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_agg_paginating_join4.yamlinteg-test/src/test/resources/expectedOutput/calcite/chart_timestamp_span_and_category.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_expr_pushdown_for_smj_w_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_scalar_correlated_subquery_in_where.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q41.yamlinteg-test/src/test/resources/expectedOutput/calcite/chart_use_other.yamlcore/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.javainteg-test/src/test/resources/expectedOutput/calcite/explain_multiple_agg_with_sort_on_one_measure_not_push2.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_limit_agg_pushdown3.jsoninteg-test/src/test/resources/expectedOutput/calcite/explain_join_with_criteria_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q42.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q28.yaml
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration
Applied to files:
core/src/main/java/org/opensearch/sql/calcite/plan/PPLAggGroupMergeRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/ExpandCollationOnProjectExprRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/RareTopPushdownRule.javacore/src/main/java/org/opensearch/sql/calcite/plan/PPLAggregateConvertRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.javacore/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/RelevanceFunctionPushdownRule.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Document Calcite-specific workarounds in code
Applied to files:
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/RareTopPushdownRule.javainteg-test/src/test/resources/expectedOutput/calcite/chart_timestamp_span_and_category.yamlcore/src/main/java/org/opensearch/sql/calcite/plan/PPLAggregateConvertRule.javainteg-test/src/test/resources/expectedOutput/calcite/chart_use_other.yamlcore/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.javaopensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Prefer `Optional<T>` for nullable returns in Java
Applied to files:
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.javaopensearch/src/main/java/org/opensearch/sql/opensearch/request/AggregateAnalyzer.java
🧬 Code graph analysis (2)
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.java (2)
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/RareTopPushdownRule.java (1)
Value(26-107)opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/RelevanceFunctionPushdownRule.java (1)
Value(30-122)
core/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.java (1)
core/src/main/java/org/opensearch/sql/calcite/utils/OpenSearchTypeFactory.java (1)
OpenSearchTypeFactory(63-411)
Signed-off-by: Heng Qian <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.java (2)
43-52: Consider adding a defensive length check for consistency.Other rules in the codebase (e.g.,
RelevanceFunctionPushdownRule,PPLAggGroupMergeRule) include an assertion to validatecall.rels.length. While the operand pattern guarantees 6 operands, adding a check improves defensive coding and debugging clarity.🔎 Proposed addition
@Override protected void onMatchImpl(RelOptRuleCall call) { + if (call.rels.length != 6) { + throw new AssertionError( + String.format("The length of rels should be 6 but got %s", call.rels.length)); + } final LogicalProject finalProject = call.rel(0); // TODO Used when number of duplication is more than 1 final LogicalFilter numOfDedupFilter = call.rel(1);
128-137: Pruning integration follows established pattern.The
PlanUtils.tryPruneRelNodes(call)invocation aftertransformTois consistent with otherSubstitutionRuleimplementations in the codebase (e.g.,RelevanceFunctionPushdownRule,RareTopPushdownRule).Note: The
assertstatement on lines 128-129 may be disabled in production JVMs. If this invariant is critical for correctness, consider using an explicit check that throws an appropriate exception.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java
🧰 Additional context used
📓 Path-based instructions (1)
**/*.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*.java: UsePascalCasefor class names (e.g.,QueryExecutor)
UsecamelCasefor method and variable names (e.g.,executeQuery)
UseUPPER_SNAKE_CASEfor constants (e.g.,MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
PreferOptional<T>for nullable returns in Java
Avoid unnecessary object creation in loops
UseStringBuilderfor string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code
Files:
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java
⚙️ CodeRabbit configuration file
**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring
- Flag classes >500 lines as needing organization review
- Check for dead code, unused imports, and unused variables
- Identify code reuse opportunities across similar implementations
- Assess holistic maintainability - is code easy to understand and modify?
- Flag code that appears AI-generated without sufficient human review
- Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
- Check for proper JavaDoc on public classes and methods
- Flag redundant comments that restate obvious code
- Ensure proper error handling with specific exception types
- Check for Optional usage instead of null returns
- Validate proper use of try-with-resources for resource management
Files:
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java
🧠 Learnings (4)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration
Applied to files:
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Applied to files:
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Document Calcite-specific workarounds in code
Applied to files:
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java
🧬 Code graph analysis (1)
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.java (2)
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/RareTopPushdownRule.java (1)
Value(26-107)opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/RelevanceFunctionPushdownRule.java (1)
Value(30-122)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
- GitHub Check: build-linux (21, integration)
- GitHub Check: build-linux (21, doc)
- GitHub Check: build-linux (25, unit)
- GitHub Check: build-linux (21, unit)
- GitHub Check: build-linux (25, doc)
- GitHub Check: build-linux (25, integration)
- GitHub Check: bwc-tests-full-restart (21)
- GitHub Check: bwc-tests-rolling-upgrade (25)
- GitHub Check: bwc-tests-rolling-upgrade (21)
- GitHub Check: bwc-tests-full-restart (25)
- GitHub Check: security-it-linux (21)
- GitHub Check: security-it-linux (25)
- GitHub Check: build-windows-macos (macos-14, 25, integration)
- GitHub Check: build-windows-macos (macos-14, 21, doc)
- GitHub Check: build-windows-macos (macos-14, 25, unit)
- GitHub Check: build-windows-macos (macos-14, 21, integration)
- GitHub Check: build-windows-macos (macos-14, 25, doc)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (macos-14, 21, unit)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
- GitHub Check: security-it-windows-macos (windows-latest, 25)
- GitHub Check: security-it-windows-macos (macos-14, 21)
- GitHub Check: security-it-windows-macos (windows-latest, 21)
- GitHub Check: security-it-windows-macos (macos-14, 25)
- GitHub Check: test-sql-cli-integration (21)
- GitHub Check: CodeQL-Scan (java)
🔇 Additional comments (7)
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.java (3)
34-41: LGTM!The implementation of
SubstitutionRuleis consistent with the PR objectives and follows the same pattern used by other rules in the codebase (e.g.,RelevanceFunctionPushdownRule,RareTopPushdownRule,PPLAggGroupMergeRule). This enables the pruning behavior for removing redundant RelNodes after substitution.
88-113: LGTM!The projection building logic correctly prioritizes dedup columns first (for subsequent aggregation grouping) and appends remaining non-ROW_NUMBER columns. The dual source resolution (from
projectWithWindoworbottomProjectviaRexInputRef) handles both expression-based and reference-based dedup columns appropriately.
140-181: LGTM!The operand structure correctly matches the 6-level pattern required for dedup pushdown:
- Final project (no ROW_NUMBER)
- Filter (dedup number condition)
- Project with window (ROW_NUMBER)
- Filter (bucket non-null)
- Bottom project
- Index scan
The config rename to
DEFAULTand removal ofisProjectPushedconstraint fromtableScanCheckeralign with the PR's refactoring goals.opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java (4)
13-13: LGTM!The new imports are properly used:
@Nullablefor thepushDownAggregateparameter andProjectDigestfor the project push-down digest wrapper.Also applies to: 62-62
269-272: LGTM!The change to use
ProjectDigestprovides richer context by capturing both field names and selected column indices, which supports the new pruning logic by enabling more precise digest comparisons.
344-354: LGTM!The updated signature with
@Nullable Project projectand the use ofcloneForAggregate(aggregate, project)properly supports the aggregate-aware push-down path where project removal may occur during optimization.
401-419: No issues found. The code uses.reversed()which is a Java 21 method, and this is consistent with the project's build configuration that explicitly targets Java 21 (sourceCompatibility = JavaVersion.VERSION_21,targetCompatibility = JavaVersion.VERSION_21). The logic correctly determines whether a new limit can reduce the estimated row count by comparing against any previously pushed limit before the aggregation.
# Conflicts: # opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.java
Signed-off-by: Heng Qian <[email protected]>
LantaoJin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to verify the checksum of clickbench result before being merged
| public class PPLAggGroupMergeRule extends RelRule<PPLAggGroupMergeRule.Config> | ||
| implements SubstitutionRule { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be wrapped in InterruptibleRelRule?
public class PPLAggGroupMergeRule extends InterruptibleRelRule<PPLAggGroupMergeRule.Config> implements SubstitutionRule
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
InterruptibleRelRule is in package opensearch and has dependency on OpenSearchTimeoutException while this rule is package core.
Therefore, we cannot make this extends InterruptibleRelRule unless move that from package opensearch to core and add library opensearch in core gradle.
On the other hand, if there is interrupt triggered in planning process, it should be detected in our push down rules in package opensearch.
core/src/main/java/org/opensearch/sql/calcite/plan/PPLAggGroupMergeRule.java
Show resolved
Hide resolved
integ-test/src/test/resources/expectedOutput/calcite/clickbench/q17.yaml
Show resolved
Hide resolved
integ-test/src/test/resources/expectedOutput/calcite/clickbench/q18.yaml
Show resolved
Hide resolved
integ-test/src/test/resources/expectedOutput/calcite/explain_agg_with_script.yaml
Show resolved
Hide resolved
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.java
Show resolved
Hide resolved
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.java
Show resolved
Hide resolved
| throw new IllegalStateException(String.format("Cannot infer value from RexNode %s", node)); | ||
| } | ||
|
|
||
| RexNode inferRexNodeFromIndex(int index, Project project) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add private
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should give the methods in AggregateBuilderHelper package level accessibility. I see all methods in this class are using default symbol
| return project == null ? RexInputRef.of(index, rowType) : project.getProjects().get(index); | ||
| } | ||
|
|
||
| String inferFieldNameFromIndex(int index, Project project) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
Description
This PR primally supports pruning the old operator if we get a better plan than before. Therefore, it will improve the efficiency of planning process by avoid exploring meaningless equivalent plans.
The performance gain is shown below:
Some positive cases on plan with this PR:
Implementation Details
As described in #4931 (comment), there is also many issues and bug spotted after pruning old. So there is additional change to fix them and make it compatible:
PPLAggregateConvertRule,PPLAggGroupMergeRule, 'RareTopPushdownRule', 'DedupPushDownRule' implementsSubstitutionRuleso they will get higher priority on rule match and then we can get optimized aggregates in RelSubset before pruning.project,sortand agg derivedfilterwhen doing aggregate push down.AggregateIndexScanRuleso it can support pushing down on more casesDedupPushDownRuleso it can get compatible with the current pruning mechanismAggregateAnalyzerwhen the project is null. See UT inAggregateAnalyzerTest, its expected results is wrong before.Related Issues
Resolves #4931
Check List
--signoffor-s.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.