-
Notifications
You must be signed in to change notification settings - Fork 181
RexCall and RelDataType standardization for script push down #4914
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Heng Qian <[email protected]>
Signed-off-by: Heng Qian <[email protected]>
📝 WalkthroughSummary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings. WalkthroughAdds RexNode standardization and numeric-type widening into script serialization, extends ScriptParameterHelper to accept and expose a RexBuilder and widening stack, updates its call sites, adapts tests/benchmarks, and refreshes many integration expected outputs to reflect modified serialized script payloads. Changes
Sequence Diagram(s)mermaid Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes
Suggested reviewers
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro 📒 Files selected for processing (5)
🧰 Additional context used📓 Path-based instructions (5)**/*.java📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
Files:
⚙️ CodeRabbit configuration file
Files:
integ-test/**/*IT.java📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
Files:
⚙️ CodeRabbit configuration file
Files:
**/*IT.java📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
Files:
**/test/**/*.java⚙️ CodeRabbit configuration file
Files:
**/calcite/**/*.java⚙️ CodeRabbit configuration file
Files:
🧠 Learnings (3)📚 Learning: 2025-12-02T17:27:55.938ZApplied to files:
📚 Learning: 2025-12-02T17:27:55.938ZApplied to files:
📚 Learning: 2025-12-02T17:27:55.938ZApplied to files:
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
🔇 Additional comments (5)
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (5)
integ-test/src/test/resources/expectedOutput/calcite/explain_filter_function_script_push.yaml (1)
8-8: Consider adding inline documentation for complex embedded scripts.The base64-encoded script payloads make this test resource difficult to review and maintain. For future maintainability:
- Add a comment before the physical plan section explaining what transformations the scripts represent (e.g., "Script standardized to use RexDynamicParameter for field references and literals").
- Consider documenting the expected structure of
SOURCESandDIGESTSarrays to clarify how they map to the new parameter scheme.This will help reviewers and maintainers understand the intent of the standardization without requiring them to manually decode the base64 content.
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/serde/RexStandardizer.java (3)
67-75: Silent exception swallowing may hide issues.The empty catch block silently ignores all exceptions from
RexUtil.expandSearch. While the fallback to use the original call is acceptable, logging the exception at DEBUG level would help with troubleshooting.try { return RexUtil.expandSearch(helper.rexBuilder, null, call).accept(this, helper); - } catch (Exception ignored) { + } catch (Exception e) { // Continue to use the original call if failing to expand. // We can downgrade to still use `Sarg` literal instead of replacing it with parameter. + // Log at debug level for troubleshooting }
107-107: Potential NPE when stack is empty during nested visitation.
helper.stack.peek()returnsnullwhen the stack is empty. This can occur whenvisitInputReforvisitLiteralis called outside of avisitCallcontext (e.g., top-level literal or input ref). ThewidenTypemethod handlesnullcorrectly withallowNumericTypeWiden == null || allowNumericTypeWiden, but this implicit behavior should be documented.Consider adding a brief comment explaining the null-safe behavior:
// stack.peek() returns null when outside RexCall context, which enables widening by default return new RexDynamicParam(widenType(field.getType(), helper.stack.peek()), newIndex);Also applies to: 129-129
248-248: Prefer throwing an exception over assertion for runtime invariant.Assertions can be disabled at runtime with
-daJVM flag. For a critical invariant that indicates a bug, throwing an explicit exception ensures the check is never bypassed.- assert helper.stack.isEmpty() : "[BUG]Stack should be empty before standardizing"; + if (!helper.stack.isEmpty()) { + throw new IllegalStateException("[BUG] Stack should be empty before standardizing"); + }opensearch/src/main/java/org/opensearch/sql/opensearch/storage/serde/ScriptParameterHelper.java (1)
35-38: Consider encapsulating stack access.The
stackfield isfinalbut mutable, and Lombok's@Getterexposes it directly.RexStandardizeraccesseshelper.stack.push(),helper.stack.pop(), andhelper.stack.peek()directly. While this works, exposing the rawDequeallows callers to manipulate it arbitrarily, which could lead to misuse.Consider providing specific methods for stack operations and marking the field as package-private or excluding it from the getter:
@Getter(AccessLevel.NONE) final Deque<Boolean> stack = new ArrayDeque<>(); public void pushNumericWidenFlag(boolean allow) { stack.push(allow); } public void popNumericWidenFlag() { stack.pop(); } public Boolean peekNumericWidenFlag() { return stack.peek(); }This is optional since
RexStandardizeris in the same package and the current approach is simpler.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (37)
benchmarks/src/jmh/java/org/opensearch/sql/expression/operator/predicate/ExpressionScriptSerdeBenchmark.java(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/agg_case_cannot_push.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/agg_case_composite_cannot_push.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/agg_case_num_res_cannot_push.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/clickbench/q19.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/clickbench/q29.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/clickbench/q40.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/explain_agg_counts_by6.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/explain_agg_script_udt_arg_push.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/explain_agg_with_script.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_expr_no_expr_output_push.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_expr_project_then_sort.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_expr_push.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_expr_single_expr_output_push.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_nested_expr.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/explain_count_agg_push7.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/explain_filter_function_script_push.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/explain_filter_script_ip_push.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/explain_filter_script_push.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/explain_isblank.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/explain_isempty_or_others.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/explain_min_max_agg_on_derived_field.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/explain_patterns_simple_pattern_agg_push.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/explain_regex.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/explain_regex_negated.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/explain_regexp_match_in_where.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/explain_script_push_on_text.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/explain_skip_script_encoding.json(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/explain_sort_complex_and_simple_expr.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/explain_text_ilike_function.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/explain_text_like_function.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/explain_text_like_function_case_insensitive.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/udf_geoip_in_agg_pushed.yaml(1 hunks)opensearch/src/main/java/org/opensearch/sql/opensearch/request/PredicateAnalyzer.java(1 hunks)opensearch/src/main/java/org/opensearch/sql/opensearch/storage/serde/RexStandardizer.java(7 hunks)opensearch/src/main/java/org/opensearch/sql/opensearch/storage/serde/ScriptParameterHelper.java(3 hunks)opensearch/src/test/java/org/opensearch/sql/opensearch/storage/serde/RelJsonSerializerTest.java(6 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*.java: UsePascalCasefor class names (e.g.,QueryExecutor)
UsecamelCasefor method and variable names (e.g.,executeQuery)
UseUPPER_SNAKE_CASEfor constants (e.g.,MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
PreferOptional<T>for nullable returns in Java
Avoid unnecessary object creation in loops
UseStringBuilderfor string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code
Files:
benchmarks/src/jmh/java/org/opensearch/sql/expression/operator/predicate/ExpressionScriptSerdeBenchmark.javaopensearch/src/main/java/org/opensearch/sql/opensearch/storage/serde/ScriptParameterHelper.javaopensearch/src/main/java/org/opensearch/sql/opensearch/request/PredicateAnalyzer.javaopensearch/src/test/java/org/opensearch/sql/opensearch/storage/serde/RelJsonSerializerTest.javaopensearch/src/main/java/org/opensearch/sql/opensearch/storage/serde/RexStandardizer.java
**/*Test.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*Test.java: All new business logic requires unit tests
Name unit tests with*Test.javasuffix in OpenSearch SQL
Files:
opensearch/src/test/java/org/opensearch/sql/opensearch/storage/serde/RelJsonSerializerTest.java
🧠 Learnings (3)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Applied to files:
integ-test/src/test/resources/expectedOutput/calcite/clickbench/q19.yamlinteg-test/src/test/resources/expectedOutput/calcite/agg_case_num_res_cannot_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_filter_function_script_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_regexp_match_in_where.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_sort_complex_and_simple_expr.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_isempty_or_others.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_regex_negated.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_agg_counts_by6.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_patterns_simple_pattern_agg_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_skip_script_encoding.jsoninteg-test/src/test/resources/expectedOutput/calcite/agg_case_cannot_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_text_like_function_case_insensitive.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q40.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_isblank.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_expr_project_then_sort.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_expr_no_expr_output_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q29.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_nested_expr.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_expr_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_expr_single_expr_output_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_min_max_agg_on_derived_field.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_agg_with_script.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_text_ilike_function.yamlinteg-test/src/test/resources/expectedOutput/calcite/udf_geoip_in_agg_pushed.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_count_agg_push7.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_regex.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_script_push_on_text.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_agg_script_udt_arg_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_filter_script_ip_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/agg_case_composite_cannot_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_filter_script_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_text_like_function.yaml
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration
Applied to files:
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/serde/RexStandardizer.java
🧬 Code graph analysis (2)
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/serde/ScriptParameterHelper.java (1)
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/SortExprDigest.java (1)
Getter(23-111)
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/serde/RexStandardizer.java (1)
core/src/main/java/org/opensearch/sql/calcite/utils/OpenSearchTypeFactory.java (1)
OpenSearchTypeFactory(63-411)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (27)
- GitHub Check: build-linux (25, integration)
- GitHub Check: build-linux (21, integration)
- GitHub Check: build-linux (25, doc)
- GitHub Check: build-linux (21, doc)
- GitHub Check: bwc-tests-full-restart (21)
- GitHub Check: build-linux (25, unit)
- GitHub Check: build-linux (21, unit)
- GitHub Check: bwc-tests-full-restart (25)
- GitHub Check: bwc-tests-rolling-upgrade (25)
- GitHub Check: bwc-tests-rolling-upgrade (21)
- GitHub Check: security-it-linux (21)
- GitHub Check: security-it-linux (25)
- GitHub Check: build-windows-macos (macos-14, 25, doc)
- GitHub Check: build-windows-macos (macos-14, 25, unit)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (macos-14, 21, integration)
- GitHub Check: build-windows-macos (macos-14, 25, integration)
- GitHub Check: build-windows-macos (macos-14, 21, doc)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (macos-14, 21, unit)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
- GitHub Check: security-it-windows-macos (macos-14, 21)
- GitHub Check: security-it-windows-macos (windows-latest, 25)
- GitHub Check: security-it-windows-macos (macos-14, 25)
- GitHub Check: security-it-windows-macos (windows-latest, 21)
- GitHub Check: CodeQL-Scan (java)
🔇 Additional comments (36)
integ-test/src/test/resources/expectedOutput/calcite/explain_filter_function_script_push.yaml (1)
8-8: Verify the embedded script payload transformations align with RexNode standardization logic.The physical plan's embedded script sources (base64-encoded) have changed, which is expected given the RexNode standardization implementation. However, without decoding these scripts and comparing them to the implementation changes, I cannot verify the correctness of the transformations.
Key concerns:
- The
SOURCESarray[0,2]andDIGESTSarray structure need verification to ensure they correctly reflect the standardized dynamic parameters.- Confirm that the parameter remapping from field references and literals to
RexDynamicParameteris correctly represented in the script payload.- Ensure this test output aligns with the changes in
ScriptParameterHelperand the RexBuilder integration mentioned in the PR context.Please confirm:
- The test passes after the RexNode standardization implementation changes.
- The base64-encoded scripts correctly represent the standardized RexNode expressions (e.g.,
RexInputRef→RexDynamicParameter, type narrowing applied).- The parameter arrays (
SOURCES,DIGESTS) correctly map to the new standardized parameters and are consistent withScriptParameterHelper's new behavior.Alternatively, if you can provide details on how these expected outputs were generated or validated, that would help verify the standardization is working as intended.
integ-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_nested_expr.yaml (1)
10-10: Verify whether line 10 is a duplicate or a legitimate expected-output change.The provided snippet shows only the final state of line 10 without context of removed/replaced code. This makes it unclear whether the CalciteEnumerableIndexScan represents:
- A legitimate update to expected output due to RexNode standardization changes, or
- An unintended duplication requiring removal.
Additionally, validate that the Base64-encoded script within the PushDownContext reflects the correct standardization transformations (e.g., dynamic parameters replacing literals, type widening to BIGINT).
integ-test/src/test/resources/expectedOutput/calcite/explain_patterns_simple_pattern_agg_push.yaml (1)
10-10: Test expected output updated with new script serialization.The base64-encoded script payload has changed, reflecting the RexNode standardization. The surrounding structure (PushDownContext, aggregation metadata, field references) is preserved correctly.
To validate the script content change is correct, please verify against the RexStandardizer implementation that the new base64 payload represents the expected standardized form of the expression. Since the actual source files (ScriptParameterHelper.java, RexStandardizer.java, PredicateAnalyzer.java) are not included in the review context, I cannot decode and validate the new script bytecode.
integ-test/src/test/resources/expectedOutput/calcite/clickbench/q40.yaml (1)
16-16: Aggregation script payload updated for RexNode standardization.The base64-encoded script in the multi-term aggregation for "Src" field has been updated. The PushDownContext (PROJECT, FILTER, AGGREGATION, SORT_AGG_METRICS, LIMIT) and overall request structure remain consistent.
Confirm that the new script payload represents the standardized form from RexStandardizer. Additionally, verify that the "SOURCES" and "DIGESTS" arrays in the script parameters match the expected field mappings and constant values after standardization.
integ-test/src/test/resources/expectedOutput/calcite/clickbench/q29.yaml (1)
18-18: Complex aggregation plan updated with standardized scripts.The plan includes multiple embedded scripts (REGEXP_REPLACE for "k" and CHAR_LENGTH for "l"). Both base64-encoded payloads have been updated, while the composite aggregation structure and nested aggregations remain intact.
Verify that both embedded scripts use standardized RexNode representations and that their parameter mappings (SOURCES/DIGESTS arrays) are correctly aligned with the composite aggregation structure. Confirm script parameter indices don't conflict across multiple scripts in the same request.
integ-test/src/test/resources/expectedOutput/calcite/agg_case_num_res_cannot_push.yaml (1)
9-9: CASE expression script updated with literal standardization.The script payload reflects a CASE expression with standardized literal parameters. The DIGESTS array shows "age" at index 0 and numeric literals (30, 30, 100) assigned to dynamicParam indices, consistent with literal standardization objectives.
Confirm that identical literals (e.g., the two
30values) are correctly mapped to the same dynamicParam index in SOURCES array [0,2,2,2], enabling script cache reuse across queries with different literal values.integ-test/src/test/resources/expectedOutput/calcite/clickbench/q19.yaml (1)
12-12: UDT (User-Defined Type) script updated for EXTRACT function.The script now includes a UDT field (
"udt": "EXPR_TIMESTAMP") for timestamp handling, which may relate to the RFC objective of addressing "timestamp encoding that prevents cache hits." The EXTRACT('minute') function is marked as deterministic.Confirm that the UDT standardization approach (including timestamp UDT metadata) enables script cache sharing across queries with different timestamp values. Verify that the deterministic flag is set correctly based on the function's idempotency.
integ-test/src/test/resources/expectedOutput/calcite/explain_sort_complex_and_simple_expr.yaml (1)
10-10: Sort expression script with potential parameter mapping concern.The script for sort operation (+($10, $7)) shows SOURCES=[0,0] with DIGESTS=["age","balance"]. Verify that this mapping correctly represents the arithmetic operation; SOURCES array may need indices [0,1] if operands reference different fields.
Confirm the SOURCES array values [0,0] are intentional. If both operands should reference distinct fields (age and balance), SOURCES should likely be [0,1]. Additionally, verify the "MISSING_MAX": false parameter is set correctly for the sort behavior.
integ-test/src/test/resources/expectedOutput/calcite/explain_skip_script_encoding.json (1)
1-1: JSON structure reformatted with standardized scripts.The file now uses a single top-level JSON object with nested "calcite" structure. Two embedded scripts within the AND condition both use standardized RexNode representations with SOURCES/DIGESTS parameter arrays.
Confirm that:
- The JSON structure change (single top-level object vs. separate blocks) is intentional and aligns with test infrastructure changes
- Both embedded scripts have non-conflicting parameter indices (Script 1: SOURCES=[1,2]; Script 2: SOURCES=[0,2,2])
- The escaped JSON content is complete and correctly represents the original script expressions
integ-test/src/test/resources/expectedOutput/calcite/agg_case_cannot_push.yaml (1)
9-9: Nested CASE expression script with comprehensive literal standardization.The script encodes a complex CASE with nested conditions and AND operators. The SOURCES array [0,2,2,0,2,0,2,2,2] demonstrates literal standardization where identical values (e.g., 30 at indices 1&2, 40, 100) are reused via consistent dynamicParam references.
Verify that:
- The nested CASE expression is correctly flattened and standardized
- Literal reuse across multiple branches (DIGESTS: age, 30, 30, 40, 100) reduces script size as intended
- The SOURCES indices correctly map operands to parameters for all nested conditions
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/serde/RexStandardizer.java (3)
46-57: Good documentation of standardization steps.The updated JavaDoc clearly documents the standardization process including the type widening step. The TODO for constant folding is appropriately noted.
76-87: Good stack management with try-finally.The stack push/pop pattern using try-finally ensures proper cleanup even if exceptions occur during operand processing. The determination of
allowNumericTypeWidenbased on arithmetic/comparison operations is appropriate.
228-244: Type widening logic is correct and well-documented.The widening rules are appropriate:
- Integer types → BIGINT
- Floating-point/decimal types → DOUBLE
- Character types → VARCHAR
The nullable=true for all widened types ensures consistency across standardized expressions.
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/serde/ScriptParameterHelper.java (1)
61-79: Constructor changes are appropriate.The addition of
RexBuilderparameter to both constructors and proper delegation is correct. The field assignments are consistent.integ-test/src/test/resources/expectedOutput/calcite/explain_regexp_match_in_where.yaml (1)
7-8: Test expectation updated for type standardization.The updated base64-encoded script reflects the new type widening (VARCHAR with
nullable: trueandprecision: -1). TheSOURCESandDIGESTSparams ([1,2]and["name","hello"]) correctly identify the field reference (source 1) and literal value (source 2).integ-test/src/test/resources/expectedOutput/calcite/explain_text_like_function_case_insensitive.yaml (1)
7-8: Test expectation correctly reflects ILIKE standardization.The updated script payload shows proper standardization of the ILIKE expression with three operands. The
SOURCESarray[1,2,2]correctly identifies: address field (source 1=SOURCE), and two literals (source 2=LITERAL). All parameters are widened to VARCHAR as expected.integ-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_expr_single_expr_output_push.yaml (1)
9-10: Test expectation shows correct BIGINT widening for arithmetic expression.The sort expression
+($0, $1)correctly has both operands widened toBIGINT(nullable: true) since it's a binary arithmetic operation. TheSOURCESarray[0,0]correctly identifies both as DOC_VALUE sources, andDIGESTScontains the field names["age","balance"].integ-test/src/test/resources/expectedOutput/calcite/explain_filter_script_push.yaml (1)
8-8: Test expected output requires verification against implementation code.The embedded script payload (base64-encoded) has been updated, but comprehensive verification requires reviewing the corresponding implementation changes in ScriptParameterHelper, RexStandardizer, and PredicateAnalyzer to confirm the standardization logic correctly transformed the RexNode expressions.
integ-test/src/test/resources/expectedOutput/calcite/explain_regex_negated.yaml (1)
9-9: LGTM - Script payload updated for standardization.The embedded script content reflects the new RexNode standardization. The overall plan structure and parameters remain consistent.
integ-test/src/test/resources/expectedOutput/calcite/explain_agg_script_udt_arg_push.yaml (1)
11-11: LGTM - Aggregation script payload updated.The script standardization is applied consistently within the aggregation push-down context.
integ-test/src/test/resources/expectedOutput/calcite/explain_regex.yaml (1)
9-9: LGTM - Regex script standardization applied.The updated script payload is consistent with the standardization objectives.
integ-test/src/test/resources/expectedOutput/calcite/explain_isblank.yaml (1)
8-8: LGTM - IsBlank function script updated.The script payload change maintains the same logical behavior with standardized representation.
integ-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_expr_project_then_sort.yaml (1)
9-9: LGTM - Sort expression script with type standardization.The script payload shows type widening (INTEGER → BIGINT) which aligns with the type standardization objectives. The sort expression push-down is correctly maintained.
benchmarks/src/jmh/java/org/opensearch/sql/expression/operator/predicate/ExpressionScriptSerdeBenchmark.java (1)
78-82: LGTM - Benchmark updated for new constructor signature.The benchmark correctly passes the
rexBuilderparameter toScriptParameterHelper, aligning with the constructor signature update. This ensures the benchmark tests the new RexNode standardization serialization path.integ-test/src/test/resources/expectedOutput/calcite/explain_isempty_or_others.yaml (1)
8-8: LGTM - Script standardization for composite filter.The script payload for the OR condition with IS NULL, EQUALS, and IS EMPTY is correctly updated with standardized representation.
integ-test/src/test/resources/expectedOutput/calcite/explain_text_like_function.yaml (1)
8-8: Verify test expectations reflect correct standardization.The base64-encoded script payload has been updated as part of RexNode standardization. Ensure this expected output was regenerated by running the actual integration tests, not manually edited.
integ-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_expr_push.yaml (1)
10-10: Test expectation updated to reflect standardized script generation.The updated base64-encoded script payload is the expected result of RexNode standardization changes implemented in this PR.
integ-test/src/test/resources/expectedOutput/calcite/explain_text_ilike_function.yaml (1)
8-8: Test expectation updated to reflect standardized script generation.The updated base64-encoded script payload aligns with the RexNode standardization changes.
integ-test/src/test/resources/expectedOutput/calcite/explain_min_max_agg_on_derived_field.yaml (1)
8-8: Test expectation updated with standardized aggregation scripts.The updated MIN and MAX aggregation script payloads reflect the standardization changes while maintaining the same aggregation semantics.
opensearch/src/main/java/org/opensearch/sql/opensearch/request/PredicateAnalyzer.java (1)
1472-1474: LGTM! RexBuilder correctly propagated to ScriptParameterHelper.The updated constructor call correctly passes
cluster.getRexBuilder()to enable RexNode standardization in script parameter handling.opensearch/src/test/java/org/opensearch/sql/opensearch/storage/serde/RelJsonSerializerTest.java (6)
64-65: LGTM! Test updated to use new ScriptParameterHelper constructor.The constructor call correctly includes
rexBuilderparameter to support RexNode standardization testing.
112-113: LGTM! Consistent constructor update.The constructor call correctly includes
rexBuilderparameter, consistent with other test methods.
128-129: LGTM! Exception test updated consistently.Constructor call properly updated to include
rexBuilderparameter even in the exception test path.
145-146: LGTM! Consistent constructor update.The constructor call correctly includes
rexBuilderparameter for out-of-scope function testing.
178-179: LGTM! Consistent constructor update.The constructor call correctly includes
rexBuilderparameter for index remapping test.
213-214: LGTM! All test methods consistently updated.The final constructor call correctly includes
rexBuilderparameter, completing the consistent update across all test methods in this file.
| physical: | | ||
| CalciteEnumerableIndexScan(table=[[OpenSearch, opensearch-sql_test_index_weblogs]], PushDownContext=[[AGGREGATION->rel#:LogicalAggregate.NONE.[](input=RelSubset#,group={0},count()=COUNT()), PROJECT->[count(), info.city], LIMIT->10000], OpenSearchRequestBuilder(sourceBuilder={"from":0,"size":0,"timeout":"1m","aggregations":{"composite_buckets":{"composite":{"size":1000,"sources":[{"info.city":{"terms":{"script":{"source":"{\"langType\":\"calcite\",\"script\":\"rO0ABXQErXsKICAib3AiOiB7CiAgICAibmFtZSI6ICJJVEVNIiwKICAgICJraW5kIjogIklURU0iLAogICAgInN5bnRheCI6ICJTUEVDSUFMIgogIH0sCiAgIm9wZXJhbmRzIjogWwogICAgewogICAgICAib3AiOiB7CiAgICAgICAgIm5hbWUiOiAiR0VPSVAiLAogICAgICAgICJraW5kIjogIk9USEVSX0ZVTkNUSU9OIiwKICAgICAgICAic3ludGF4IjogIkZVTkNUSU9OIgogICAgICB9LAogICAgICAib3BlcmFuZHMiOiBbCiAgICAgICAgewogICAgICAgICAgImR5bmFtaWNQYXJhbSI6IDAsCiAgICAgICAgICAidHlwZSI6IHsKICAgICAgICAgICAgInR5cGUiOiAiVkFSQ0hBUiIsCiAgICAgICAgICAgICJudWxsYWJsZSI6IGZhbHNlLAogICAgICAgICAgICAicHJlY2lzaW9uIjogLTEKICAgICAgICAgIH0KICAgICAgICB9LAogICAgICAgIHsKICAgICAgICAgICJkeW5hbWljUGFyYW0iOiAxLAogICAgICAgICAgInR5cGUiOiB7CiAgICAgICAgICAgICJ1ZHQiOiAiRVhQUl9JUCIsCiAgICAgICAgICAgICJ0eXBlIjogIk9USEVSIiwKICAgICAgICAgICAgIm51bGxhYmxlIjogdHJ1ZQogICAgICAgICAgfQogICAgICAgIH0KICAgICAgXSwKICAgICAgImNsYXNzIjogIm9yZy5vcGVuc2VhcmNoLnNxbC5leHByZXNzaW9uLmZ1bmN0aW9uLlVzZXJEZWZpbmVkRnVuY3Rpb25CdWlsZGVyJDEiLAogICAgICAidHlwZSI6IHsKICAgICAgICAidHlwZSI6ICJNQVAiLAogICAgICAgICJudWxsYWJsZSI6IGZhbHNlLAogICAgICAgICJrZXkiOiB7CiAgICAgICAgICAidHlwZSI6ICJWQVJDSEFSIiwKICAgICAgICAgICJudWxsYWJsZSI6IGZhbHNlLAogICAgICAgICAgInByZWNpc2lvbiI6IC0xCiAgICAgICAgfSwKICAgICAgICAidmFsdWUiOiB7CiAgICAgICAgICAidHlwZSI6ICJBTlkiLAogICAgICAgICAgIm51bGxhYmxlIjogZmFsc2UsCiAgICAgICAgICAicHJlY2lzaW9uIjogLTEsCiAgICAgICAgICAic2NhbGUiOiAtMjE0NzQ4MzY0OAogICAgICAgIH0KICAgICAgfSwKICAgICAgImRldGVybWluaXN0aWMiOiB0cnVlLAogICAgICAiZHluYW1pYyI6IGZhbHNlCiAgICB9LAogICAgewogICAgICAiZHluYW1pY1BhcmFtIjogMiwKICAgICAgInR5cGUiOiB7CiAgICAgICAgInR5cGUiOiAiQ0hBUiIsCiAgICAgICAgIm51bGxhYmxlIjogZmFsc2UsCiAgICAgICAgInByZWNpc2lvbiI6IDQKICAgICAgfQogICAgfQogIF0KfQ==\"}","lang":"opensearch_compounded_script","params":{"utcTimestamp": 0,"SOURCES":[2,0,2],"DIGESTS":["my-datasource","host","city"]}},"missing_bucket":true,"missing_order":"first","order":"asc"}}}]}}}}, requestedTotalSize=2147483647, pageSize=null, startFrom=0)]) | ||
| CalciteEnumerableIndexScan(table=[[OpenSearch, opensearch-sql_test_index_weblogs]], PushDownContext=[[AGGREGATION->rel#:LogicalAggregate.NONE.[](input=RelSubset#,group={0},count()=COUNT()), PROJECT->[count(), info.city], LIMIT->10000], OpenSearchRequestBuilder(sourceBuilder={"from":0,"size":0,"timeout":"1m","aggregations":{"composite_buckets":{"composite":{"size":1000,"sources":[{"info.city":{"terms":{"script":{"source":"{\"langType\":\"calcite\",\"script\":\"rO0ABXQEr3sKICAib3AiOiB7CiAgICAibmFtZSI6ICJJVEVNIiwKICAgICJraW5kIjogIklURU0iLAogICAgInN5bnRheCI6ICJTUEVDSUFMIgogIH0sCiAgIm9wZXJhbmRzIjogWwogICAgewogICAgICAib3AiOiB7CiAgICAgICAgIm5hbWUiOiAiR0VPSVAiLAogICAgICAgICJraW5kIjogIk9USEVSX0ZVTkNUSU9OIiwKICAgICAgICAic3ludGF4IjogIkZVTkNUSU9OIgogICAgICB9LAogICAgICAib3BlcmFuZHMiOiBbCiAgICAgICAgewogICAgICAgICAgImR5bmFtaWNQYXJhbSI6IDAsCiAgICAgICAgICAidHlwZSI6IHsKICAgICAgICAgICAgInR5cGUiOiAiVkFSQ0hBUiIsCiAgICAgICAgICAgICJudWxsYWJsZSI6IHRydWUsCiAgICAgICAgICAgICJwcmVjaXNpb24iOiAtMQogICAgICAgICAgfQogICAgICAgIH0sCiAgICAgICAgewogICAgICAgICAgImR5bmFtaWNQYXJhbSI6IDEsCiAgICAgICAgICAidHlwZSI6IHsKICAgICAgICAgICAgInVkdCI6ICJFWFBSX0lQIiwKICAgICAgICAgICAgInR5cGUiOiAiT1RIRVIiLAogICAgICAgICAgICAibnVsbGFibGUiOiB0cnVlCiAgICAgICAgICB9CiAgICAgICAgfQogICAgICBdLAogICAgICAiY2xhc3MiOiAib3JnLm9wZW5zZWFyY2guc3FsLmV4cHJlc3Npb24uZnVuY3Rpb24uVXNlckRlZmluZWRGdW5jdGlvbkJ1aWxkZXIkMSIsCiAgICAgICJ0eXBlIjogewogICAgICAgICJ0eXBlIjogIk1BUCIsCiAgICAgICAgIm51bGxhYmxlIjogZmFsc2UsCiAgICAgICAgImtleSI6IHsKICAgICAgICAgICJ0eXBlIjogIlZBUkNIQVIiLAogICAgICAgICAgIm51bGxhYmxlIjogZmFsc2UsCiAgICAgICAgICAicHJlY2lzaW9uIjogLTEKICAgICAgICB9LAogICAgICAgICJ2YWx1ZSI6IHsKICAgICAgICAgICJ0eXBlIjogIkFOWSIsCiAgICAgICAgICAibnVsbGFibGUiOiBmYWxzZSwKICAgICAgICAgICJwcmVjaXNpb24iOiAtMSwKICAgICAgICAgICJzY2FsZSI6IC0yMTQ3NDgzNjQ4CiAgICAgICAgfQogICAgICB9LAogICAgICAiZGV0ZXJtaW5pc3RpYyI6IHRydWUsCiAgICAgICJkeW5hbWljIjogZmFsc2UKICAgIH0sCiAgICB7CiAgICAgICJkeW5hbWljUGFyYW0iOiAyLAogICAgICAidHlwZSI6IHsKICAgICAgICAidHlwZSI6ICJWQVJDSEFSIiwKICAgICAgICAibnVsbGFibGUiOiB0cnVlLAogICAgICAgICJwcmVjaXNpb24iOiAtMQogICAgICB9CiAgICB9CiAgXQp9\"}","lang":"opensearch_compounded_script","params":{"utcTimestamp": 0,"SOURCES":[2,0,2],"DIGESTS":["my-datasource","host","city"]}},"missing_bucket":true,"missing_order":"first","order":"asc"}}}]}}}}, requestedTotalSize=2147483647, pageSize=null, startFrom=0)]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Explain the base64-encoded script changes and verify consistency across related test files.
The base64-encoded Calcite script payload on line 9 has been updated. Please clarify:
- What changed in the script content relative to the previous version and why
- Whether all related expected output YAML files in
expectedOutput/calcite/have been updated consistently for the standardization objectives
| // OpenSearch doesn't accept big decimal value in its script. | ||
| // So try to return its double value directly as we don't really support decimal literal. | ||
| if (SqlTypeUtil.isDecimal(literal.getType()) && literal.getValue() instanceof BigDecimal) { | ||
| return ((BigDecimal) literal.getValue()).doubleValue(); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Precision loss converting BigDecimal to double.
Converting BigDecimal to double via doubleValue() can lose precision for values with more than ~15-17 significant digits. While the comment acknowledges OpenSearch's limitation and that decimal literals aren't truly supported, this could silently corrupt high-precision numeric literals.
Consider logging a warning when precision might be lost:
if (SqlTypeUtil.isDecimal(literal.getType()) && literal.getValue() instanceof BigDecimal) {
- return ((BigDecimal) literal.getValue()).doubleValue();
+ BigDecimal bd = (BigDecimal) literal.getValue();
+ // Note: Precision may be lost for values with > 15-17 significant digits
+ return bd.doubleValue();
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // OpenSearch doesn't accept big decimal value in its script. | |
| // So try to return its double value directly as we don't really support decimal literal. | |
| if (SqlTypeUtil.isDecimal(literal.getType()) && literal.getValue() instanceof BigDecimal) { | |
| return ((BigDecimal) literal.getValue()).doubleValue(); | |
| } | |
| // OpenSearch doesn't accept big decimal value in its script. | |
| // So try to return its double value directly as we don't really support decimal literal. | |
| if (SqlTypeUtil.isDecimal(literal.getType()) && literal.getValue() instanceof BigDecimal) { | |
| BigDecimal bd = (BigDecimal) literal.getValue(); | |
| // Note: Precision may be lost for values with > 15-17 significant digits | |
| return bd.doubleValue(); | |
| } |
Signed-off-by: Heng Qian <[email protected]>
Signed-off-by: Heng Qian <[email protected]>
Signed-off-by: Heng Qian <[email protected]>
# Conflicts: # integ-test/src/test/resources/expectedOutput/calcite/agg_case_cannot_push.yaml # integ-test/src/test/resources/expectedOutput/calcite/agg_case_composite_cannot_push.yaml # integ-test/src/test/resources/expectedOutput/calcite/agg_case_num_res_cannot_push.yaml # integ-test/src/test/resources/expectedOutput/calcite/clickbench/q29.yaml # integ-test/src/test/resources/expectedOutput/calcite/explain_agg_counts_by6.yaml # integ-test/src/test/resources/expectedOutput/calcite/explain_agg_script_udt_arg_push.yaml # integ-test/src/test/resources/expectedOutput/calcite/explain_agg_with_script.yaml # integ-test/src/test/resources/expectedOutput/calcite/explain_agg_with_sum_enhancement.yaml # integ-test/src/test/resources/expectedOutput/calcite/explain_patterns_simple_pattern_agg_push.yaml # integ-test/src/test/resources/expectedOutput/calcite/explain_script_push_on_text.yaml # integ-test/src/test/resources/expectedOutput/calcite/udf_geoip_in_agg_pushed.yaml
Signed-off-by: Heng Qian <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/serde/RexStandardizer.java (1)
204-210: Precision loss acknowledged with comment.The comment now documents the precision limitation when converting
BigDecimaltodouble. This addresses the concern raised in previous reviews.
🧹 Nitpick comments (2)
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/serde/RexStandardizer.java (1)
67-75: Overly broad exception handling may hide bugs.Catching
Exceptionis too broad and could mask programming errors (e.g.,NullPointerException,IllegalArgumentException). Consider catching only the specific exception(s) thatRexUtil.expandSearchcan throw when expansion isn't possible.if (call.op.kind == SqlKind.SEARCH) { try { return RexUtil.expandSearch(helper.rexBuilder, null, call).accept(this, helper); - } catch (Exception ignored) { + } catch (UnsupportedOperationException ignored) { // Continue to use the original call if failing to expand. // We can downgrade to still use `Sarg` literal instead of replacing it with parameter. } }Please verify which specific exceptions
RexUtil.expandSearchthrows when expansion fails and narrow the catch clause accordingly.opensearch/src/test/java/org/opensearch/sql/opensearch/storage/serde/RelJsonSerializerTest.java (1)
312-350: Minor: Inconsistent variable naming.The variable
rexUpperdoesn't reflect the test purpose (SEARCH expansion). Consider renaming torexSearchfor clarity.- RexNode rexUpper = + RexNode rexSearch = rexBuilder.makeCall( SqlStdOperatorTable.SEARCH, ... - String code = serializer.serialize(rexUpper, helper); + String code = serializer.serialize(rexSearch, helper);The test logic itself correctly verifies SEARCH/Sarg expansion to OR expressions with widened BIGINT types.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (15)
integ-test/src/test/resources/expectedOutput/calcite/agg_case_cannot_push.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/agg_case_composite_cannot_push.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/agg_case_num_res_cannot_push.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/clickbench/q40.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/explain_agg_counts_by6.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/explain_agg_script_udt_arg_push.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/explain_agg_with_script.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/explain_agg_with_sum_enhancement.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/explain_filter_function_script_push.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/explain_min_max_agg_on_derived_field.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/explain_patterns_simple_pattern_agg_push.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/explain_script_push_on_text.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/udf_geoip_in_agg_pushed.yaml(1 hunks)opensearch/src/main/java/org/opensearch/sql/opensearch/storage/serde/RexStandardizer.java(7 hunks)opensearch/src/test/java/org/opensearch/sql/opensearch/storage/serde/RelJsonSerializerTest.java(9 hunks)
✅ Files skipped from review due to trivial changes (1)
- integ-test/src/test/resources/expectedOutput/calcite/explain_agg_with_sum_enhancement.yaml
🚧 Files skipped from review as they are similar to previous changes (8)
- integ-test/src/test/resources/expectedOutput/calcite/explain_agg_with_script.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_agg_script_udt_arg_push.yaml
- integ-test/src/test/resources/expectedOutput/calcite/agg_case_cannot_push.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_filter_function_script_push.yaml
- integ-test/src/test/resources/expectedOutput/calcite/clickbench/q40.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_min_max_agg_on_derived_field.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_agg_counts_by6.yaml
- integ-test/src/test/resources/expectedOutput/calcite/udf_geoip_in_agg_pushed.yaml
🧰 Additional context used
📓 Path-based instructions (3)
**/*.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*.java: UsePascalCasefor class names (e.g.,QueryExecutor)
UsecamelCasefor method and variable names (e.g.,executeQuery)
UseUPPER_SNAKE_CASEfor constants (e.g.,MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
PreferOptional<T>for nullable returns in Java
Avoid unnecessary object creation in loops
UseStringBuilderfor string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code
Files:
opensearch/src/test/java/org/opensearch/sql/opensearch/storage/serde/RelJsonSerializerTest.javaopensearch/src/main/java/org/opensearch/sql/opensearch/storage/serde/RexStandardizer.java
⚙️ CodeRabbit configuration file
**/*.java: - Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
- Check for proper JavaDoc on public classes and methods
- Flag redundant comments that restate obvious code
- Ensure methods are under 20 lines with single responsibility
- Verify proper error handling with specific exception types
- Check for Optional usage instead of null returns
- Validate proper use of try-with-resources for resource management
Files:
opensearch/src/test/java/org/opensearch/sql/opensearch/storage/serde/RelJsonSerializerTest.javaopensearch/src/main/java/org/opensearch/sql/opensearch/storage/serde/RexStandardizer.java
**/*Test.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*Test.java: All new business logic requires unit tests
Name unit tests with*Test.javasuffix in OpenSearch SQL
Files:
opensearch/src/test/java/org/opensearch/sql/opensearch/storage/serde/RelJsonSerializerTest.java
**/test/**/*.java
⚙️ CodeRabbit configuration file
**/test/**/*.java: - Verify test coverage for new business logic
- Check test naming follows conventions (*Test.java for unit, *IT.java for integration)
- Ensure tests are independent and don't rely on execution order
- Validate meaningful test data that reflects real-world scenarios
- Check for proper cleanup of test resources
Files:
opensearch/src/test/java/org/opensearch/sql/opensearch/storage/serde/RelJsonSerializerTest.java
🧠 Learnings (3)
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Applied to files:
integ-test/src/test/resources/expectedOutput/calcite/explain_script_push_on_text.yamlopensearch/src/test/java/org/opensearch/sql/opensearch/storage/serde/RelJsonSerializerTest.javainteg-test/src/test/resources/expectedOutput/calcite/explain_patterns_simple_pattern_agg_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/agg_case_composite_cannot_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/agg_case_num_res_cannot_push.yaml
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*Test.java : Name unit tests with `*Test.java` suffix in OpenSearch SQL
Applied to files:
opensearch/src/test/java/org/opensearch/sql/opensearch/storage/serde/RelJsonSerializerTest.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration
Applied to files:
opensearch/src/test/java/org/opensearch/sql/opensearch/storage/serde/RelJsonSerializerTest.javaopensearch/src/main/java/org/opensearch/sql/opensearch/storage/serde/RexStandardizer.java
🧬 Code graph analysis (2)
opensearch/src/test/java/org/opensearch/sql/opensearch/storage/serde/RelJsonSerializerTest.java (1)
core/src/main/java/org/opensearch/sql/calcite/utils/OpenSearchTypeFactory.java (1)
OpenSearchTypeFactory(63-411)
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/serde/RexStandardizer.java (1)
core/src/main/java/org/opensearch/sql/calcite/utils/OpenSearchTypeFactory.java (1)
OpenSearchTypeFactory(63-411)
🔇 Additional comments (10)
integ-test/src/test/resources/expectedOutput/calcite/explain_patterns_simple_pattern_agg_push.yaml (1)
10-10: Expected output updated to reflect standardized script payload.The embedded script payload in the PushDownContext has been updated as part of RexNode standardization. The identical base64-encoded script strings in both
patterns_fieldandpattern_countlocations correctly represent the same aggregation expression. The YAML structure is valid and maintains proper indentation.Please confirm that:
- Integration tests pass with these updated expectations
- The standardized scripts in the base64 payloads represent properly canonical/standardized RexNode expressions (with literals converted to dynamic parameters, types widened, etc. per the PR objectives)
If needed, I can generate a verification script to decode and inspect the base64 payload to validate the standardization.
integ-test/src/test/resources/expectedOutput/calcite/agg_case_composite_cannot_push.yaml (1)
9-9: Verify the base64-encoded script payload reflects RexNode standardization changes.Line 9 updates an embedded base64-encoded script payload in the composite aggregation context as part of broader RexNode standardization, type widening, and literal downgrading across test expected-output YAMLs.
To confirm the payload update is correct:
- Decode the base64 script (source:
"script":"{\"langType\":\"calcite\",\"script\":\"rO0ABXQDIXsK...\") and verify it reflects the intended RexCall standardization (operand reordering, type widening for CASE expressions, literal downgrades).- Confirm query semantics remain unchanged—the aggregation still produces correct
avg_balance,age_range, andstateoutput.- Cross-reference this change with related code modifications (RexBuilder parameter additions, RexNode standardization logic) to ensure consistency.
- Verify this pattern is applied consistently across all similar expected-output updates in the test suite.
integ-test/src/test/resources/expectedOutput/calcite/explain_script_push_on_text.yaml (1)
10-10: Verify test expectations and validation for standardized Calcite script payloads.This test expectations file updates the embedded OpenSearch scripts (base64-encoded) within the PushDownContext to reflect RexNode standardization. When decoded, the filter script should show the expected parametrization pattern with dynamic parameters replacing literals and field references, along with appropriate type widening.
To confirm the changes are correct:
- Run integration tests to validate the updated expectations pass
- Verify the standardization is correctly applied for both filter and aggregation scripts
- Confirm no regressions in script behavior
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/serde/RexStandardizer.java (2)
223-252: Type widening logic is well-documented and implemented.The
widenTypemethod correctly implements the widening rules described in the JavaDoc:
- Integer types → BIGINT
- Floating-point/decimal → DOUBLE
- Character types → VARCHAR
- All results are made nullable
The conditional guard on
allowNumericTypeWidenproperly handles the null case.
254-258: Good defensive assertion.The assertion ensures the helper stack is in a clean state before starting standardization, which helps catch programming errors early.
integ-test/src/test/resources/expectedOutput/calcite/agg_case_num_res_cannot_push.yaml (1)
1-9: Test expected output updated to reflect standardization changes.The updated base64-encoded script payload correctly reflects the new type widening behavior (e.g.,
BIGINTfor comparison operands). This aligns with the standardization implementation inRexStandardizer.java.opensearch/src/test/java/org/opensearch/sql/opensearch/storage/serde/RelJsonSerializerTest.java (4)
54-56: Test schema extended with numeric field.The addition of the
Numberfield to the row type enables testing numeric type widening scenarios. This is a well-structured test setup change.
227-252: New test validates RexCall normalization behavior.The test correctly verifies that:
GREATER(Number, 10)normalizes toLESS(10, Number)(operand swap)INTEGERtypes widen toBIGINTfor comparison operations- Parameter digests reflect the swapped operand order
[10, "Number"]
254-279: New test validates character type widening.The test correctly verifies that
CHAR(2)is widened toVARCHARduring standardization, which improves script cache reuse across different character type literals.
281-310: Test validates selective type widening for non-arithmetic functions.This test correctly verifies that
SUBSTRINGdoes not widenINTEGERarguments toBIGINT, as widening is only enabled for binary arithmetic and comparison operations. This demonstrates proper scoping of theallowNumericTypeWidenflag.
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/serde/RexStandardizer.java
Show resolved
Hide resolved
| CalciteLogicalIndexScan(table=[[OpenSearch, opensearch-sql_test_index_bank]]) | ||
| physical: | | ||
| CalciteEnumerableIndexScan(table=[[OpenSearch, opensearch-sql_test_index_bank]], PushDownContext=[[AGGREGATION->rel#:LogicalAggregate.NONE.[](input=RelSubset#,group={0},avg_age=AVG($1)), PROJECT->[avg_age, age_range], LIMIT->10000], OpenSearchRequestBuilder(sourceBuilder={"from":0,"size":0,"timeout":"1m","aggregations":{"composite_buckets":{"composite":{"size":1000,"sources":[{"age_range":{"terms":{"script":{"source":"{\"langType\":\"calcite\",\"script\":\"rO0ABXQGFHsKICAib3AiOiB7CiAgICAibmFtZSI6ICJDQVNFIiwKICAgICJraW5kIjogIkNBU0UiLAogICAgInN5bnRheCI6ICJTUEVDSUFMIgogIH0sCiAgIm9wZXJhbmRzIjogWwogICAgewogICAgICAib3AiOiB7CiAgICAgICAgIm5hbWUiOiAiPCIsCiAgICAgICAgImtpbmQiOiAiTEVTU19USEFOIiwKICAgICAgICAic3ludGF4IjogIkJJTkFSWSIKICAgICAgfSwKICAgICAgIm9wZXJhbmRzIjogWwogICAgICAgIHsKICAgICAgICAgICJkeW5hbWljUGFyYW0iOiAwLAogICAgICAgICAgInR5cGUiOiB7CiAgICAgICAgICAgICJ0eXBlIjogIklOVEVHRVIiLAogICAgICAgICAgICAibnVsbGFibGUiOiB0cnVlCiAgICAgICAgICB9CiAgICAgICAgfSwKICAgICAgICB7CiAgICAgICAgICAiZHluYW1pY1BhcmFtIjogMSwKICAgICAgICAgICJ0eXBlIjogewogICAgICAgICAgICAidHlwZSI6ICJJTlRFR0VSIiwKICAgICAgICAgICAgIm51bGxhYmxlIjogZmFsc2UKICAgICAgICAgIH0KICAgICAgICB9CiAgICAgIF0KICAgIH0sCiAgICB7CiAgICAgICJkeW5hbWljUGFyYW0iOiAyLAogICAgICAidHlwZSI6IHsKICAgICAgICAidHlwZSI6ICJWQVJDSEFSIiwKICAgICAgICAibnVsbGFibGUiOiBmYWxzZSwKICAgICAgICAicHJlY2lzaW9uIjogLTEKICAgICAgfQogICAgfSwKICAgIHsKICAgICAgIm9wIjogewogICAgICAgICJuYW1lIjogIlNFQVJDSCIsCiAgICAgICAgImtpbmQiOiAiU0VBUkNIIiwKICAgICAgICAic3ludGF4IjogIklOVEVSTkFMIgogICAgICB9LAogICAgICAib3BlcmFuZHMiOiBbCiAgICAgICAgewogICAgICAgICAgImR5bmFtaWNQYXJhbSI6IDMsCiAgICAgICAgICAidHlwZSI6IHsKICAgICAgICAgICAgInR5cGUiOiAiSU5URUdFUiIsCiAgICAgICAgICAgICJudWxsYWJsZSI6IHRydWUKICAgICAgICAgIH0KICAgICAgICB9LAogICAgICAgIHsKICAgICAgICAgICJsaXRlcmFsIjogewogICAgICAgICAgICAicmFuZ2VTZXQiOiBbCiAgICAgICAgICAgICAgWwogICAgICAgICAgICAgICAgImNsb3NlZCIsCiAgICAgICAgICAgICAgICAiMzAiLAogICAgICAgICAgICAgICAgIjQwIgogICAgICAgICAgICAgIF0KICAgICAgICAgICAgXSwKICAgICAgICAgICAgIm51bGxBcyI6ICJVTktOT1dOIgogICAgICAgICAgfSwKICAgICAgICAgICJ0eXBlIjogewogICAgICAgICAgICAidHlwZSI6ICJJTlRFR0VSIiwKICAgICAgICAgICAgIm51bGxhYmxlIjogZmFsc2UKICAgICAgICAgIH0KICAgICAgICB9CiAgICAgIF0KICAgIH0sCiAgICB7CiAgICAgICJkeW5hbWljUGFyYW0iOiA0LAogICAgICAidHlwZSI6IHsKICAgICAgICAidHlwZSI6ICJWQVJDSEFSIiwKICAgICAgICAibnVsbGFibGUiOiBmYWxzZSwKICAgICAgICAicHJlY2lzaW9uIjogLTEKICAgICAgfQogICAgfSwKICAgIHsKICAgICAgImR5bmFtaWNQYXJhbSI6IDUsCiAgICAgICJ0eXBlIjogewogICAgICAgICJ0eXBlIjogIlZBUkNIQVIiLAogICAgICAgICJudWxsYWJsZSI6IGZhbHNlLAogICAgICAgICJwcmVjaXNpb24iOiAtMQogICAgICB9CiAgICB9CiAgXQp9\"}","lang":"opensearch_compounded_script","params":{"utcTimestamp": 0,"SOURCES":[0,2,2,0,2,2],"DIGESTS":["age",30,"u30","age","u40","u100"]}},"missing_bucket":true,"missing_order":"first","order":"asc"}}}]},"aggregations":{"avg_age":{"avg":{"field":"age"}}}}}}, requestedTotalSize=10000, pageSize=null, startFrom=0)]) | ||
| CalciteEnumerableIndexScan(table=[[OpenSearch, opensearch-sql_test_index_bank]], PushDownContext=[[AGGREGATION->rel#:LogicalAggregate.NONE.[](input=RelSubset#,group={0},avg_age=AVG($1)), PROJECT->[avg_age, age_range], LIMIT->10000], OpenSearchRequestBuilder(sourceBuilder={"from":0,"size":0,"timeout":"1m","aggregations":{"composite_buckets":{"composite":{"size":1000,"sources":[{"age_range":{"terms":{"script":{"source":"{\"langType\":\"calcite\",\"script\":\"rO0ABXQITHsKICAib3AiOiB7CiAgICAibmFtZSI6ICJDQVNFIiwKICAgICJraW5kIjogIkNBU0UiLAogICAgInN5bnRheCI6ICJTUEVDSUFMIgogIH0sCiAgIm9wZXJhbmRzIjogWwogICAgewogICAgICAib3AiOiB7CiAgICAgICAgIm5hbWUiOiAiPCIsCiAgICAgICAgImtpbmQiOiAiTEVTU19USEFOIiwKICAgICAgICAic3ludGF4IjogIkJJTkFSWSIKICAgICAgfSwKICAgICAgIm9wZXJhbmRzIjogWwogICAgICAgIHsKICAgICAgICAgICJkeW5hbWljUGFyYW0iOiAwLAogICAgICAgICAgInR5cGUiOiB7CiAgICAgICAgICAgICJ0eXBlIjogIkJJR0lOVCIsCiAgICAgICAgICAgICJudWxsYWJsZSI6IHRydWUKICAgICAgICAgIH0KICAgICAgICB9LAogICAgICAgIHsKICAgICAgICAgICJkeW5hbWljUGFyYW0iOiAxLAogICAgICAgICAgInR5cGUiOiB7CiAgICAgICAgICAgICJ0eXBlIjogIkJJR0lOVCIsCiAgICAgICAgICAgICJudWxsYWJsZSI6IHRydWUKICAgICAgICAgIH0KICAgICAgICB9CiAgICAgIF0KICAgIH0sCiAgICB7CiAgICAgICJkeW5hbWljUGFyYW0iOiAyLAogICAgICAidHlwZSI6IHsKICAgICAgICAidHlwZSI6ICJWQVJDSEFSIiwKICAgICAgICAibnVsbGFibGUiOiB0cnVlLAogICAgICAgICJwcmVjaXNpb24iOiAtMQogICAgICB9CiAgICB9LAogICAgewogICAgICAib3AiOiB7CiAgICAgICAgIm5hbWUiOiAiQU5EIiwKICAgICAgICAia2luZCI6ICJBTkQiLAogICAgICAgICJzeW50YXgiOiAiQklOQVJZIgogICAgICB9LAogICAgICAib3BlcmFuZHMiOiBbCiAgICAgICAgewogICAgICAgICAgIm9wIjogewogICAgICAgICAgICAibmFtZSI6ICI8PSIsCiAgICAgICAgICAgICJraW5kIjogIkxFU1NfVEhBTl9PUl9FUVVBTCIsCiAgICAgICAgICAgICJzeW50YXgiOiAiQklOQVJZIgogICAgICAgICAgfSwKICAgICAgICAgICJvcGVyYW5kcyI6IFsKICAgICAgICAgICAgewogICAgICAgICAgICAgICJkeW5hbWljUGFyYW0iOiAzLAogICAgICAgICAgICAgICJ0eXBlIjogewogICAgICAgICAgICAgICAgInR5cGUiOiAiQklHSU5UIiwKICAgICAgICAgICAgICAgICJudWxsYWJsZSI6IHRydWUKICAgICAgICAgICAgICB9CiAgICAgICAgICAgIH0sCiAgICAgICAgICAgIHsKICAgICAgICAgICAgICAiZHluYW1pY1BhcmFtIjogNCwKICAgICAgICAgICAgICAidHlwZSI6IHsKICAgICAgICAgICAgICAgICJ0eXBlIjogIkJJR0lOVCIsCiAgICAgICAgICAgICAgICAibnVsbGFibGUiOiB0cnVlCiAgICAgICAgICAgICAgfQogICAgICAgICAgICB9CiAgICAgICAgICBdCiAgICAgICAgfSwKICAgICAgICB7CiAgICAgICAgICAib3AiOiB7CiAgICAgICAgICAgICJuYW1lIjogIjw9IiwKICAgICAgICAgICAgImtpbmQiOiAiTEVTU19USEFOX09SX0VRVUFMIiwKICAgICAgICAgICAgInN5bnRheCI6ICJCSU5BUlkiCiAgICAgICAgICB9LAogICAgICAgICAgIm9wZXJhbmRzIjogWwogICAgICAgICAgICB7CiAgICAgICAgICAgICAgImR5bmFtaWNQYXJhbSI6IDUsCiAgICAgICAgICAgICAgInR5cGUiOiB7CiAgICAgICAgICAgICAgICAidHlwZSI6ICJCSUdJTlQiLAogICAgICAgICAgICAgICAgIm51bGxhYmxlIjogdHJ1ZQogICAgICAgICAgICAgIH0KICAgICAgICAgICAgfSwKICAgICAgICAgICAgewogICAgICAgICAgICAgICJkeW5hbWljUGFyYW0iOiA2LAogICAgICAgICAgICAgICJ0eXBlIjogewogICAgICAgICAgICAgICAgInR5cGUiOiAiQklHSU5UIiwKICAgICAgICAgICAgICAgICJudWxsYWJsZSI6IHRydWUKICAgICAgICAgICAgICB9CiAgICAgICAgICAgIH0KICAgICAgICAgIF0KICAgICAgICB9CiAgICAgIF0KICAgIH0sCiAgICB7CiAgICAgICJkeW5hbWljUGFyYW0iOiA3LAogICAgICAidHlwZSI6IHsKICAgICAgICAidHlwZSI6ICJWQVJDSEFSIiwKICAgICAgICAibnVsbGFibGUiOiB0cnVlLAogICAgICAgICJwcmVjaXNpb24iOiAtMQogICAgICB9CiAgICB9LAogICAgewogICAgICAiZHluYW1pY1BhcmFtIjogOCwKICAgICAgInR5cGUiOiB7CiAgICAgICAgInR5cGUiOiAiVkFSQ0hBUiIsCiAgICAgICAgIm51bGxhYmxlIjogdHJ1ZSwKICAgICAgICAicHJlY2lzaW9uIjogLTEKICAgICAgfQogICAgfQogIF0KfQ==\"}","lang":"opensearch_compounded_script","params":{"utcTimestamp": 0,"SOURCES":[0,2,2,2,0,0,2,2,2],"DIGESTS":["age",30,"u30",30,"age","age",40,"u40","u100"]}},"missing_bucket":true,"missing_order":"first","order":"asc"}}}]},"aggregations":{"avg_age":{"avg":{"field":"age"}}}}}}, requestedTotalSize=10000, pageSize=null, startFrom=0)]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a IT to verify the generated script with skip encoding via format=extended.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have test testSkipScriptEncodingOnExtendedFormat. Do you mean set format=extended for agg_case_cannot_push?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added in test case testRexStandardizationForScript
The script is formatted below, with several changes:
>=-><=- SEARCH -> AND(...)
- INT nullable=false -> BIGINT nullable=true
- CHAR[xxx] nullable=false -> VARCHAR nullable=true
{
"op": {
"name": "CASE",
"kind": "CASE",
"syntax": "SPECIAL"
},
"operands": [
{
"op": {
"name": "<",
"kind": "LESS_THAN",
"syntax": "BINARY"
},
"operands": [
{
"dynamicParam": 0,
"type": {
"type": "BIGINT",
"nullable": true
}
},
{
"dynamicParam": 1,
"type": {
"type": "BIGINT",
"nullable": true
}
}
]
},
{
"dynamicParam": 2,
"type": {
"type": "VARCHAR",
"nullable": true,
"precision": -1
}
},
{
"op": {
"name": "AND",
"kind": "AND",
"syntax": "BINARY"
},
"operands": [
{
"op": {
"name": "<=",
"kind": "LESS_THAN_OR_EQUAL",
"syntax": "BINARY"
},
"operands": [
{
"dynamicParam": 3,
"type": {
"type": "BIGINT",
"nullable": true
}
},
{
"dynamicParam": 4,
"type": {
"type": "BIGINT",
"nullable": true
}
}
]
},
{
"op": {
"name": "<=",
"kind": "LESS_THAN_OR_EQUAL",
"syntax": "BINARY"
},
"operands": [
{
"dynamicParam": 5,
"type": {
"type": "BIGINT",
"nullable": true
}
},
{
"dynamicParam": 6,
"type": {
"type": "BIGINT",
"nullable": true
}
}
]
}
]
},
{
"dynamicParam": 7,
"type": {
"type": "VARCHAR",
"nullable": true,
"precision": -1
}
},
{
"dynamicParam": 8,
"type": {
"type": "VARCHAR",
"nullable": true,
"precision": -1
}
}
]
}
Signed-off-by: Heng Qian <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/explain_extended_for_standardization.json(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- integ-test/src/test/resources/expectedOutput/calcite/explain_extended_for_standardization.json
🧰 Additional context used
📓 Path-based instructions (5)
**/*.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*.java: UsePascalCasefor class names (e.g.,QueryExecutor)
UsecamelCasefor method and variable names (e.g.,executeQuery)
UseUPPER_SNAKE_CASEfor constants (e.g.,MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
PreferOptional<T>for nullable returns in Java
Avoid unnecessary object creation in loops
UseStringBuilderfor string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
⚙️ CodeRabbit configuration file
**/*.java: - Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
- Check for proper JavaDoc on public classes and methods
- Flag redundant comments that restate obvious code
- Ensure methods are under 20 lines with single responsibility
- Verify proper error handling with specific exception types
- Check for Optional usage instead of null returns
- Validate proper use of try-with-resources for resource management
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
integ-test/**/*IT.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
End-to-end scenarios need integration tests in
integ-test/module
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
⚙️ CodeRabbit configuration file
integ-test/**/*IT.java: - Verify integration tests are in correct module (integ-test/)
- Check tests can be run with ./gradlew :integ-test:integTest
- Ensure proper test data setup and teardown
- Validate end-to-end scenario coverage
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
**/*IT.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
Name integration tests with
*IT.javasuffix in OpenSearch SQL
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
**/test/**/*.java
⚙️ CodeRabbit configuration file
**/test/**/*.java: - Verify test coverage for new business logic
- Check test naming follows conventions (*Test.java for unit, *IT.java for integration)
- Ensure tests are independent and don't rely on execution order
- Validate meaningful test data that reflects real-world scenarios
- Check for proper cleanup of test resources
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
**/calcite/**/*.java
⚙️ CodeRabbit configuration file
**/calcite/**/*.java: - Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
- Verify SQL generation and optimization paths
- Document any Calcite-specific workarounds
- Test compatibility with Calcite version constraints
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
- GitHub Check: security-it-linux (25)
- GitHub Check: security-it-linux (21)
- GitHub Check: build-linux (25, unit)
- GitHub Check: bwc-tests-full-restart (21)
- GitHub Check: build-linux (21, unit)
- GitHub Check: build-linux (25, doc)
- GitHub Check: build-linux (21, doc)
- GitHub Check: build-linux (21, integration)
- GitHub Check: bwc-tests-rolling-upgrade (25)
- GitHub Check: build-linux (25, integration)
- GitHub Check: bwc-tests-rolling-upgrade (21)
- GitHub Check: bwc-tests-full-restart (25)
- GitHub Check: security-it-windows-macos (macos-14, 21)
- GitHub Check: security-it-windows-macos (windows-latest, 21)
- GitHub Check: security-it-windows-macos (macos-14, 25)
- GitHub Check: security-it-windows-macos (windows-latest, 25)
- GitHub Check: build-windows-macos (macos-14, 21, unit)
- GitHub Check: build-windows-macos (macos-14, 25, unit)
- GitHub Check: build-windows-macos (macos-14, 21, integration)
- GitHub Check: build-windows-macos (macos-14, 25, integration)
- GitHub Check: build-windows-macos (macos-14, 25, doc)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (macos-14, 21, doc)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
- GitHub Check: test-sql-cli-integration (21)
- GitHub Check: CodeQL-Scan (java)
| @Test | ||
| public void testRexStandardizationForScript() throws IOException { | ||
| enabledOnlyWhenPushdownIsEnabled(); | ||
| assertYamlEqualsIgnoreId( | ||
| loadExpectedPlan("explain_extended_for_standardization.json"), | ||
| explainQueryToString( | ||
| String.format( | ||
| "source=%s | eval age_range = case(age < 30, 'u30', age >= 30 and age <= 40, 'u40'" | ||
| + " else 'u100') | stats avg(age) as avg_age by age_range", | ||
| TEST_INDEX_BANK), | ||
| true)); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use JSON assertion helper for extended explain output
This test loads a *.json expected plan and uses explainQueryToString(..., true), which is JSON in all other tests, but asserts via assertYamlEqualsIgnoreId. For consistency with the rest of this class and to avoid format confusion, this should use the JSON matcher:
- assertYamlEqualsIgnoreId(
- loadExpectedPlan("explain_extended_for_standardization.json"),
- explainQueryToString(
+ assertJsonEqualsIgnoreId(
+ loadExpectedPlan("explain_extended_for_standardization.json"),
+ explainQueryToString(
String.format(
"source=%s | eval age_range = case(age < 30, 'u30', age >= 30 and age <= 40, 'u40'"
+ " else 'u100') | stats avg(age) as avg_age by age_range",
TEST_INDEX_BANK),
true));This keeps the new Rex standardization coverage while aligning with existing extended explain tests. Based on learnings, this strengthens Calcite SQL/script generation verification.
🤖 Prompt for AI Agents
In
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
around lines 2083 to 2094, the test constructs an extended JSON explain
(explainQueryToString(..., true)) but asserts with assertYamlEqualsIgnoreId;
replace the YAML assertion with the JSON assertion helper (use the existing
assertJsonEqualsIgnoreId or the project's JSON matcher) so the loaded expected
JSON plan is compared as JSON to the explain output, keeping the same expected
file and parameters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should avoid reusing FORMAT parameter for EXPLAIN OPTION and add a new parameter for it. EXPLAIN OPTION affects the content of this API while FORMAT only affects the format like JSON, CSV, YAML.
@coderabbitai Could you create an issue to track this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for explainQueryToString method definition and usage to understand FORMAT parameter
rg -n "explainQueryToString" --type java -C 3
Length of output: 54012
🏁 Script executed:
#!/bin/bash
# Search for FORMAT and EXPLAIN related code in the codebase
rg -n "FORMAT.*EXPLAIN|EXPLAIN.*FORMAT" --type java -C 2
Length of output: 5954
🏁 Script executed:
#!/bin/bash
# Look for the explain query implementation to understand current parameter usage
ast-grep --pattern 'explainQueryToString($$$) {
$$$
}'
Length of output: 0
🏁 Script executed:
#!/bin/bash
# Find explain-related API endpoints and format handling
rg -n "explain.*format|format.*explain" --type java -i -C 2
Length of output: 53486
🏁 Script executed:
#!/bin/bash
# Create GitHub issue to track separation of FORMAT from EXPLAIN OPTION
gh issue create \
--repo opensearch-project/sql \
--title "Separate FORMAT parameter from EXPLAIN verbosity option in explain API" \
--body "## Description
The \`format\` query parameter in the explain API is currently overloaded to serve two distinct purposes:
1. **Output format control** (JSON, CSV, YAML) - controls how the response is serialized
2. **Explain verbosity level** (SIMPLE, STANDARD, EXTENDED, COST) - controls the content/detail level of the explain output
This violates the single responsibility principle and creates confusion, as FORMAT should only control the serialization format of the response, not the semantic content.
## Proposal
Introduce a new query parameter (e.g., \`explain_level\`, \`explain_option\`, or \`verbosity\`) specifically for controlling the explain verbosity, while keeping \`format\` strictly for output serialization format.
### Current behavior:
\`\`\`
/_plugins/_ppl/_explain?format=extended // controls BOTH verbosity AND output format
\`\`\`
### Proposed behavior:
\`\`\`
/_plugins/_ppl/_explain?format=json&explain_level=extended // separate concerns
/_plugins/_ppl/_explain?format=yaml&explain_level=standard
\`\`\`
## Context
This issue was identified during code review of PR #4914 (RexCall and RelDataType standardization for script push down).
**Related discussion:** https://github.com/opensearch-project/sql/pull/4914#discussion_r2608963825
## Benefits
1. **Clear separation of concerns:** FORMAT controls serialization, explain option controls content
2. **Better API design:** Each parameter has a single, well-defined responsibility
3. **Improved extensibility:** Easier to add new output formats or explain levels independently
4. **Reduced confusion:** Users understand what each parameter does
## Implementation considerations
- Maintain backward compatibility by supporting the old format parameter behavior
- Add deprecation warnings for the old usage pattern
- Update documentation to reflect the new parameter
- Update integration tests
**Reported by:** @qianheng-aws" \
--label "enhancement" \
--assignee "qianheng-aws"
Length of output: 53
@qianheng-aws Done! I've created issue #4940 to track the separation of the FORMAT parameter from the EXPLAIN verbosity option.
The issue proposes introducing a new query parameter (e.g., explain_level or verbosity) to control the explain detail level (SIMPLE, STANDARD, EXTENDED, COST), while keeping format strictly for output serialization (JSON, YAML, CSV). This will provide a cleaner separation of concerns where:
formatcontrols how the response is serialized- The new parameter controls what content is included in the explain output
The issue includes context from this PR and your comment, and is assigned to you with the enhancement label.
🧬 Code Graph Analysis Results
<no_relevant_code_snippets>
🧠 Learnings used
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration
Signed-off-by: Heng Qian <[email protected]>
| return call.clone(call.getType(), clonedOperands); | ||
| } else { | ||
| return call; | ||
| // Try to expand `SEARCH` before standardization since we cannot translate literal `Sarg`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I encountered a similar problem -- Sarg can be serialized but can not be properly deserialized. I addressed it by replacing Sarg sarg = sargFromJson((Map) literal) with Sarg sarg = sargFromJson((Map) literal, type) in ExtendedRelJson.java. Maybe it helps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem I met here is I cannot translate a Sarg literal into a java object by using RexToLixTranslator.translateLiteral. Thus, we cannot generate a parameter for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the implementation of RexToLixTranslator , it seems Sarg can only be used as literal in RexNode expression but not DynamicParams.
|
The backport to To backport manually, run these commands in your terminal: # Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/sql/backport-2.19-dev 2.19-dev
# Navigate to the new working tree
pushd ../.worktrees/sql/backport-2.19-dev
# Create a new branch
git switch --create backport/backport-4914-to-2.19-dev
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 bcfcd002d5ec402f257a92f9689097d1c9bf8979
# Push it to GitHub
git push --set-upstream origin backport/backport-4914-to-2.19-dev
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/sql/backport-2.19-devThen, create a pull request where the |
…rch-project#4914) * RexCall & SqlType standardization Signed-off-by: Heng Qian <[email protected]> * Change script in IT Signed-off-by: Heng Qian <[email protected]> * Fix UT Signed-off-by: Heng Qian <[email protected]> * Fix IT Signed-off-by: Heng Qian <[email protected]> * Add UT Signed-off-by: Heng Qian <[email protected]> * Fix IT after merging main Signed-off-by: Heng Qian <[email protected]> * Address comments Signed-off-by: Heng Qian <[email protected]> * Address comments Signed-off-by: Heng Qian <[email protected]> --------- Signed-off-by: Heng Qian <[email protected]> (cherry picked from commit bcfcd00) Signed-off-by: Heng Qian <[email protected]>
…4951) * RexCall & SqlType standardization * Change script in IT * Fix UT * Fix IT * Add UT * Fix IT after merging main * Address comments * Address comments --------- (cherry picked from commit bcfcd00) Signed-off-by: Heng Qian <[email protected]>
Description
This PR continues #4795 work, it includes change:
RexNormalize.normalizeRexStandardizer::widenTypeSEARCHto conjunction expressions.Related Issues
Resolves #4757
Check List
--signoffor-s.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.