-
Notifications
You must be signed in to change notification settings - Fork 180
Mvexpand feature #4675
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mvexpand feature #4675
Conversation
|
Attaching the manual test cases and its output |
|
Hi maintainers @penghuo , the "Enforce PR labels" check is failing because I can't add labels. Could you please add the required labels to this PR so the checks pass? Thank you! |
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
Outdated
Show resolved
Hide resolved
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
Outdated
Show resolved
Hide resolved
| super.init(); | ||
| enableCalcite(); | ||
| deleteIndexIfExists(INDEX); | ||
| createIndex( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we rather use Index.MVEXPAND_EDGE_CASES? (Is it duplicate?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks it is not fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed to - Index.MVEXPAND_EDGE_CASES.getName()
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
Outdated
Show resolved
Hide resolved
|
Please check CI failure. |
@dai-chen The CI failures are not related to mvexpand. I merged the upstream changes to. my main. CI failure is pointing to - My local run shows those tests pass. But, CI failed. That implies either: |
| } | ||
|
|
||
| public static UnresolvedPlan mvexpand(UnresolvedPlan input, Field field, Integer limit) { | ||
| // attach the incoming child plan so the AST contains the pipeline link |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: unneeded comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is still there.
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
Outdated
Show resolved
Hide resolved
core/src/test/java/org/opensearch/sql/calcite/CalciteRelNodeVisitorExpandTest.java
Outdated
Show resolved
Hide resolved
core/src/test/java/org/opensearch/sql/calcite/CalciteRelNodeVisitorExpandTest.java
Outdated
Show resolved
Hide resolved
| bulkInsert( | ||
| INDEX, | ||
| "{\"username\":\"happy\",\"skills\":[{\"name\":\"python\"},{\"name\":\"java\"},{\"name\":\"sql\"}]}", | ||
| "{\"username\":\"single\",\"skills\":[{\"name\":\"go\"}]}", | ||
| "{\"username\":\"empty\",\"skills\":[]}", | ||
| "{\"username\":\"nullskills\",\"skills\":null}", | ||
| "{\"username\":\"noskills\"}", | ||
| "{\"username\":\"missingattr\",\"skills\":[{\"name\":\"c\"},{\"level\":\"advanced\"}]}", | ||
| "{\"username\":\"complex\",\"skills\":[{\"name\":\"ml\",\"level\":\"expert\"},{\"name\":\"ai\"},{\"level\":\"novice\"}]}", | ||
| "{\"username\":\"duplicate\",\"skills\":[{\"name\":\"dup\"},{\"name\":\"dup\"}]}", | ||
| "{\"username\":\"large\",\"skills\":[{\"name\":\"s1\"},{\"name\":\"s2\"},{\"name\":\"s3\"},{\"name\":\"s4\"},{\"name\":\"s5\"},{\"name\":\"s6\"},{\"name\":\"s7\"},{\"name\":\"s8\"},{\"name\":\"s9\"},{\"name\":\"s10\"}]}"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it addressed?
I thinks we should add test case for happy at least.
| super.init(); | ||
| enableCalcite(); | ||
| deleteIndexIfExists(INDEX); | ||
| createIndex( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks it is not fixed.
dai-chen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found this in expand command code: https://github.com/opensearch-project/sql/blob/main/integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExpandCommandIT.java#L141.
Could you clarify why not make mvexpand an alias of expand command with enhancements? If we can do this, I think many new changes can be avoided.
Can you try rebasing to current main and make sure your local jdk version is the same as CI? Currently CI is failing |
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
Outdated
Show resolved
Hide resolved
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
Outdated
Show resolved
Hide resolved
@dai-chen Problems in alias attempt: column indices (RexInputRef) drift when reusing expand logic, causing all projected rows to read from the same column. In short, simply delegating to expand bypassed the per-element extraction logic, so the projection kept resolving to the same value on every row. Refactor: both EXPAND and MVEXPAND delegate to buildExpandCore(...), which implements the // shared correlate + uncollect + projection logic. visitExpand applies EXPAND semantics // (remove original array field, optional alias rename). visitMvExpand handles its two special // cases — missing field (returns an empty VALUES row with a nullable placeholder) and // limit= — then delegates to the shared helper. |
ykmr1224
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't reviewed the logics in CalciteRelNodeVisitor since those are too long to read and not organized. Please refactor it first.
Also, I suppose the code changes are mostly directly from coding agent. Please review by yourself first, and reflect our earlier comments as well.
| } | ||
|
|
||
| public static UnresolvedPlan mvexpand(UnresolvedPlan input, Field field, Integer limit) { | ||
| // attach the incoming child plan so the AST contains the pipeline link |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is still there.
| // Allow using INTERNAL_ITEM when the element type is unknown/undefined at planning time. | ||
| // Some datasets (or Calcite's type inference) may give the element an UNDEFINED type. | ||
| // Accept a "ignore" first-argument family so INTERNAL_ITEM(elem, 'key') can still be planned | ||
| // and resolved at runtime (fallback semantics handled at execution side). - Used in MVEXPAND | ||
| registerOperator( | ||
| INTERNAL_ITEM, | ||
| SqlStdOperatorTable.ITEM, | ||
| PPLTypeChecker.family(SqlTypeFamily.IGNORE, SqlTypeFamily.CHARACTER)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to define separate one from above one? (can we add or to above definition?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried. But, Calcite’s OR operator requires both operands to be composite checkers.
The fallback rule I added (IGNORE, CHARACTER) is still a SqlSingleOperandTypeChecker, so Calcite rejects the merge.
| /** | ||
| * Expand command visitor to handle array field expansion. 1. Unnest 2. Join with the original | ||
| * table to get all fields | ||
| * Portions of CalciteRelNodeVisitor related to EXPAND / MVEXPAND. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks there are very long code for expand/mvexpand. Please extract to a class instead of using code section.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minimized it to use the same - buildExpandRelNode
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
Outdated
Show resolved
Hide resolved
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
Outdated
Show resolved
Hide resolved
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
Outdated
Show resolved
Hide resolved
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
Outdated
Show resolved
Hide resolved
| MVEXPAND_EDGE_CASES( | ||
| "mvexpand_edge_cases", | ||
| "mvexpand_edge_cases", | ||
| getMappingFile("mvexpand_edge_cases_mapping.json"), | ||
| "src/test/resources/mvexpand_edge_cases.json"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect this (and json) is really needed. It seems used only in CalciteExplainIT, but the data does matter for ExplainIT.
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLAbstractTest.java
Outdated
Show resolved
Hide resolved
bcbc23f to
2c0ea2c
Compare
📝 WalkthroughSummary by CodeRabbitRelease Notes
✏️ Tip: You can customize this high-level summary in your review settings. WalkthroughAdds native PPL mvexpand: lexer/parser/AST node, visitor and analyzer hooks, Calcite planning with array validation and optional per-document limit, DSL/anonymizer updates, operator registration tweak, tests/fixtures, and documentation. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant Lexer
participant Parser
participant AstBuilder
participant Analyzer
participant CalcitePlanner
participant Executor
User->>Lexer: "source=t | mvexpand field [limit=N]"
Lexer->>Parser: tokens (includes MVEXPAND)
Parser->>AstBuilder: parse mvexpandCommand
AstBuilder->>AstBuilder: build MvExpand(field, limit)
AstBuilder->>Analyzer: emit MvExpand node
Analyzer->>CalcitePlanner: route node for Calcite planning
CalcitePlanner->>CalcitePlanner: resolve field name & type
alt missing or non-array field
CalcitePlanner->>CalcitePlanner: produce empty projection or semantic error
else array field present
CalcitePlanner->>CalcitePlanner: create Expand RelNode, apply optional per-doc limit
end
CalcitePlanner->>Executor: return finalized RelNode / plan
Executor->>User: execute -> rows (one per array element)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes
Suggested reviewers
Pre-merge checks and finishing touches❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (8)
ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java (1)
93-93: Mvexpand parser wiring is correct; consider usinginternalVisitExpressionfor consistencyThe new
visitMvexpandCommandcorrectly buildsMvExpand(field, limit)from the grammar. For stylistic consistency with nearby visitors (e.g.,visitExpandCommand), you could useinternalVisitExpression(ctx.fieldExpression())instead of callingexpressionBuilder.visitdirectly, but functionally they are equivalent.Also applies to: 870-876
ppl/src/main/antlr/OpenSearchPPLParser.g4 (1)
81-82: mvexpand grammar looks good; consider usingintegerLiteralfor limit for consistencyThe new
mvexpandCommandis wired correctly intocommandsandcommandName, andLIMIT EQUAL ...matches other named-arg patterns.For consistency with options like
timechart/chart/binthat useLIMIT EQUAL integerLiteral, consider changing:mvexpandCommand : MVEXPAND fieldExpression (LIMIT EQUAL INTEGER_LITERAL)? ;to:
mvexpandCommand : MVEXPAND fieldExpression (LIMIT EQUAL integerLiteral)? ;so all numeric options share the same literal rule and future validation is uniform.
Also applies to: 120-121, 532-534
docs/user/ppl/index.rst (1)
51-109: Exposemvexpandin the commands table as wellThe new bullets (including
mvexpand command) are useful entry points, butmvexpandis not listed in the big commands table below, which summarizes version/status/descriptions.For consistency and discoverability, consider adding a row to that table for
mvexpand(with version introduced, status, and a short description) alongsideexpand/flatten.ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java (1)
82-83: Mvexpand anonymization behavior is correct; consider reusingvisitExpressionfor field maskingThe new
visitMvExpandcorrectly:
- delegates to the child plan, and
- emits
| mvexpand identifierwith optionallimit=***, matching the new tests.To stay consistent with
visitExpand/visitFlatten, you might prefer:@Override public String visitMvExpand(MvExpand node, String context) { String child = node.getChild().get(0).accept(this, context); String field = visitExpression(node.getField()); if (node.getLimit() != null) { return StringUtils.format("%s | mvexpand %s limit=%s", child, field, MASK_LITERAL); } return StringUtils.format("%s | mvexpand %s", child, field); }This keeps any special handling in
maskField/expressionAnalyzer(e.g., meta fields) consistent across commands while preserving the current output.Also applies to: 655-664
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java (1)
117-163: Smoke tests validate planning succeeds but lack output assertions.These tests (
testMvExpandWithLimitthroughtestMvExpandPrimitiveArray) only verify thatgetRelNode(ppl)succeeds without checking the resulting plan. While they confirm no planning exceptions occur, consider adding at least basicverifyLogicalassertions for one or two key edge cases (e.g.,testMvExpandWithLimit) to ensure the generated plan includes expected operators.Example enhancement for
testMvExpandWithLimit:@Test public void testMvExpandWithLimit() { String ppl = "source=DEPT | mvexpand EMPNOS | head 1"; RelNode root = getRelNode(ppl); String expectedLogical = "LogicalSort(fetch=[1])\n" + " LogicalProject(DEPTNO=[$0], EMPNOS=[$2])\n" + " LogicalCorrelate(...)\n"; verifyLogical(root, expectedLogical); }integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java (3)
61-70: Consider logging suppressed exceptions for debugging.The
cleanupAfterEachmethod silently ignores all exceptions. While this is acceptable for cleanup, logging at debug level could help diagnose test failures:@AfterEach public void cleanupAfterEach() throws Exception { try { deleteIndexIfExists(INDEX + "_not_array"); deleteIndexIfExists(INDEX + "_missing_field"); deleteIndexIfExists(INDEX + "_limit_test"); deleteIndexIfExists(INDEX + "_int_field"); } catch (Exception ignored) { + // Cleanup failures are expected if indices don't exist } }
227-251: Limit test validates row count but doesn't confirm per-document semantics.This test uses a single user document, so it cannot distinguish between per-document limiting (3 elements from this document) vs. global limiting (3 rows total). Consider adding a multi-document scenario to clarify the expected behavior:
@Test public void testMvexpandLimitWithMultipleDocuments() throws Exception { // Insert two users each with 5 skills bulkInsert(idx, "{\"username\":\"user1\",\"skills\":[{\"name\":\"a\"},{\"name\":\"b\"},{\"name\":\"c\"},{\"name\":\"d\"},{\"name\":\"e\"}]}", "{\"username\":\"user2\",\"skills\":[{\"name\":\"f\"},{\"name\":\"g\"},{\"name\":\"h\"},{\"name\":\"i\"},{\"name\":\"j\"}]}"); // With limit=3: // - If per-document: expect 6 rows (3 per user) // - If global: expect 3 rows total // Clarify expected behavior with assertion }
340-360: Quote_idvalue in bulk request JSON for correctness.The
_idfield in OpenSearch bulk requests should be a JSON string. While OpenSearch may auto-convert numeric values, explicit quoting is more correct:- bulk.append("{\"index\":{\"_id\":").append(id).append("}}\n"); + bulk.append("{\"index\":{\"_id\":\"").append(id).append("\"}}\n");This ensures proper JSON formatting regardless of whether the id is numeric or string-based.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (27)
core/src/main/java/org/opensearch/sql/analysis/Analyzer.java(2 hunks)core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java(2 hunks)core/src/main/java/org/opensearch/sql/ast/dsl/AstDSL.java(2 hunks)core/src/main/java/org/opensearch/sql/ast/tree/MvExpand.java(1 hunks)core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java(2 hunks)core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java(1 hunks)docs/category.json(1 hunks)docs/user/dql/metadata.rst(3 hunks)docs/user/ppl/cmd/mvexpand.rst(1 hunks)docs/user/ppl/index.rst(1 hunks)doctest/test_data/mvexpand_logs.json(1 hunks)doctest/test_docs.py(1 hunks)doctest/test_mapping/mvexpand_logs.json(1 hunks)integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java(1 hunks)integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java(2 hunks)integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java(1 hunks)integ-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.java(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/explain_mvexpand.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite_no_pushdown/explain_mvexpand.yaml(1 hunks)integ-test/src/test/resources/mvexpand_edge_cases.json(1 hunks)integ-test/src/test/resources/mvexpand_edge_cases_mapping.json(1 hunks)ppl/src/main/antlr/OpenSearchPPLLexer.g4(1 hunks)ppl/src/main/antlr/OpenSearchPPLParser.g4(3 hunks)ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java(2 hunks)ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java(2 hunks)ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java(1 hunks)ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java(2 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java (1)
legacy/src/main/java/org/opensearch/sql/legacy/utils/StringUtils.java (1)
StringUtils(17-117)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (1)
integ-test/src/test/java/org/opensearch/sql/sql/IdentifierIT.java (1)
Index(218-262)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java (2)
integ-test/src/test/java/org/opensearch/sql/ppl/PPLIntegTestCase.java (1)
PPLIntegTestCase(36-409)integ-test/src/test/java/org/opensearch/sql/sql/IdentifierIT.java (1)
Index(218-262)
🪛 Biome (2.1.2)
integ-test/src/test/resources/mvexpand_edge_cases.json
[error] 1-2: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 2-3: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 3-4: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 4-5: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 5-6: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 6-7: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 7-8: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 8-9: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 9-10: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 10-11: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 11-12: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 12-13: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 13-14: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 14-15: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 15-16: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 16-17: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 17-18: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
doctest/test_data/mvexpand_logs.json
[error] 1-2: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 2-3: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 3-4: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 4-5: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 5-6: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
🔇 Additional comments (33)
core/src/main/java/org/opensearch/sql/ast/dsl/AstDSL.java (1)
65-66: Mvexpand DSL helper is consistent with other unary plan buildersThe new
mvexpandbuilder cleanly mirrorsexpand(attach-to-input, carry field and nullable limit) and fits the existing AST DSL patterns.Also applies to: 140-142
integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java (1)
35-35: AddingCalciteMvExpandCommandITto the no-pushdown suite makes senseIncluding mvexpand in this suite keeps coverage consistent with other Calcite remote commands under the non-pushdown configuration.
doctest/test_mapping/mvexpand_logs.json (1)
1-24: Mvexpand doctest mapping looks appropriateThe nested
tags.valueandids.valueshapes are suitable for exercising mvexpand on multi-valued/nested fields, and the rest of the mapping is straightforward.core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java (1)
1072-1079: Confirm scope of the broadINTERNAL_ITEM(IGNORE, CHARACTER)overloadThis overload is a reasonable fallback for mvexpand when the container element type is
UNDEFINED, but it effectively accepts any first-argument type as long as the second is in the CHARACTER family. That’s fine ifINTERNAL_ITEMtruly remains internal-only and all call sites still guarantee an array/map-like first argument, but it would allow nonsensical combinations to pass type checking if it were ever user-exposed or reused elsewhere.Please double-check:
- That
INTERNAL_ITEMis not reachable as a normal PPL function, and- That runtime handling still validates the container type and fails fast for clearly invalid inputs.
If either assumption doesn’t hold, it may be worth tightening this overload further or guarding its use where mvexpand builds the calls.
core/src/main/java/org/opensearch/sql/ast/tree/MvExpand.java (1)
1-46: MvExpand AST node design matches existing unary plan patternsImmutable
field/limit, covariantattach,getChild()and visitor wiring all look consistent with otherUnresolvedPlannodes.integ-test/src/test/resources/mvexpand_edge_cases_mapping.json (1)
1-8: Edge-case mvexpand mapping is straightforward and suitable
usernameaskeywordplusskillsasnestedgives a clean shape for exercising mvexpand against nested/multivalue fields in integration tests.integ-test/src/test/resources/expectedOutput/calcite/explain_mvexpand.yaml (1)
1-7: Explain expected output for mvexpand fits existing Calcite fixture styleThe logical and physical plan snippets follow the existing explain-output conventions and surface the mvexpand projection via
VALUE, which should keep the explain tests stable.integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (2)
45-45: LGTM: Test index loading.The MVEXPAND_EDGE_CASES index is properly registered in the test initialization, consistent with other test indices.
319-327: The review comment references code that does not exist in the repository.The test method
testMvexpandExplain()at lines 319-327 is not present inCalciteExplainIT.java. Additionally, no mvexpand-related test data files (mvexpand_edge_cases.json, explain_mvexpand.yaml) or any references to "mvexpand" exist anywhere in the codebase. The review comment appears to be referencing code from a different version, branch, or context that is not part of the current PR.Likely an incorrect or invalid review comment.
ppl/src/main/antlr/OpenSearchPPLLexer.g4 (1)
53-53: LGTM: Lexer token addition.The MVEXPAND token is correctly defined following standard ANTLR patterns and is appropriately positioned in the COMMAND KEYWORDS section.
integ-test/src/test/resources/mvexpand_edge_cases.json (1)
1-18: LGTM: Comprehensive edge case test data.The test data covers essential mvexpand scenarios including empty arrays, null values, missing fields, and large arrays. The format follows the standard OpenSearch bulk indexing format (alternating index metadata and data lines).
Note: The static analysis errors from Biome are false positives—this file uses the correct bulk format for OpenSearch indexing, not standard JSON array format.
integ-test/src/test/resources/expectedOutput/calcite_no_pushdown/explain_mvexpand.yaml (1)
1-7: Expected output structure looks correct, but verify alignment with test query.The explain plan structure is well-formed with appropriate logical and physical sections. However, note that Line 4 references
VALUE=[$10]which aligns with the test query usingmvexpand VALUE. This should be verified along with the CalciteExplainIT test to ensure the field name is correct or intentionally testing missing-field behavior.core/src/main/java/org/opensearch/sql/analysis/Analyzer.java (2)
81-81: LGTM: Import addition.
705-708: LGTM: Calcite-only visitor implementation.The visitMvExpand method correctly delegates to getOnlyForCalciteException, consistent with other Calcite-only commands like Bin, Expand, and Flatten. This ensures mvexpand is only processed in the Calcite execution path.
doctest/test_docs.py (1)
50-51: LGTM: Test data registration.The mvexpand_logs test data is correctly registered in the TEST_DATA dictionary, enabling doctest support for the mvexpand documentation examples.
docs/user/ppl/cmd/mvexpand.rst (3)
1-24: LGTM: Clear documentation structure.The introduction, description, and syntax sections are well-written and provide clear guidance on the mvexpand command. The syntax specification correctly identifies the required field parameter and optional limit parameter.
25-29: Good approach: Doctest stability notes.The note explaining the use of
where case='<name>'for deterministic results is helpful for understanding the test structure.
30-103: LGTM: Comprehensive examples with good edge case coverage.The five examples effectively demonstrate:
- Basic expansion behavior
- Limit parameter usage
- Empty and null array handling
- Single-element arrays
- Missing field behavior
The doctest format with expected output makes these examples testable and maintainable. Each example uses targeted queries for reproducible results.
docs/user/dql/metadata.rst (1)
38-66: LGTM: Metadata updated for new test index.The fetched row count correctly increases from 23 to 24, and the mvexpand_logs table entry is properly added to the SHOW TABLES output, maintaining consistent formatting with existing entries.
ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java (1)
35-37: Updated search anonymization expectation is consistentThe updated expectation for
testSearchCommand("source=table a:***") aligns with the current anonymizer output style. No action needed here.core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java (1)
69-69: MvExpand integration intoAbstractNodeVisitoris consistentThe added
MvExpandimport andvisitMvExpandmethod delegate tovisitChildrenin the same way as other plan nodes, which is exactly what downstream visitors (Analyzer, anonymizer, Calcite planner) need.No changes suggested.
Also applies to: 456-458
doctest/test_data/mvexpand_logs.json (1)
1-6: mvexpand doctest data shape looks appropriate (NDJSON-style)The file uses one JSON object per line to cover basic/empty/null/single/ids/missing cases, which is a reasonable shape for doctest data loaders that iterate line by line. The Biome “End of file expected” hints can be ignored here unless your tooling explicitly requires a single JSON array wrapper.
docs/category.json (1)
71-73: Category entry formvexpanddocs is correctly wiredAdding
user/ppl/cmd/mvexpand.rst(and the identifiers general doc) underppl_cli_calcitealigns with how other PPL command docs are exposed to the docs build.Looks good as-is.
integ-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.java (1)
702-706: NewMVEXPAND_EDGE_CASEStest index is consistent with existing patternsThe added enum constant uses the standard
(name, type, mappingFile, dataSet)pattern and will integrate cleanly withloadIndex(Index.MVEXPAND_EDGE_CASES, ...)in ITs.No changes suggested, assuming the referenced mapping/data files exist in
src/test/resources(which test runs will validate).core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java (5)
81-85: Wildcard imports are acceptable here for reducing import clutter.These wildcard imports consolidate multiple specific imports from the expression and tree packages. This aligns with the existing code style in the file.
2848-2851: Verify:limitapplies globally, not per-document.The PR objectives state the limit should "cap how many values per document are expanded" (per-document limit). However, the current implementation applies
relBuilder.limit(0, limit)after the expand, which creates a global LIMIT clause—limiting total output rows rather than per-document.For example, with 2 documents each having 5 array elements and
limit=3:
- Current behavior: returns 3 rows total
- Expected per-document behavior: returns 6 rows (3 per document)
If global limit is the intended behavior, please clarify in documentation. Otherwise, per-document limiting requires a window function with ROW_NUMBER partitioned by document.
2806-2828: Missing-field handling returns empty result with schema including the missing field.The approach of projecting all existing fields plus a NULL placeholder for the missing field, then filtering to 0 rows, preserves the expected output schema while returning no data. This is a valid strategy for handling missing fields gracefully.
2833-2840: Type validation for non-ARRAY fields is correctly implemented.The code properly checks
SqlTypeName.ARRAYand throws a clearSemanticCheckExceptionwith an informative message including both the field name and the actual type found.
2856-2865: Based on my comprehensive verification, I can now provide the final assessment:The review comment references code that does not exist in this repository.
After thorough investigation, I found that:
- No
extractFieldName(Field f)method exists at lines 2856-2865 inCalciteRelNodeVisitor.javaor anywhere else in the codebase- No mvexpand functionality exists anywhere in the repository
CalciteRelNodeVisitor.javahas 3,237 lines, but the section at lines 2856-2865 contains code related to field replacement, not anextractFieldNamemethod- The QualifiedName class (which the review references) does support multi-part names via
getParts(), and other code in the repository demonstrates that multi-part qualified names are already handled correctly elsewhereThe review comment appears to be based on incorrect information or references a different codebase/branch. Since the code being reviewed does not exist, there is no actionable suggestion to verify or implement.
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java (2)
43-82: TableWithArray implementation is correct and mirrors existing pattern.The inner class properly implements the Calcite
Tableinterface with a row type containing an INTEGER array field (EMPNOS), which is suitable for testing mvexpand functionality.
97-115: Test includes both logical plan and SparkSQL verification.This test properly verifies both the logical plan structure and the equivalent SparkSQL, addressing prior review feedback about adding
verifyPPLToSparkSQLassertions.integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java (2)
23-59: Comprehensive test setup with good edge case coverage.The test initialization creates a well-structured index with nested mapping and inserts diverse test documents covering:
- Multiple elements, single element, empty arrays
- Null values and missing fields
- Partial/mixed element shapes
- Duplicate elements and large arrays
- Heterogeneous subfield types
This aligns well with the edge cases documented in the PR objectives.
72-134: Integration tests properly verify data rows and ordering.The tests appropriately use
verifyDataRowswith expected values and include sorting (sort skills.name) to ensure deterministic ordering for assertions. The coverage of single element, empty array, null array, missing field, duplicates, and happy path with multiple elements is thorough.
ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (2)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (1)
321-328: Clarify or remove the misleading comment.The comment "script pushdown" on line 324 is unclear or potentially misleading. The mvexpand command expands multivalue/array fields into separate rows; while the explain plan may show script operations, the primary purpose isn't script pushdown. Consider removing this comment or clarifying what aspect of the plan you're highlighting.
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java (1)
81-81: Consider specific imports over wildcards.While wildcard imports work, specific imports are generally preferred for clarity and to avoid potential naming conflicts. However, if the codebase consistently uses wildcards, this is acceptable.
Also applies to: 85-85
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java(2 hunks)integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java(2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (1)
integ-test/src/test/java/org/opensearch/sql/sql/IdentifierIT.java (1)
Index(218-262)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: WhiteSource Security Check
🔇 Additional comments (2)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (1)
46-46: LGTM: Test index loaded for mvexpand edge cases.The addition of the MVEXPAND_EDGE_CASES index follows the established pattern for test data initialization.
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java (1)
2800-2862: Consider adding post-materialization type validation.Based on past review comments, you mentioned validating the field type "both at resolve-time and after materialization (to catch rename/alias cases)." Currently, type validation only occurs at line 2843 before calling
buildExpandRelNode. Consider whether additional validation is needed after field resolution to catch edge cases involving renames or aliases.If post-materialization validation is necessary, you could add checks after the field has been fully resolved through any renames or transformations. Otherwise, please confirm that the current pre-resolution validation is sufficient for all cases.
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java (1)
2773-2785: mvexpandlimitis applied globally instead of per document
visitMvExpandcurrently appliesrelBuilder.limit(0, limit)afterbuildExpandRelNode, i.e., on the fully expanded relation. That caps the total number of rows across the entire result set, not the number of expanded values per input document. With multiple input rows, this will drop expansions from later rows instead of limiting each document independently (e.g., two docs with large arrays andlimit=3would yield at most 3 rows total, not 3 per doc), which diverges from the PR objective thatlimitbe per document.Because
buildExpandRelNodeconstructs a correlated right-hand side (LogicalValues→project→uncollect), the natural place for a per-document limit is inside that right branch so it is evaluated once per outer row.Consider refactoring to thread the (validated) limit into
buildExpandRelNodeand apply it on the right side beforebuild():@@ @Override public RelNode visitExpand(Expand expand, CalcitePlanContext context) { @@ - buildExpandRelNode(arrayFieldRex, arrayField.getField().toString(), alias, context); + buildExpandRelNode(arrayFieldRex, arrayField.getField().toString(), alias, null, context); @@ public RelNode visitPatterns(Patterns node, CalcitePlanContext context) { @@ - buildExpandRelNode( - context.relBuilder.field(node.getAlias()), node.getAlias(), node.getAlias(), context); + buildExpandRelNode( + context.relBuilder.field(node.getAlias()), + node.getAlias(), + node.getAlias(), + null, + context); @@ @Override public RelNode visitMvExpand(MvExpand mvExpand, CalcitePlanContext context) { @@ - // 2C. Valid array → expand - int index = matched.getIndex(); - RexInputRef fieldRef = context.rexBuilder.makeInputRef(type, index); - - buildExpandRelNode(fieldRef, fieldName, fieldName, context); - - Integer limit = mvExpand.getLimit(); - if (limit != null) { - if (limit <= 0) { - throw new SemanticCheckException( - String.format("mvexpand limit must be positive, but got %d", limit)); - } - relBuilder.limit(0, limit); - } + // 2C. Valid array → expand (with optional per-document limit) + int index = matched.getIndex(); + RexInputRef fieldRef = context.rexBuilder.makeInputRef(type, index); + + Integer limit = mvExpand.getLimit(); + if (limit != null && limit <= 0) { + throw new SemanticCheckException( + String.format("mvexpand limit must be positive, but got %d", limit)); + } + + buildExpandRelNode(fieldRef, fieldName, fieldName, limit, context); @@ - private void buildExpandRelNode( - RexInputRef arrayFieldRex, String arrayFieldName, String alias, CalcitePlanContext context) { + private void buildExpandRelNode( + RexInputRef arrayFieldRex, + String arrayFieldName, + String alias, + @Nullable Integer perDocLimit, + CalcitePlanContext context) { @@ - // 5. Build join right node and expand the array field using uncollect - RelNode rightNode = - context - .relBuilder - // fake input, see convertUnnest and convertExpression in Calcite SqlToRelConverter - .push(LogicalValues.createOneRow(context.relBuilder.getCluster())) - .project(List.of(correlArrayFieldAccess), List.of(arrayFieldName)) - .uncollect(List.of(), false) - .build(); + // 5. Build join right node and expand the array field using uncollect + context + .relBuilder + // fake input, see convertUnnest and convertExpression in Calcite SqlToRelConverter + .push(LogicalValues.createOneRow(context.relBuilder.getCluster())) + .project(List.of(correlArrayFieldAccess), List.of(arrayFieldName)) + .uncollect(List.of(), false); + + if (perDocLimit != null) { + context.relBuilder.limit(0, perDocLimit); + } + + RelNode rightNode = context.relBuilder.build(); @@ - context - .relBuilder - .push(leftNode) - .push(rightNode) - .correlate(JoinRelType.INNER, correlVariable.get().id, List.of(arrayFieldRex)) - // 7. Remove the original array field from the output. - // TODO: RFC: should we keep the original array field when alias is present? - .projectExcept(arrayFieldRex); + context + .relBuilder + .push(leftNode) + .push(rightNode) + .correlate(JoinRelType.INNER, correlVariable.get().id, List.of(arrayFieldRex)) + // 7. Remove the original array field from the output. + // TODO: RFC: should we keep the original array field when alias is present? + .projectExcept(arrayFieldRex);This way the limit is applied inside the correlated right branch and is naturally per-document while keeping existing
expandandpatternsbehaviour unchanged (they just passnull). As per the PR objectives, this aligns Calcite mvexpand with the intended per-document limit semantics.Also applies to: 2799-2866, 791-793, 3125-3171
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java (1)
2927-2939: Fix nested field path handling.The
extractFieldNamemethod currently returns only the first segment of aQualifiedName(line 2933:parts.get(0)), which drops nested path components. For example,user.skillswould become justuser, causing field resolution to fail for nested arrays.This issue was previously identified in review comments. To support nested fields like
user.skills, the full dotted path must be preserved.Apply this diff to fix nested field resolution:
if (inner instanceof QualifiedName) { List<String> parts = ((QualifiedName) inner).getParts(); if (!parts.isEmpty()) { - return parts.get(0); + return String.join(".", parts); } }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java(4 hunks)integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
🔇 Additional comments (1)
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java (1)
2846-2925: LGTM with observations.The visitMvExpand implementation correctly handles the three key cases:
- Missing field → returns empty result with correct schema (lines 2874-2896)
- Non-array type → throws clear exception (lines 2902-2907)
- Valid array → expands and applies validated limit (lines 2909-2922)
The limit validation (lines 2917-2920) properly checks for positive values and provides a clear error message, addressing previous review feedback.
Minor observation: The missing field handling creates a full projection with NULL before filtering to zero rows (lines 2874-2894). While this preserves the schema correctly, it adds overhead compared to directly creating an empty result. However, this appears intentional based on the PR objectives to handle missing fields gracefully.
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java(3 hunks)ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java(2 hunks)
🔇 Additional comments (2)
ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java (2)
36-36: Updated search anonymization expectation looks correctThe new
"source=table a:***"format aligns with the anonymizer’s current behavior and keeps the test in sync; no issues here.
848-858: Mvexpand anonymizer tests are well‑aligned and restore full coverageBoth mvexpand tests match the anonymization conventions (identifier masking and limit redaction), and with
@TestontestMvdedupstill present, the earlier lost‑annotation issue is fully resolved.
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
ppl/src/main/antlr/OpenSearchPPLParser.g4 (1)
81-81: Parser wiring for MVEXPAND looks correct; consider aligning LIMIT literal with other commands
- Adding
mvexpandCommandintocommandsandMVEXPANDintocommandNamecleanly integrates the new command into the PPL pipeline and keyword set.mvexpandCommand : MVEXPAND fieldExpression (LIMIT EQUAL INTEGER_LITERAL)?matches the intendedmvexpand <field> limit=<int>syntax and addresses prior feedback to use theEQUALtoken.Two minor nits:
Other LIMIT-like parameters (e.g.,
chartOptions,timechartParameter) useintegerLiteralinstead of bareINTEGER_LITERAL. For consistency and to match existing patterns, you might switch to:mvexpandCommand : MVEXPAND fieldExpression (LIMIT EQUAL integerLiteral)? ;This keeps grammar uniform and leaves sign/valid-range checks to semantic validation. If you intentionally want to forbid signed values here, the current
INTEGER_LITERALis fine; just ensure the behavior is documented.Given the new command, double-check that
AstBuilder.visitMvexpandCommandand related components (e.g.,MvExpandnode, anonymizer, planner) stay in lockstep with this rule shape (especially the optional LIMIT) and that tests cover both with/without LIMIT. Based on learnings, keeping AST generation in sync with grammar changes is important.Also applies to: 120-120, 532-534
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (1)
321-327: NewtestMvexpandExplainis concise and aligned with existing explain testsThis test follows the existing pattern (load expected plan +
explainQueryYaml) and gives mvexpand a Calcite explain coverage point. If there isn’t one already elsewhere, consider adding an additional explain IT that coversmvexpandwith alimit=<n>option to lock in the planning behavior for the limit semantics as well.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java(1 hunks)integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java(2 hunks)ppl/src/main/antlr/OpenSearchPPLLexer.g4(1 hunks)ppl/src/main/antlr/OpenSearchPPLParser.g4(3 hunks)ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java
- core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java
🧰 Additional context used
📓 Path-based instructions (5)
**/*.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*.java: UsePascalCasefor class names (e.g.,QueryExecutor)
UsecamelCasefor method and variable names (e.g.,executeQuery)
UseUPPER_SNAKE_CASEfor constants (e.g.,MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
PreferOptional<T>for nullable returns in Java
Avoid unnecessary object creation in loops
UseStringBuilderfor string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
⚙️ CodeRabbit configuration file
**/*.java: - Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
- Check for proper JavaDoc on public classes and methods
- Flag redundant comments that restate obvious code
- Ensure methods are under 20 lines with single responsibility
- Verify proper error handling with specific exception types
- Check for Optional usage instead of null returns
- Validate proper use of try-with-resources for resource management
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
integ-test/**/*IT.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
End-to-end scenarios need integration tests in
integ-test/module
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
⚙️ CodeRabbit configuration file
integ-test/**/*IT.java: - Verify integration tests are in correct module (integ-test/)
- Check tests can be run with ./gradlew :integ-test:integTest
- Ensure proper test data setup and teardown
- Validate end-to-end scenario coverage
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
**/*IT.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
Name integration tests with
*IT.javasuffix in OpenSearch SQL
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
**/test/**/*.java
⚙️ CodeRabbit configuration file
**/test/**/*.java: - Verify test coverage for new business logic
- Check test naming follows conventions (*Test.java for unit, *IT.java for integration)
- Ensure tests are independent and don't rely on execution order
- Validate meaningful test data that reflects real-world scenarios
- Check for proper cleanup of test resources
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
**/calcite/**/*.java
⚙️ CodeRabbit configuration file
**/calcite/**/*.java: - Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
- Verify SQL generation and optimization paths
- Document any Calcite-specific workarounds
- Test compatibility with Calcite version constraints
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Update corresponding AST builder classes when making PPL grammar changes
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Verify AST generation for new PPL parser syntax
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
🧬 Code graph analysis (1)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (1)
integ-test/src/test/java/org/opensearch/sql/sql/IdentifierIT.java (1)
Index(218-262)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: WhiteSource Security Check
🔇 Additional comments (2)
ppl/src/main/antlr/OpenSearchPPLLexer.g4 (1)
53-53: MVEXPAND lexer token is correctly defined and aligned with existing commandsPlacement next to
EXPAND, simple literal pattern, andcaseInsensitive = trueall look consistent with existing command tokens.integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (1)
30-48: The code referenced in this review comment (Index.MVEXPAND_EDGE_CASES and its integration into CalciteExplainIT.init()) does not exist in the current repository. The actual CalciteExplainIT.java loads 11 indices ending with Index.DATA_TYPE_ALIAS, not Index.MVEXPAND_EDGE_CASES. Manual verification is required by reviewing the actual pull request or merged code to confirm that the Index.MVEXPAND_EDGE_CASES constant correctly maps to the "mvexpand_edge_cases" index name used in test queries.
|
@dai-chen I believe the test failures in CI are unrelated to my changes. |
Not sure what's going wrong. Could you see if it goes away after merging from |
c9e2767 to
90ee47c
Compare
Description
This pull request adds native support for the mvexpand command in PPL to OpenSearch SQL, enabling users to expand multivalue fields (arrays) into separate rows directly within queries. This functionality is analogous to Splunk's mvexpand command and streamlines analytics, dashboarding, and data preparation involving arrays or multivalue fields.
Key features introduced:
Native mvexpand command for PPL queries to expand array fields into separate rows/events.
Optional limit parameter to restrict the number of expanded values per event/document.
Robust handling of empty/null arrays, large arrays (with memory/resource limits), and non-array fields.
Streaming/distributable execution for performance and scalability.
Comprehensive documentation and edge case coverage.
This feature makes OpenSearch SQL more powerful and user-friendly for log analytics, data exploration, and migration from platforms like Splunk.
Related Issues
Resolves #4439
#4439
Check List
--signoffor-s.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.