-
Notifications
You must be signed in to change notification settings - Fork 181
Adding PPL convert command with numeric and memory conversion functions
#5034
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Adding PPL convert command with numeric and memory conversion functions
#5034
Conversation
📝 WalkthroughSummary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings. WalkthroughAdds a new PPL Changes
Sequence Diagram(s)sequenceDiagram
participant Parser
participant AstBuilder
participant Analyzer
participant FieldResolver
participant Calcite
participant UDFs
participant Store
Parser->>AstBuilder: parse "CONVERT ..." -> Convert node
AstBuilder->>AstBuilder: build Let conversions (function, source field, target/alias)
AstBuilder-->>Analyzer: emit Convert AST
Analyzer->>FieldResolver: visitConvert(Convert)
FieldResolver->>FieldResolver: compute inputFields & outputFields
Analyzer->>Calcite: visitConvert(Convert)
Calcite->>Calcite: iterate conversions
alt field-copy conversion
Calcite->>Calcite: record replacement or alias mapping
else function conversion
Calcite->>UDFs: reference conversion UDF (AUTO/NUM/...)
UDFs-->>Calcite: provide Rex expression / conversion logic
end
Calcite->>Calcite: build projection applying replacements + appended converted aliases
Calcite-->>Store: execute planned query
Store-->>Calcite: return rows
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Possibly related PRs
Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 ast-grep (0.40.5)core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.javaComment |
ea04424 to
c283b39
Compare
- Added convert command syntax and AST nodes (Convert, ConvertFunction) - Implemented 5 conversion functions: auto, num, rmcomma, rmunit, none - Full Calcite pushdown support via CalciteRelNodeVisitor - Logical and physical operators (LogicalConvert, ConvertOperator) - Comprehensive test coverage (26 tests total): * 13 unit tests (CalcitePPLConvertTest) * 8 integration tests with pushdown (ConvertCommandIT) * 8 non-pushdown tests (CalciteConvertCommandIT) * 3 explain tests (ExplainIT) * 2 cross-cluster tests (CrossClusterSearchIT) * Anonymizer and v2 unsupported tests - User documentation (docs/user/ppl/cmd/convert.md) - Code cleanup: removed dead code, simplified javadocs - Version: 3.5 (experimental) All tests compile and pass successfully. Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
Centralized common conversion function logic into BaseConversionUDF base class, reducing code from 219 to 162 lines (26% reduction) while maintaining all functionality. Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
c379ad2 to
27e4bb9
Compare
Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In
`@integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteConvertCommandIT.java`:
- Around line 20-21: Update CalciteConvertCommandIT to add
NULL/empty/boundary/multi-doc coverage for the new convert functions: add test
cases that exercise rmcomma, rmunit, memk and other convert functions with (1)
NULL/missing fields, (2) empty and whitespace-only strings, (3) very large
numeric values (boundary/overflow candidates), (4) invalid units specifically
for memk (e.g., unsupported unit strings), and (5) multi-document scenarios
where some docs contain valid values and others contain null/empty/invalid
values to ensure aggregate/row behavior; implement these as new test methods in
the CalciteConvertCommandIT class using the existing PPLIntegTestCase helpers
and name them to reflect the case (e.g., testRmcommaNullEmptyBoundary,
testRmunitMultiDoc, testMemkInvalidUnitAndBoundary) so they run with the
existing test suite.
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteConvertCommandIT.java
Show resolved
Hide resolved
Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In
`@core/src/test/java/org/opensearch/sql/expression/function/udf/ConversionFunctionsTest.java`:
- Around line 138-159: Add unit tests in ConversionFunctionsTest to cover null,
empty, and whitespace-only inputs for RmunitConvertFunction: call
RmunitConvertFunction.convert(null), RmunitConvertFunction.convert(""), and
RmunitConvertFunction.convert(" ") and assert they return null. Place these
assertions alongside the existing rmunit tests (e.g., in testRmunitConvert or a
new test method) so the test suite verifies the function correctly handles null
and empty-string edge cases.
- Around line 81-111: Add unit tests for null and empty-string inputs to the
NumConvertFunction coverage: in ConversionFunctionsTest add assertions that
NumConvertFunction.convert(null) returns null and that
NumConvertFunction.convert("") (and optionally strings with only whitespace like
" ") also return null; implement these as new `@Test` cases or extend an
existing test (e.g., testNumConvertWithUnits) to call NumConvertFunction.convert
and assertNull on the results so the behavior matches AutoConvertFunction and
RmcommaConvertFunction tests.
In
`@integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteConvertCommandIT.java`:
- Around line 63-73: The test method testConvertWithAlias in
CalciteConvertCommandIT currently only selects balance_num and thus doesn't
verify that convert auto(balance) AS balance_num preserves the original balance
field; update the query string to include both the original field and the
aliased field in the fields clause (e.g., fields balance, balance_num), then
adjust verifySchema to assert both columns (balance with its original type and
balance_num as double) and update verifyDataRows to include corresponding paired
values for each row so the test asserts preservation of the original balance
alongside the converted balance_num.
core/src/test/java/org/opensearch/sql/expression/function/udf/ConversionFunctionsTest.java
Show resolved
Hide resolved
core/src/test/java/org/opensearch/sql/expression/function/udf/ConversionFunctionsTest.java
Show resolved
Hide resolved
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteConvertCommandIT.java
Show resolved
Hide resolved
…tCommand Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
5452d31 to
f7d2f1b
Compare
Signed-off-by: Aaron Alvarez <900908alvarezaaron@gmail.com>
Add testNumConvertNullAndEmpty() to ensure NumConvertFunction.convert() properly returns null for null, empty string, and whitespace-only inputs, matching the coverage pattern of other conversion functions. Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
Add testRmunitConvertNullAndEmpty() to ensure RmunitConvertFunction.convert() properly returns null for null, empty string, and whitespace-only inputs, completing the test coverage pattern across all conversion functions. Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
Update testConvertWithAlias to verify that using AS clause preserves the original field alongside the converted field. Test now checks both balance (bigint) and balance_num (double) are present in results. Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In `@integ-test/src/test/java/org/opensearch/sql/ppl/NewAddedCommandsIT.java`:
- Around line 205-214: The test testConvertCommand only calls verifyQuery when
executeQuery throws ResponseException, so when executeQuery succeeds the result
is never asserted; update the testConvertCommand to capture the successful
response as a JSONObject and call verifyQuery in the success path as well: call
executeQuery(String.format("search source=%s | convert auto(balance)",
TEST_INDEX_BANK)) and convert its response body to a JSONObject (same as done
for TestUtils.getResponseBody(e.getResponse()) in the catch), assign it to
result, and invoke verifyQuery(result) in both the try and catch branches so
both success and error paths are validated; use the same variable names result,
executeQuery, ResponseException, and verifyQuery to locate the change.
- Around line 205-214: Extend NewAddedCommandsIT (e.g., testConvertCommand) to
add focused integration tests that assert successful per-document conversions
and edge cases when Calcite is enabled: add new `@Test` cases that call
executeQuery and on success parse the JSON response (instead of only catching
ResponseException) to verify converted values (correct types/values), handle
null/empty source fields, run multi-document queries to ensure per-document
behavior, test boundary values (large/small/precision cases) and invalid inputs
that should return errors; reference existing helpers like executeQuery,
verifyQuery, TEST_INDEX_BANK and ConvertCommandIT to structure tests and reuse
setup/teardown so each scenario validates actual conversion correctness rather
than only the Calcite-disabled error path.
In `@ppl/src/main/antlr/OpenSearchPPLParser.g4`:
- Around line 531-533: The convertCommand grammar currently enforces mandatory
commas between convertFunction entries; update the rule for convertCommand to
allow optional commas so space-separated function lists are accepted by changing
the repetition from (COMMA convertFunction)* to (COMMA? convertFunction)*;
locate the convertCommand rule and modify that repetition (keeping the existing
TIMEFORMAT EQUAL timeformatValue = stringLiteral optional part and
convertFunction references intact) to match the same COMMA? pattern used by
wcFieldList for consistency.
In `@ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java`:
- Around line 1204-1217: handleNoneConversion currently ignores extra arguments;
add a validation that fieldArgs has exactly one element and fail fast if not. In
handleNoneConversion(List<UnresolvedExpression> fieldArgs, Field targetField)
check if (fieldArgs.size() != 1) and throw an appropriate exception (e.g.,
IllegalArgumentException or a project-specific parse/semantic exception) with a
message like "none() requires exactly one field argument"; keep the subsequent
logic (source/target name check and returning new Let(targetField,
fieldArgs.get(0))) unchanged.
🧹 Nitpick comments (1)
ppl/src/main/antlr/OpenSearchPPLParser.g4 (1)
535-537: Consider usingfieldExpressioninstead offieldListfor single-field semantics.The
convertFunctionrule usesfieldListwhich allows multiple fields, but all current usage patterns show single-field conversions:auto(balance),rmcomma(price),memk(virtual_memory). While the implementation correctly handles multiple fields, restricting tofieldExpressionat the grammar level would enforce single-field constraint and provide clearer intent. Compare withtakeAggFunction(line 815) which explicitly acceptsfieldExpressionfor its field argument.♻️ Suggested change if single-field is intended
convertFunction - : functionName = ident LT_PRTHS fieldList RT_PRTHS (AS alias = fieldExpression)? + : functionName = ident LT_PRTHS fieldExpression RT_PRTHS (AS alias = fieldExpression)? ;
ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In
`@core/src/test/java/org/opensearch/sql/expression/function/udf/ConversionFunctionsTest.java`:
- Around line 217-260: Add explicit min/max boundary tests in
ConversionFunctionsTest covering zero and numeric extremes for each UDF
(NumConvertFunction, AutoConvertFunction, RmcommaConvertFunction,
RmunitConvertFunction): add assertions for "0", "+0", "-0" and for extreme
values such as String.valueOf(Double.MAX_VALUE), "-" + Double.MAX_VALUE,
String.valueOf(Double.MIN_VALUE), and an overflow string like "1e309" (or
similarly huge value) and assert the actual behavior of the implementation
(e.g., assertEquals(Double.MAX_VALUE, ...), assertEquals(0.0, ...), or
assertNull/assertEquals(Double.POSITIVE_INFINITY, ...) to document
overflow/precision behavior) so tests explicitly capture min/max/overflow
semantics for each function.
core/src/test/java/org/opensearch/sql/expression/function/udf/ConversionFunctionsTest.java
Show resolved
Hide resolved
Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
core/src/main/java/org/opensearch/sql/ast/analysis/FieldResolutionVisitor.java
Show resolved
Hide resolved
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
Show resolved
Hide resolved
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
Show resolved
Hide resolved
integ-test/src/test/resources/expectedOutput/calcite/explain_convert_command.yaml
Show resolved
Hide resolved
Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
Show resolved
Hide resolved
Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In
`@core/src/main/java/org/opensearch/sql/ast/analysis/FieldResolutionVisitor.java`:
- Around line 638-658: Add JavaDoc to the public override method visitConvert in
FieldResolutionVisitor: document the purpose (resolving field requirements
across Convert nodes), include `@param` for the Convert node parameter and the
FieldResolutionContext parameter, an `@return` describing that the same Node is
returned, and an `@throws` if the method can propagate any runtime exceptions (or
explicitly state none are thrown). Mention key behavior: that it computes
inputFields/outputFields, adjusts upstream FieldResolutionResult (using
FieldResolutionResult) on the context, pushes and pops requirements, and visits
children. Ensure the JavaDoc follows project style and is placed directly above
the visitConvert method signature.
Description
Implement a
convertcommand in PPL for numeric and memory data type conversions. This command provides a streamlined interface for converting field values to numeric types, reducing the need for multiple eval statements and improving query readability.Key Features:
auto(),num(),rmcomma(),rmunit(),memk()functionsSyntax:
Example Usage:
Future Enhancements:
This initial implementation focuses on numeric and memory conversions. Time conversions (
ctime(),dur2sec(), etc.) and wildcard support will be implemented in subsequent PRs.This PR partially addresses the current limitation where users must chain multiple evaluation functions or write complex expressions for common conversion operations.
Related Issues
Resolves partially (only numeric and memory conversions in this PR) #5001
Check List
--signoffor-s.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.