-
Notifications
You must be signed in to change notification settings - Fork 180
Fix ClassCastException for value-storing aggregates on nested PPL fields
#4360
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
ClassCastException in first/last aggregates for nested fieldsClassCastException for aggregates on deeply nested PPL fields
ClassCastException for aggregates on deeply nested PPL fieldsClassCastException for value-storing aggregates on nested PPL fields
Signed-off-by: Kai Huang <[email protected]> # Conflicts: # integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLAggregationIT.java
Signed-off-by: Kai Huang <[email protected]>
Signed-off-by: Kai Huang <[email protected]>
24905db to
6d001f7
Compare
| "source=%s | stats min(`resource.attributes.telemetry.sdk.language`) as min_lang," | ||
| + " max(`resource.attributes.telemetry.sdk.language`) as max_lang", | ||
| TEST_INDEX_TELEMETRY)); | ||
| verifySchema(actual, schema("min_lang", "string"), schema("max_lang", "string")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if the type of nested language is an integer, can the ppl work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, in ObjectContent.java, first the extractFinalPrimitiveValue method will extract the value as Object and then we'll add toString() to it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No I mean what if the mapping contains
"properties": {
"version": {
"type": "integer"
}
and query with | stats min(`resource.attributes.telemetry.sdk.version`).
Can you update the mapping and data, and test above query?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed the approach - moved the fix from ObjectContent.java to OpenSearchExprValueFactory.construct() for better universal handling. Updated the telemetry mapping to test for types like integer and boolean
Signed-off-by: Kai Huang <[email protected]>
| Map<String, Object> map = (Map<String, Object>) value; | ||
| if (map.size() == 1) { | ||
| Object singleValue = map.values().iterator().next(); | ||
| return extractFinalPrimitiveValue(singleValue); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there risk of a RecursionError here?
If this is only loaded from an index, it's maybe fine (I think you can't index massively nested structures, not sure). But if there's a path where this is executed on user input, it'd be very easy to cause a crash with this.
LantaoJin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM if the CI rerun passed.
|
The backport to To backport manually, run these commands in your terminal: # Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/sql/backport-2.19-dev 2.19-dev
# Navigate to the new working tree
pushd ../.worktrees/sql/backport-2.19-dev
# Create a new branch
git switch --create backport/backport-4360-to-2.19-dev
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 e92bf68f67577ba42112e47232ffe86f6c32c909
# Push it to GitHub
git push --set-upstream origin backport/backport-4360-to-2.19-dev
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/sql/backport-2.19-devThen, create a pull request where the |
|
@ahkcs please manually backport it, thanks |
…ields (opensearch-project#4360) Signed-off-by: Kai Huang <[email protected]> (cherry picked from commit e92bf68)
…ates on nested PPL fields (#4360) (#4395) * Fix `ClassCastException` for value-storing aggregates on nested PPL fields (#4360) Signed-off-by: Kai Huang <[email protected]> (cherry picked from commit e92bf68) * remove unrelated tests Signed-off-by: Kai Huang <[email protected]> --------- Signed-off-by: Kai Huang <[email protected]>
Summary
Value-storing aggregate functions (
first,last,min,max) failed with aClassCastExceptionwhen targeting deeply nested field paths such asresource.attributes.telemetry.sdk.language.This PR fixes nested-field value extraction in OpenSearch SQL by handling
Map/HashMapobjects that represent nested structures, eliminating the unsafe cast and restoring expected behavior for value-storing aggregates.Problem
These queries previously failed with
ClassCastExceptionfor value-storing aggregates:While simple fields worked:
Observed error:
{ "error": { "reason": "There was internal problem at backend", "details": "class java.util.HashMap cannot be cast to class java.lang.String", "type": "ClassCastException" } }Root Cause
Execution-path analysis showed the failure at the final type-conversion layer in value-storing aggregates:
Mapstructures (e.g.,{attributes={telemetry={sdk={language=java}}}}) instead of the terminal primitive ("java").first,last,min,max) stored theseMapobjects in their accumulators.(String) valueinObjectContent.stringValue()wherevaluewas aMap.Why calculation-only aggregates work: functions like
count()anddc()perform calculations without storing the actual field values, so they never hit this conversion path.Solution
Implement nested-map handling in
ObjectContent.stringValue()so value-storing aggregate outputs can be safely converted to strings.Key changes
Mapvalues and recursively extract the terminal primitive viaextractFinalPrimitiveValue(...).Mapvalues (backward compatible).Before
After
Example extraction
Input:
{attributes={telemetry={sdk={language=java}}}}→
{telemetry={sdk={language=java}}}→
{sdk={language=java}}→
{language=java}→
"java"Verification (now succeeds)
Resolves