-
Notifications
You must be signed in to change notification settings - Fork 192
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Coral-schema] Generalize operand schema inference on ordinal return type UDF calls #548
[Coral-schema] Generalize operand schema inference on ordinal return type UDF calls #548
Conversation
Did the regression tests introduce any changes to view text? |
The regression tests did not introduce any changes. FYI I only ran the regression for avro schema translations since that is the only possible translation path that this change could introduce changes in. |
return rexCall; | ||
} | ||
operand.accept(this); | ||
return rexCall; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The earlier return here causes the view doc string to no longer be produce. Discussed this with @KevinGe00 offline and it is a reasonable change given it is now more complex to construct the doc string when we recurse into the UDF.
appendRexInputRefField((RexInputRef) operand); | ||
return rexCall; | ||
} | ||
operand.accept(this); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noticed a change in the capitalization in the field name. This is because the string are now being fetched from the underlying schema instead of being generated programmatically by Coral.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Secondary changes to the schema noted in other comments.
What changes are proposed in this pull request, and why are they necessary?
Currently, coral-schema is only set up to correctly infer schemas for UDFs with an ordinal return type, only if the ordinal operand is a reference to an input column. However, this is limiting and we have since encountered views with nested calls on UDFs with an ordinal return type. For example,
UDF_A('foo', UDF_A('foo', colA))
, where the return type ofUDF_A
is whatever the return type of it's second operand is.We correct this by using the
SchemaRexShuttle
to visit the ordinal operand, which will infer the schema of the operand and add it to the final schema as a field. This way, we leverage the shuttle to categorize and process the operand for us so we don't have to manually categorize the operand.How was this patch tested?
./gradlew clean build