[SPARK-39488][SQL] Simplify the error handling of TempResolvedColumn #36809
fb4fe23 to
adc464f
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
val DATA_TYPE_MISMATCH_ERROR_MESSAGE = TreeNodeTag[String]("dataTypeMismatchError")
just to make sure this isn't related to TempResolvedColumn, right?
It was used to do error handling of TempResolvedColumn, but we don't need it now as the logic is simplified.
> just to make sure this isn't related to TempResolvedColumn, right?

Yes, I added this in #36746. It is not needed after this PR.
…ysis/Analyzer.scala Co-authored-by: Liang-Chi Hsieh <[email protected]>
thanks for review, merging to master!
What changes were proposed in this pull request?
This is a followup of #35404 and #36746, to simplify the error handling of TempResolvedColumn. The idea is:
1. ResolveAggregationFunctions in the main resolution batch creates TempResolvedColumn and only removes it if the aggregate expression is fully resolved. It either strips TempResolvedColumn if it's inside an aggregate function or group expression, or restores TempResolvedColumn to UnresolvedAttribute otherwise, hoping other rules can resolve it.
2. RemoveTempResolvedColumn in a later batch can still hit TempResolvedColumn if the aggregate expression is unresolved (due to an input type mismatch, e.g. avg(bool_col) or date_add(int_group_col, 1)). At this stage, there is no way to restore TempResolvedColumn to UnresolvedAttribute and resolve it differently. The query will fail, and we should blindly strip TempResolvedColumn to provide a better error message.
Why are the changes needed?
code cleanup
Does this PR introduce any user-facing change?
no
How was this patch tested?
existing tests
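For readers unfamiliar with the two stages described under "What changes were proposed in this pull request?", here is a minimal sketch on a toy expression tree. It is not Spark's Catalyst code: every class and function below (Expr, Call, stripOrRestore, stripBlindly, and the simplified TempResolvedColumn with a plain name field) is a hypothetical stand-in, and the "inside a group expression" check is reduced to "the wrapped column is itself a grouping expression".

```scala
// Toy model only. None of these types are Spark's real Catalyst classes.
object TempResolvedColumnSketch {
  sealed trait Expr
  case class UnresolvedAttribute(name: String) extends Expr
  case class Attribute(name: String) extends Expr // a resolved column
  // Wraps a tentatively resolved column and remembers the original name.
  case class TempResolvedColumn(child: Attribute, name: String) extends Expr
  case class AggregateFunction(name: String, arg: Expr) extends Expr
  case class Call(name: String, args: List[Expr]) extends Expr

  // Apply f to the direct children of e; leaves are returned unchanged.
  def mapChildren(e: Expr)(f: Expr => Expr): Expr = e match {
    case AggregateFunction(n, a) => AggregateFunction(n, f(a))
    case Call(n, as)             => Call(n, as.map(f))
    case leaf                    => leaf
  }

  // Stage 1 (main resolution batch, aggregate expression fully resolved):
  // strip the wrapper inside an aggregate function or for a grouping column,
  // otherwise restore it to UnresolvedAttribute so other rules can retry.
  def stripOrRestore(
      e: Expr,
      groupExprs: Set[Expr],
      insideAgg: Boolean = false): Expr = e match {
    case t: TempResolvedColumn =>
      if (insideAgg || groupExprs.contains(t.child)) t.child
      else UnresolvedAttribute(t.name)
    case AggregateFunction(n, a) =>
      AggregateFunction(n, stripOrRestore(a, groupExprs, insideAgg = true))
    case other =>
      mapChildren(other)(stripOrRestore(_, groupExprs, insideAgg))
  }

  // Stage 2 (later batch): the query is going to fail anyway, so strip the
  // wrapper unconditionally and let the error mention the real column.
  def stripBlindly(e: Expr): Expr = e match {
    case t: TempResolvedColumn => t.child
    case other                 => mapChildren(other)(stripBlindly)
  }
}
```

In this toy model, stripOrRestore turns max(TempResolvedColumn(a)) into max(a) but turns upper(TempResolvedColumn(a)) back into upper(UnresolvedAttribute(a)) when a is not a grouping column, while stripBlindly always yields the wrapped column.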