[SPARK-46539][SQL] SELECT * EXCEPT(all fields from a struct) results in an assertion failure #44527
@milastdbx Are you ok with the changes?
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoder.scala
Does it mean we will hit this bug if an empty struct is returned?
Yes, I think we will, @cloud-fan.
-      "each serializer expression should contain at least one `BoundReference`")
+    assert(boundRefs.nonEmpty || isEmptyStruct(ser),
+      "each serializer expression should contain at least one `BoundReference` " +
+      "or be an empty struct")
The error msg is not clear.
Updated! Let me know if you think it's clearer now.
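The guard discussed in this thread can be illustrated with a small, self-contained model. This is only a sketch: `Expr`, `BoundRef`, `GetField`, `CreateStruct`, `Alias`, and `SerializerCheck` are hypothetical toy types, not Spark's Catalyst classes, and the way `isEmptyStruct` is implemented here is an assumption, since the thread shows only the call site and the assertion message.

```scala
// Toy stand-ins for Catalyst expressions. These are NOT Spark's real classes.
sealed trait Expr { def children: Seq[Expr] }
final case class BoundRef(ordinal: Int) extends Expr {
  def children: Seq[Expr] = Nil
}
final case class GetField(child: Expr, name: String) extends Expr {
  def children: Seq[Expr] = Seq(child)
}
final case class CreateStruct(fields: Seq[(String, Expr)]) extends Expr {
  def children: Seq[Expr] = fields.map(_._2)
}
final case class Alias(child: Expr, name: String) extends Expr {
  def children: Seq[Expr] = Seq(child)
}

object SerializerCheck {
  // Collect every bound reference anywhere in the expression tree.
  def boundRefs(e: Expr): Seq[BoundRef] = e match {
    case b: BoundRef => Seq(b)
    case other       => other.children.flatMap(boundRefs)
  }

  // Assumed definition: a struct constructor with no fields, possibly aliased.
  def isEmptyStruct(e: Expr): Boolean = e match {
    case Alias(child, _)      => isEmptyStruct(child)
    case CreateStruct(fields) => fields.isEmpty
    case _                    => false
  }

  // Mirrors the relaxed condition from the diff: an empty struct has no
  // input to reference, so it is accepted without a bound reference.
  def isValid(ser: Expr): Boolean =
    boundRefs(ser).nonEmpty || isEmptyStruct(ser)

  def check(ser: Expr): Unit =
    assert(isValid(ser),
      "each serializer expression should contain at least one `BoundReference` " +
      "or be an empty struct")
}
```

In this model, `SerializerCheck.check(Alias(CreateStruct(Nil), "data"))` passes because of the second half of the condition, whereas the original condition (`boundRefs.nonEmpty` alone) would have rejected it.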
@stefankandic Just to double check, up to which version should the changes be ported? The ticket SPARK-46539 lists 3.0.0; is that correct?
+1, LGTM. Merging to master.
@stefankandic The PR conflicts with
Hi, @stefankandic and @MaxGekk and all. Unfortunately, this seems to break the master branch. Could you make a follow-up first before backporting?
- https://github.com/apache/spark/actions/runs/7401236452/job/20136573641
- https://github.com/apache/spark/actions/runs/7401841987/job/20138486695
- https://github.com/apache/spark/actions/runs/7403100751/job/20142273016
[info] - selectExcept.sql_analyzer_test *** FAILED *** (136 milliseconds)
[info] selectExcept.sql_analyzer_test
[info] Expected "... (`tbl_view`, [id#x,[name#x,]data#x])
[info] +- Pr...", but got "... (`tbl_view`, [id#x,[ name#x, ]data#x])
[info] +- Pr..." Result did not match for query #9
[info] SELECT * EXCEPT (data.f1, data.s2) FROM tbl_view (SQLQueryTestSuite.scala:902)
As of now, this failure blocks all PRs like the following.
I'm fixing it in #44585.
### What changes were proposed in this pull request?
a followup of #44527 to fix golden files.

### Why are the changes needed?
fix tests

### Does this PR introduce _any_ user-facing change?
no

### How was this patch tested?
N/A

### Was this patch authored or co-authored using generative AI tooling?
no

Closes #44585 from cloud-fan/golden.

Authored-by: Wenchen Fan <[email protected]>
Signed-off-by: Kent Yao <[email protected]>
Thank you, @cloud-fan!
Since this PR depends on a new feature introduced in 43843, there is no need to backport it.
What changes were proposed in this pull request?
Fixes the assertion error that occurs when running SELECT * EXCEPT (every field of a struct), by adding a check for an empty struct.
Why are the changes needed?
Because this is a valid query that should just return an empty struct rather than fail during serialization.
Does this PR introduce any user-facing change?
Yes. Users should no longer see this error and should instead get an empty struct '{ }'.
How was this patch tested?
By adding a new unit test to the existing selectExcept tests.
Was this patch authored or co-authored using generative AI tooling?
No