[SPARK-24341][SQL] Support only IN subqueries with the same number of items per row #21403
Closed
Commits (21)
c9a36e0 [SPARK-24313][SQL] Support In subqueries which are valid in other RDBMS (mgaido91)
65ff49a introduce InValues (mgaido91)
268307f add analyzer rule to add InValues (mgaido91)
d3e39ed fix ut failures (mgaido91)
a5771b8 Merge branch 'master' into SPARK-24313 (mgaido91)
7c898a5 Merge branch 'master' of github.com:apache/spark into SPARK-24313 (mgaido91)
60b57d2 fix merging (mgaido91)
22f77ae fix UT error (mgaido91)
0412829 fix OptimizeIn merge (mgaido91)
bd008fe Merge branch 'master' of github.com:apache/spark into SPARK-24313 (mgaido91)
f9b7536 move tests (mgaido91)
f5fa2c4 fix error message according to comment (mgaido91)
571b273 revert to Seq[Expression] (mgaido91)
423e93e Merge branch 'master' into SPARK-24313 (mgaido91)
3af5b78 introduce InSubquery (mgaido91)
0f00a06 simplify diff (mgaido91)
53e3d96 remove unneeded changes (mgaido91)
45a91fc fix test error (mgaido91)
cb3467b remove ListQuery (mgaido91)
a6114a6 Revert "remove ListQuery" (mgaido91)
eb1dfb7 address comment (mgaido91)

Can we have an analyzer rule to deal with `In(CreateStruct(...), ListQuery(...))`, to unpack the `CreateStruct`, or pack the `ListQuery`? Then we don't need to change `In`.

I don't think so, as the `value` can be replaced later by other rules. So we do need to have a `Seq[Expression]` here, instead of a single expression. Another possible option, which I haven't checked but think may be feasible, is to create a new kind of `Expression` (e.g. `InValues`) that we can use only for this specific case. What do you think?

+1 on `InValues`. Maybe call it `InSubquery`.

It is not a subquery; this is the "left part" of IN, so I don't really agree on `InSubquery`, but if you have another suggestion I am happy to follow it. Thanks.

I mean `case class InSubquery(values: Seq[Expression], subquery: ListSubquery)`, it's not just the left part.

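To make the proposed shape concrete, here is a minimal, self-contained Scala sketch. The `Expr`, `Attr`, and `ListSubquery` types are simplified stand-ins invented for this illustration, not Spark's Catalyst classes, and `checkArity` is a hypothetical helper. It only illustrates the "same number of items per row" idea from the PR title and is not taken from the PR's diff.

```scala
// Simplified stand-ins for illustration only; not Spark's Catalyst classes.
sealed trait Expr
final case class Attr(name: String) extends Expr
// Hypothetical: a subquery described only by the names of its output columns.
final case class ListSubquery(outputColumns: Seq[String]) extends Expr
// The shape suggested in the review: all left-hand values plus the subquery.
final case class InSubquery(values: Seq[Expr], subquery: ListSubquery) extends Expr

object InSubquerySketch {
  // Hypothetical check: the number of left-hand values must match
  // the number of columns the subquery returns.
  def checkArity(in: InSubquery): Either[String, InSubquery] = {
    val left = in.values.length
    val right = in.subquery.outputColumns.length
    if (left == right) Right(in)
    else Left(s"IN subquery arity mismatch: $left value(s) on the left, $right column(s) on the right")
  }
}
```

Keeping the values as a `Seq[Expr]` means each left-hand value remains a separate child, so a check like this can count them directly instead of inspecting a packed struct.
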
Anyway, I think the right behavior is the one that both Postgres and Hive have (it is also the same as in Oracle/MySQL, which don't have structs). What do you think?

I agree we should treat `(...)` specially if it's in front of `In`, but I'm wondering if we need to do the same thing for `=`.

I am not sure. The behavior when comparing structs is not uniform among different DBs. Hive doesn't allow `=` on structs. Postgres and Presto do, but their behavior with nulls is not consistent and it is different from ours. In particular, comparing a struct containing a `null` returns `null` on Postgres and causes an exception in Presto (we return `false` instead). This is causing another problem, reported in another JIRA (SPARK-24395), in which we can return results different from Postgres and Oracle.

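The three behaviors described above can be sketched as a toy Scala model. This is an illustration only: the object and function names are hypothetical, `None` stands for SQL `NULL`, and the semantics follow the description in the comment rather than the actual code of Spark, Postgres, or Presto. In particular, treating `NULL` fields as ordinary values in `alwaysBoolean` is an assumption made for this sketch.

```scala
object StructEqualitySketch {
  // None stands for SQL NULL; a Row stands for the fields of a struct.
  type Field = Option[Int]
  type Row = Seq[Field]

  // "We return false": fields are compared structurally, so the result is never NULL.
  // (Assumption: NULL fields are treated as ordinary values here.)
  def alwaysBoolean(a: Row, b: Row): Option[Boolean] = Some(a == b)

  // Three-valued logic: false if any pair definitely differs,
  // NULL (None) if no pair differs but some pair involves NULL, true otherwise.
  def threeValued(a: Row, b: Row): Option[Boolean] = {
    require(a.length == b.length, "rows must have the same number of fields")
    val pairs: Seq[Option[Boolean]] = a.zip(b).map {
      case (Some(x), Some(y)) => Some(x == y)
      case _                  => None
    }
    if (pairs.contains(Some(false))) Some(false)
    else if (pairs.contains(None)) None
    else Some(true)
  }

  // Rejects NULL fields outright, as the comment reports for Presto.
  def rejectingNulls(a: Row, b: Row): Option[Boolean] = {
    if ((a ++ b).exists(_.isEmpty))
      throw new IllegalArgumentException("comparison not supported for structs with NULL fields")
    Some(a == b)
  }
}
```

For the rows `(1, NULL)` and `(1, 2)`, `alwaysBoolean` yields `Some(false)`, `threeValued` yields `None`, and `rejectingNulls` throws, which mirrors the three outcomes listed in the comment.
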
For this specific case, instead, I'll update this PR to create the new ad-hoc expression for the values in front of IN, if you agree, as we have to deal with more than just the subquery case. What do you think?

SGTM