-
Notifications
You must be signed in to change notification settings - Fork 29.3k
[SPARK-24276][SQL] Order of literals in IN should not affect semantic equality #21331
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 2 commits
9595237
6cd34d9
ccbdd11
56420f2
0c484f1
a0af525
5c88d86
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -17,6 +17,8 @@ | |
|
|
||
| package org.apache.spark.sql.catalyst.expressions | ||
|
|
||
| import org.apache.spark.sql.catalyst.util.TypeUtils | ||
|
|
||
| /** | ||
| * Rewrites an expression using rules that are guaranteed preserve the result while attempting | ||
| * to remove cosmetic variations. Deterministic expressions that are `equal` after canonicalization | ||
|
|
@@ -85,6 +87,14 @@ object Canonicalize { | |
| case Not(GreaterThanOrEqual(l, r)) => LessThan(l, r) | ||
| case Not(LessThanOrEqual(l, r)) => GreaterThan(l, r) | ||
|
|
||
| // order the list in the In operator | ||
| // we can do this only if all the elements in the list are literals with the same datatype | ||
| case i @ In(value, list) | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can't we just reorder elements in list by
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. nice idea, thanks. I am doing it. It will allow also to cover more cases than simple literals. Thank you. |
||
| if i.inSetConvertible && list.map(_.dataType.asNullable).distinct.size == 1 => | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thank you for pinging me, @mgaido91 .
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. thanks for your comment @dongjoon-hyun, but I am not sure I agree with you. What if we have something like
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Let me check again. I might forget something here. :)
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yep. Right. I confused those cases. |
||
| val literals = list.map(_.asInstanceOf[Literal]) | ||
| val ordering = TypeUtils.getInterpretedOrdering(literals.head.dataType) | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. For non-ordering type, this will throw match error. |
||
| In(value, literals.sortBy(_.value)(ordering)) | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. For complex literals like
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. actually it works. The problem in your example is that
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks. BTW, it comes from your example. Anyway, my bad. :)
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. you're right, it's my fault. Anyway, with @viirya's suggestion, also this case works properly. I updated the UT too. Thanks. |
||
|
|
||
| case _ => e | ||
| } | ||
| } | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Useless?