[SPARK-25714] Fix Null Handling in the Optimizer rule BooleanSimplification #22702
| Original file line number | Diff line number | Diff line change |
|---|---|---|
@@ -276,15 +276,31 @@ object BooleanSimplification extends Rule[LogicalPlan] with PredicateHelper { | |
| case a And b if a.semanticEquals(b) => a | ||
| case a Or b if a.semanticEquals(b) => a | ||
| case a And (b Or c) if Not(a).semanticEquals(b) => And(a, c) | ||
| case a And (b Or c) if Not(a).semanticEquals(c) => And(a, b) | ||
| case (a Or b) And c if a.semanticEquals(Not(c)) => And(b, c) | ||
| case (a Or b) And c if b.semanticEquals(Not(c)) => And(a, c) | ||
| case a Or (b And c) if Not(a).semanticEquals(b) => Or(a, c) | ||
| case a Or (b And c) if Not(a).semanticEquals(c) => Or(a, b) | ||
| case (a And b) Or c if a.semanticEquals(Not(c)) => Or(b, c) | ||
| case (a And b) Or c if b.semanticEquals(Not(c)) => Or(a, c) | ||
| // The following optimization is applicable only when the operands are nullable, | ||
| // since the three-value logic of AND and OR are different in NULL handling. | ||
| // See the chart: | ||
| // +---------+---------+---------+---------+ | ||
| // | p | q | p OR q | p AND q | | ||
| // +---------+---------+---------+---------+ | ||
| // | TRUE | TRUE | TRUE | TRUE | | ||
| // | TRUE | FALSE | TRUE | FALSE | | ||
| // | TRUE | UNKNOWN | TRUE | UNKNOWN | | ||
| // | FALSE | TRUE | TRUE | FALSE | | ||
| // | FALSE | FALSE | FALSE | FALSE | | ||
| // | FALSE | UNKNOWN | UNKNOWN | FALSE | | ||
| // | UNKNOWN | TRUE | TRUE | UNKNOWN | | ||
| // | UNKNOWN | FALSE | UNKNOWN | FALSE | | ||
| // | UNKNOWN | UNKNOWN | UNKNOWN | UNKNOWN | | ||
| // +---------+---------+---------+---------+ | ||
| case a And (b Or c) if !a.nullable && Not(a).semanticEquals(b) => And(a, c) | ||
Contributor
assuming a is null, then b is also null. So yes this is a bug, and we should rewrite it to
Contributor
Since this is complicated, shall we put a comment to explain it?
Contributor
after more thoughts,
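To make the failure mode discussed above concrete, here is a minimal standalone sketch. It is not Spark code: it models SQL three-valued logic with `Option[Boolean]`, where `None` stands for NULL/UNKNOWN, and the object and helper names are made up for illustration. It evaluates `a And (Not(a) Or c)` and the rewritten `And(a, c)` for `a = NULL`, `c = FALSE`.

```scala
// Hypothetical model of SQL three-valued logic; not part of Spark.
object AndRewriteCounterexample {
  // Some(true) = TRUE, Some(false) = FALSE, None = NULL / UNKNOWN.
  type Tri = Option[Boolean]

  def not(p: Tri): Tri = p.map(!_)

  // FALSE dominates AND; otherwise any NULL operand makes the result NULL.
  def and(p: Tri, q: Tri): Tri = (p, q) match {
    case (Some(false), _) | (_, Some(false)) => Some(false)
    case (Some(true), Some(true))            => Some(true)
    case _                                   => None
  }

  // TRUE dominates OR; otherwise any NULL operand makes the result NULL.
  def or(p: Tri, q: Tri): Tri = (p, q) match {
    case (Some(true), _) | (_, Some(true)) => Some(true)
    case (Some(false), Some(false))        => Some(false)
    case _                                 => None
  }

  val a: Tri = None        // a is NULL
  val c: Tri = Some(false) // c is FALSE

  // a AND (NOT a OR c): NULL AND (NULL OR FALSE) = NULL AND NULL = NULL
  val original: Tri = and(a, or(not(a), c))
  // a AND c: NULL AND FALSE = FALSE
  val rewritten: Tri = and(a, c)

  def main(args: Array[String]): Unit =
    println(s"original = $original, rewritten = $rewritten")
}
```

The two results differ (NULL versus FALSE), so the rewrite preserves the value only when `a` cannot be null. In a plain filter both outcomes drop the row, but the difference becomes observable once the expression is negated or projected.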
| case a And (b Or c) if !a.nullable && Not(a).semanticEquals(c) => And(a, b) | ||
| case (a Or b) And c if !a.nullable && a.semanticEquals(Not(c)) => And(b, c) | ||
| case (a Or b) And c if !b.nullable && b.semanticEquals(Not(c)) => And(a, c) | ||
| case a Or (b And c) if !a.nullable && Not(a).semanticEquals(b) => Or(a, c) | ||
Contributor
these shouldn't be a problem, since if
Contributor
the problem is when a is null, c is true
Contributor
I see now, sorry. Thanks.
Contributor
Sorry, it is the other case where the change is not needed, right?
Contributor
This is not always the case,
Contributor
oh, yes you're right, this might be a problem indeed if the expression is inside a
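The OR case raised in this thread can be checked exhaustively. The sketch below is a hypothetical standalone model, not Spark's implementation; it again assumes `Option[Boolean]` with `None` as NULL. It enumerates every assignment of `a` and `c` and collects those where `a Or (Not(a) And c)` differs from `Or(a, c)`.

```scala
// Hypothetical brute-force check of the OR rewrite; not part of Spark.
object OrRewriteCheck {
  // Some(true) = TRUE, Some(false) = FALSE, None = NULL / UNKNOWN.
  type Tri = Option[Boolean]

  val values: Seq[Tri] = Seq(Some(true), Some(false), None)

  def not(p: Tri): Tri = p.map(!_)

  def and(p: Tri, q: Tri): Tri = (p, q) match {
    case (Some(false), _) | (_, Some(false)) => Some(false)
    case (Some(true), Some(true))            => Some(true)
    case _                                   => None
  }

  def or(p: Tri, q: Tri): Tri = (p, q) match {
    case (Some(true), _) | (_, Some(true)) => Some(true)
    case (Some(false), Some(false))        => Some(false)
    case _                                 => None
  }

  // All (a, c) pairs for which `a OR (NOT a AND c)` and `a OR c` disagree,
  // with `a` drawn from the given candidates and `c` from all three values.
  def mismatches(candidatesForA: Seq[Tri]): Seq[(Tri, Tri)] =
    for {
      a <- candidatesForA
      c <- values
      if or(a, and(not(a), c)) != or(a, c)
    } yield (a, c)

  def main(args: Array[String]): Unit = {
    // Nullable a: the only disagreement is a = NULL, c = TRUE.
    println(mismatches(values))
    // Non-nullable a: no disagreement, so the guarded rewrite is safe.
    println(mismatches(values.filter(_.isDefined)))
  }
}
```

The single mismatch is the one named in the thread (`a` is null, `c` is true), where the original expression evaluates to NULL and the rewritten one to TRUE. Restricting `a` to non-null values removes it, which is the condition the `!a.nullable` guard expresses.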
| case a Or (b And c) if !a.nullable && Not(a).semanticEquals(c) => Or(a, b) | ||
| case (a And b) Or c if !a.nullable && a.semanticEquals(Not(c)) => Or(b, c) | ||
| case (a And b) Or c if !b.nullable && b.semanticEquals(Not(c)) => Or(a, c) | ||
| // Common factor elimination for conjunction | ||
| case and @ (left And right) => | ||
typo:
only when the operands are not nullable
fixed