@gatorsmile
Member

What changes were proposed in this pull request?

This PR fixes the null handling in the BooleanSimplification rule. Two cases in the rule do not handle null values properly: the simplification is incorrect if either side is null.
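
To make the failure concrete, here is a minimal sketch of SQL three-valued logic in plain Scala (not Spark code). `SqlBool`, `not3`, `and3`, and `or3` are hypothetical names introduced for illustration, with `None` standing in for NULL. It shows that `a AND NOT a` evaluates to NULL rather than FALSE when `a` is NULL, so folding it to `FalseLiteral` changes the result. The `Or` half is included on the assumption that the second affected case is the symmetric `a OR NOT a`.

```scala
object ThreeValuedLogic {
  // A SQL boolean modeled as Option[Boolean]; None stands for NULL.
  type SqlBool = Option[Boolean]

  // NOT NULL is NULL; otherwise negate the value.
  def not3(a: SqlBool): SqlBool = a.map(!_)

  // AND is FALSE if either side is FALSE, TRUE if both are TRUE, else NULL.
  def and3(a: SqlBool, b: SqlBool): SqlBool = (a, b) match {
    case (Some(false), _) | (_, Some(false)) => Some(false)
    case (Some(true), Some(true))            => Some(true)
    case _                                   => None
  }

  // OR is TRUE if either side is TRUE, FALSE if both are FALSE, else NULL.
  def or3(a: SqlBool, b: SqlBool): SqlBool = (a, b) match {
    case (Some(true), _) | (_, Some(true)) => Some(true)
    case (Some(false), Some(false))        => Some(false)
    case _                                 => None
  }

  // With a = NULL, neither expression collapses to a constant literal.
  val nullAndNotNull: SqlBool = and3(None, not3(None))
  val nullOrNotNull: SqlBool = or3(None, not3(None))
}
```

For non-null inputs the old simplification still agrees with direct evaluation; only the NULL input distinguishes the two.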

How was this patch tested?

Added test cases

@gatorsmile
Member Author

cc @cloud-fan @adrian-ionescu

case TrueLiteral Or _ => TrueLiteral
case _ Or TrueLiteral => TrueLiteral

case a And b if Not(a).semanticEquals(b) => FalseLiteral
Contributor

how about

case a And b if Not(a).semanticEquals(b) || a.semanticEquals(Not(b)) =>
  If(IsNull(a), null, FalseLiteral)

Member

+1

Contributor

+1

Member Author

How about

      case a And b if Not(a).semanticEquals(b) || a.semanticEquals(Not(b)) =>
        if (!a.nullable && !b.nullable) {
          FalseLiteral
        } else {
          If(IsNull(a), Literal.create(null, a.dataType), FalseLiteral)
        }

Member

@gengliangwang gengliangwang Sep 11, 2018

I think it should be

      case a And b if Not(a).semanticEquals(b) || a.semanticEquals(Not(b)) =>
        if (!a.nullable && !b.nullable) {
          FalseLiteral
        } else if (a.nullable) {
          If(IsNull(a), Literal.create(null, a.dataType), FalseLiteral)
        } else {
          If(IsNull(b), Literal.create(null, b.dataType), FalseLiteral)
        }

That said, the current code should work: whenever the condition Not(a).semanticEquals(b) || a.semanticEquals(Not(b)) holds, it is impossible for a to be non-nullable while b is nullable.
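
The reasoning above can be checked exhaustively with a small model in plain Scala. This is not Spark code: `SqlBool`, `guardedFalse`, and the other names are hypothetical, and `None` stands in for NULL. The sketch assumes `b` is exactly `NOT a`, so `b` is NULL precisely when `a` is, and confirms that guarding on either side gives the same answer as evaluating `a AND NOT a` directly.

```scala
object GuardedRewriteCheck {
  // A SQL boolean modeled as Option[Boolean]; None stands for NULL.
  type SqlBool = Option[Boolean]

  // Every value a nullable boolean expression can take.
  val domain: Seq[SqlBool] = Seq(Some(true), Some(false), None)

  def not3(a: SqlBool): SqlBool = a.map(!_)

  def and3(a: SqlBool, b: SqlBool): SqlBool = (a, b) match {
    case (Some(false), _) | (_, Some(false)) => Some(false)
    case (Some(true), Some(true))            => Some(true)
    case _                                   => None
  }

  // Models the proposed rewrite If(IsNull(x), null, FalseLiteral).
  def guardedFalse(x: SqlBool): SqlBool =
    if (x.isEmpty) None else Some(false)

  // The guarded rewrite agrees with direct evaluation on all three inputs.
  val rewriteMatches: Boolean =
    domain.forall(a => and3(a, not3(a)) == guardedFalse(a))

  // Guarding on b = NOT a instead of a yields the same result.
  val eitherSideWorks: Boolean =
    domain.forall(a => guardedFalse(a) == guardedFalse(not3(a)))
}
```

Because both checks range over the full three-value domain, they cover every input the model allows; they say nothing about how Spark computes `nullable` for real expressions.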

Member

@gengliangwang gengliangwang left a comment

Otherwise LGTM 👍

testRelation.output, Seq(Row(1, 2, 3, "abc"))
)

val testNotnullableRelation = LocalRelation('a.int.notNull, 'b.int.notNull, 'c.int.notNull,
Member

Nit: testNotnullableRelation => testNotNullableRelation

'd.string.notNull, 'e.boolean.notNull, 'f.boolean.notNull, 'g.boolean.notNull,
'h.boolean.notNull)

val testNotnullableRelationWithData = LocalRelation.fromExternalRows(
Member

testNotnullableRelationWithData => testNotNullableRelationWithData

@SparkQA

SparkQA commented Sep 11, 2018

Test build #95923 has finished for PR 22390 at commit cf863bb.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Sep 11, 2018

Test build #95955 has finished for PR 22390 at commit b05052e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Contributor

LGTM

Member

@gengliangwang gengliangwang left a comment

LGTM

Contributor

@adrian-ionescu adrian-ionescu left a comment

LGTM

@SparkQA

SparkQA commented Sep 12, 2018

Test build #95976 has finished for PR 22390 at commit 61b2d55.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gengliangwang
Member

retest this please.

@SparkQA

SparkQA commented Sep 12, 2018

Test build #95981 has finished for PR 22390 at commit 61b2d55.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

asfgit pushed a commit that referenced this pull request Sep 12, 2018
## What changes were proposed in this pull request?
This PR is to fix the null handling in BooleanSimplification. In the rule BooleanSimplification, there are two cases that do not properly handle null values. The optimization is not right if either side is null. This PR is to fix them.

## How was this patch tested?
Added test cases

Closes #22390 from gatorsmile/fixBooleanSimplification.

Authored-by: gatorsmile
Signed-off-by: Wenchen Fan
(cherry picked from commit 79cc597)
Signed-off-by: Wenchen Fan
@asfgit asfgit closed this in 79cc597 Sep 12, 2018
@cloud-fan
Contributor

thanks, merging to master/2.4/2.3!

@cloud-fan
Contributor

can you send a new PR for 2.2? thanks

gatorsmile added a commit to gatorsmile/spark that referenced this pull request Sep 12, 2018
fjh100456 pushed a commit to fjh100456/spark that referenced this pull request Sep 13, 2018