Skip to content

[SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax #25074

Closed
beliefer wants to merge 30 commits intoapache:masterfrom
beliefer:ansi-sql-boolean-test
Closed

[SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax #25074
beliefer wants to merge 30 commits intoapache:masterfrom
beliefer:ansi-sql-boolean-test

Conversation

@beliefer
Copy link
Contributor

@beliefer beliefer commented Jul 8, 2019

What changes were proposed in this pull request?

This PR aims to support ANSI SQL Boolean-Predicate syntax.

expression IS [NOT] TRUE
expression IS [NOT] FALSE
expression IS [NOT] UNKNOWN

There are some mainstream database support this syntax.

For example:

spark-sql> select null is true, null is not true;
false	true

spark-sql> select false is true, false is not true;
false	true

spark-sql> select true is true, true is not true;
true	false

spark-sql> select null is false, null is not false;
false	true

spark-sql> select false is false, false is not false;
true	false

spark-sql> select true is false,  true is not false;
false	true

spark-sql> select null is unknown, null is not unknown;
true	false

spark-sql> select false is unknown, false is not unknown;
false	true

spark-sql> select true is unknown, true is not unknown;
false	true

Note: A null input is treated as the logical value "unknown".

How was this patch tested?

Pass the Jenkins with the newly added test cases.

@SparkQA
Copy link

SparkQA commented Jul 8, 2019

Test build #107335 has finished for PR 25074 at commit b2fc258.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jul 10, 2019

Test build #107430 has finished for PR 25074 at commit fa148db.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@beliefer
Copy link
Contributor Author

Retest this please.

@beliefer
Copy link
Contributor Author

@ueshin Could you take a look at this PR? Thanks a lot!

@SparkQA
Copy link

SparkQA commented Jul 10, 2019

Test build #107449 has finished for PR 25074 at commit fa148db.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jul 11, 2019

Test build #107500 has finished for PR 25074 at commit c22f12d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member

Retest this please.

@SparkQA
Copy link

SparkQA commented Jul 16, 2019

Test build #107720 has finished for PR 25074 at commit c22f12d.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@beliefer
Copy link
Contributor Author

Retest this please.

@SparkQA
Copy link

SparkQA commented Jul 16, 2019

Test build #107733 has finished for PR 25074 at commit c22f12d.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-27924]Support ANSI SQL Boolean-Predicate syntax [SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax Jul 16, 2019
def isFalse: Predicate = IsFalse(expr)
def isNotFalse: Predicate = IsNotFalse(expr)
def isUnknown: Predicate = IsUnknown(expr)
def isNotUnknown: Predicate = IsNotUnknown(expr)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for adding a comment for the late review rounds, but shall we avoid adding DSL since the whole picture is to support PostgreSQL feature parity? How do you think about that, @maropu ? Can we remove these DSL addition?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, we should revert this and test("is true | false | unknown expressions").

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong opnion on this though, +1 for your opnion. I don't think all the syntaxes need to be listed up here (It is enough to list up a basic part of them).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dongjoon-hyun This is not only a PostgreSQL feature, but a ANSI SQL. I also want to confirm if I really want to delete this and test("is true | false | unknown expressions").

Copy link
Member

@dongjoon-hyun dongjoon-hyun Jul 30, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, @beliefer . You are confused with Scala DSL and SQL.
This is sql/catalyst/dsl which providing Scala API. This is never ANSI SQL.
SQL Standard doesn't have def isNotUnknown. :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, @beliefer . You are confused with Scala DSL and SQL.
This is sql/catalyst/dsl which providing Scala API. This is never ANSI SQL.
SQL Standard doesn't have def isNotUnknown. :)

Thanks for your prompt. I am confused indeed and clear now.

@dongjoon-hyun
Copy link
Member

dongjoon-hyun commented Jul 29, 2019

For the others, this PR looks fine, @beliefer .
I think we can merge this if we can remove the DSL part and its test part.

@dongjoon-hyun
Copy link
Member

Also, I slightly updated the PR description, @beliefer .

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please fix #25074 (comment) , @beliefer .

@beliefer
Copy link
Contributor Author

Also, I slightly updated the PR description, @beliefer .

Thanks very much.

@beliefer beliefer closed this Jul 30, 2019
@beliefer beliefer reopened this Jul 30, 2019
@beliefer
Copy link
Contributor Author

Please fix #25074 (comment) , @beliefer .

I have reverted that code.

@dongjoon-hyun
Copy link
Member

Thank you for updating again, @beliefer .

@SparkQA
Copy link

SparkQA commented Jul 30, 2019

Test build #108368 has finished for PR 25074 at commit 4794b67.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member

Retest this please.

@SparkQA
Copy link

SparkQA commented Jul 30, 2019

Test build #108373 has finished for PR 25074 at commit 4794b67.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jul 31, 2019

Test build #108435 has finished for PR 25074 at commit ae126e7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM. Thank you so much, @beliefer and @maropu !
Merged to master.

@beliefer
Copy link
Contributor Author

@dongjoon-hyun @maropu Thanks for all your continuous review and help.

@maropu
Copy link
Member

maropu commented Jul 31, 2019

thanks for your hard work!

case SqlBaseParser.NULL =>
IsNull(e)
case SqlBaseParser.TRUE => ctx.NOT match {
case null => IsTrue(e)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't it simply EqualNullSafe(e, Literal(true))? Why do we create a bunch of new expressions?

Copy link
Contributor Author

@beliefer beliefer Feb 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cloud-fan It looks more reasonable.
But there exists a different the input expression must be of a boolean type. The input of EqualNullSafe is any DataType.
Do I need to adjust? cc @dongjoon-hyun @maropu

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EqualNullSafe requires the left type to be the same as right type, so EqualNullSafe(e, Literal(true)) requires e to be boolean.

IsNull is a problem so I think that change makes sense.

Copy link
Member

@maropu maropu Feb 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I missed that approach and the @cloud-fan one sounds more reasonable. If @dongjoon-hyun is not against the idea, can you make a PR for that? @beliefer

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cloud-fan @maropu Let's wait @dongjoon-hyun , then I will make a PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for being late. +1 for that approach.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the check, @dongjoon-hyun ! Plz go ahead, @beliefer

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK.

cloud-fan pushed a commit that referenced this pull request Feb 27, 2020
### What changes were proposed in this pull request?
This PR follows #25074 and improves the implement.

### Why are the changes needed?
Improve code.

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
Exists UT

Closes #27699 from beliefer/improve-boolean-test.

Authored-by: beliefer <beliefer@163.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
cloud-fan pushed a commit that referenced this pull request Feb 27, 2020
### What changes were proposed in this pull request?
This PR follows #25074 and improves the implement.

### Why are the changes needed?
Improve code.

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
Exists UT

Closes #27699 from beliefer/improve-boolean-test.

Authored-by: beliefer <beliefer@163.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 1515d45)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
sjincho pushed a commit to sjincho/spark that referenced this pull request Apr 15, 2020
### What changes were proposed in this pull request?
This PR follows apache#25074 and improves the implement.

### Why are the changes needed?
Improve code.

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
Exists UT

Closes apache#27699 from beliefer/improve-boolean-test.

Authored-by: beliefer <beliefer@163.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants