Skip to content

Conversation

@cloud-fan
Copy link
Contributor

What changes were proposed in this pull request?

This PR improves the test to make sure all the SQL keywords are documented correctly. It fixes several issues:

  1. some keywords are not documented
  2. some keywords are not ANSI SQL keywords but documented as reserved/non-reserved.

Why are the changes needed?

To make sure the implementation matches the doc.

Does this PR introduce any user-facing change?

No

How was this patch tested?

new test

@cloud-fan
Copy link
Contributor Author

cc @maropu @dongjoon-hyun

Copy link
Member

@maropu maropu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, thanks for the updates! LGTM except for the minor comment.

|DISTINCT|reserved|non-reserved|reserved|
|DISTRIBUTE|non-reserved|non-reserved|non-reserved|
|DIV|non-reserved|non-reserved|non-reserved|
|DIV|non-reserved|non-reserved|not a keyword|
Copy link
Member

@maropu maropu Jul 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. We cannot remove this line: non-reserved|non-reserved|not a keyword? I think users don't need to care about the words.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But is DIV a keyword?

Copy link
Member

@dongjoon-hyun dongjoon-hyun Jul 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DIV is a non-reserved keyword in Apache Spark. I guess we need to have this here.

@SparkQA
Copy link

SparkQA commented Jul 9, 2020

Test build #125480 has finished for PR 29055 at commit f595689.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • trait SQLKeywordUtils extends SQLHelper
  • class SQLKeywordSuite extends SparkFunSuite with SQLKeywordUtils
  • class TableIdentifierParserSuite extends SparkFunSuite with SQLKeywordUtils

@dongjoon-hyun
Copy link
Member

Retest this please.

@dongjoon-hyun
Copy link
Member

Thank you for pinging me, @cloud-fan !

@SparkQA
Copy link

SparkQA commented Jul 9, 2020

Test build #125489 has started for PR 29055 at commit f595689.

@dongjoon-hyun
Copy link
Member

Retest this please

}
}
// DIV is an operator but is also a keyword
keywords + "DIV"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cloud-fan . Instead of adding exceptions here and line 136, can we switch the original lines in SqlBase.g4? I guess you can move DIV: 'DIV'; into somewhere before //--SPARK-KEYWORD-LIST-END. WDYT?

//--SPARK-KEYWORD-LIST-END
//============================
// End of the keywords list
//============================

EQ  : '=' | '==';
NSEQ: '<=>';
NEQ : '<>';
NEQJ: '!=';
LT  : '<';
LTE : '<=' | '!>';
GT  : '>';
GTE : '>=' | '!<';

PLUS: '+';
MINUS: '-';
ASTERISK: '*';
SLASH: '/';
PERCENT: '%';
DIV: 'DIV';

@maropu maropu changed the title [SPARK-32251][SQL][DOC] fix SQL keyword document [SPARK-32251][SQL][DOC][TEST] fix SQL keyword document Jul 9, 2020
@HyukjinKwon HyukjinKwon changed the title [SPARK-32251][SQL][DOC][TEST] fix SQL keyword document [SPARK-32251][SQL][DOCS][TESTS] Fix SQL keyword document Jul 10, 2020
@SparkQA
Copy link

SparkQA commented Jul 10, 2020

Test build #125514 has finished for PR 29055 at commit f595689.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • trait SQLKeywordUtils extends SQLHelper
  • class SQLKeywordSuite extends SparkFunSuite with SQLKeywordUtils
  • class TableIdentifierParserSuite extends SparkFunSuite with SQLKeywordUtils

@maropu
Copy link
Member

maropu commented Jul 10, 2020

retest this please

@SparkQA
Copy link

SparkQA commented Jul 10, 2020

Test build #125559 has finished for PR 29055 at commit 71e0c13.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jul 10, 2020

Test build #125557 has finished for PR 29055 at commit f595689.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • trait SQLKeywordUtils extends SQLHelper
  • class SQLKeywordSuite extends SparkFunSuite with SQLKeywordUtils
  • class TableIdentifierParserSuite extends SparkFunSuite with SQLKeywordUtils

Below is a list of all the keywords in Spark SQL.

|Keyword|Spark SQL<br/>ANSI Mode|Spark SQL<br/>Default Mode|SQL-2011|
|Keyword|Spark SQL<br/>ANSI Mode|Spark SQL<br/>Default Mode|SQL-2016|
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any consensus to decide which version of ANSI SQL standard that we actually try to comply with?

Or any protocol to define the compatibility between Spark versions and ANSI SQL standards?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the latest ANSI SQL standard is preferred.

Also to note that: ANSI compliant doesn't mean to follow everything in ANSI SQL. e.g. a keyword reserved in ANSI SQL can be non-reserved in Spark.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the history of this standard, the trend of it seems to add more and more keywords and change the non-reserved to reserved. I guess it's ok to follow the latest one here as literally we will add migration guide for every newly added keyword.

@dongjoon-hyun
Copy link
Member

Retest this please

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM. Thank you, @cloud-fan .

@dongjoon-hyun
Copy link
Member

dongjoon-hyun commented Jul 10, 2020

Merged to master/3.0 because all Scala/Java UT passed in the current run. PySpark and SparkR UT is irrelevant to this PR.
Thank you, @cloud-fan and all.

dongjoon-hyun pushed a commit that referenced this pull request Jul 10, 2020
### What changes were proposed in this pull request?

This PR improves the test to make sure all the SQL keywords are documented correctly. It fixes several issues:
1. some keywords are not documented
2. some keywords are not ANSI SQL keywords but documented as reserved/non-reserved.

### Why are the changes needed?

To make sure the implementation matches the doc.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

new test

Closes #29055 from cloud-fan/keyword.

Authored-by: Wenchen Fan <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 84db660)
Signed-off-by: Dongjoon Hyun <[email protected]>
@SparkQA
Copy link

SparkQA commented Jul 10, 2020

Test build #125622 has finished for PR 29055 at commit 71e0c13.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants