@maropu
Member

@maropu maropu commented Jun 10, 2020

What changes were proposed in this pull request?

This PR intends to extract SQL keywords (allCandidateKeywords in TableIdentifierParserSuite) from the generated parser class.

Why are the changes needed?

It is hard to maintain a full set of SQL keywords in TableIdentifierParserSuite, so it would be nice if we could update allCandidateKeywords automatically.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Existing tests.

"except",
"false",
"fetch",
"filter",
Member Author

While making this change, I noticed that we had forgotten to add this word to the list.

"with",
"year")
// All the SQL keywords defined in `SqlBase.g4`
val allCandidateKeywords = {
Member Author

Note: I've checked that (the existing allCandidateKeywords -- this new allCandidateKeywords) is empty.
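
The check described in this note can be sketched as a plain set difference. This is only an illustration: the object name and both keyword sets below are hypothetical stand-ins, not the actual contents of `TableIdentifierParserSuite`.

```scala
object KeywordListCheck {
  // Hypothetical stand-in for the hand-maintained list.
  val existingKeywords: Set[String] = Set("except", "false", "fetch")

  // Hypothetical stand-in for the automatically extracted list.
  val extractedKeywords: Set[String] = Set("except", "false", "fetch", "filter")

  // Keywords the old list has but the new extraction misses; this should be empty.
  val missingFromExtracted: Set[String] = existingKeywords -- extractedKeywords

  // Keywords the extraction found that the hand-maintained list lacked.
  val newlyFound: Set[String] = extractedKeywords -- existingKeywords
}
```

An empty `missingFromExtracted` means the extracted set covers everything in the old list, while `newlyFound` surfaces any words the hand-maintained list had missed.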

@maropu maropu changed the title from "[SPARK-31950][SQL] Extract SQL keywords from the generated parser class" to "[SPARK-31950][SQL][TESTS] Extract SQL keywords from the generated parser class" Jun 10, 2020
@maropu
Member Author

maropu commented Jun 10, 2020

How about this update? @cloud-fan @viirya @dilipbiswal

@SparkQA

SparkQA commented Jun 10, 2020

Test build #123740 has finished for PR 28779 at commit e4f417d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • // we can not extract the literals from the generated parser class (SqlBaseParser).

@cloud-fan
Contributor

I'm thinking about a text-based solution: we add a special line before the keyword section and another special line after it, then we just read `SqlBase.g4` and find the keyword section.

We can use it to get the ANSI reserved/non-reserved keywords as well.
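
A minimal sketch of the text-based idea, assuming hypothetical marker comments (`//--KEYWORD-LIST-START` and `//--KEYWORD-LIST-END`) and a simplified rule shape such as `SELECT: 'SELECT';`. The marker strings, the object name, and the sample grammar lines are illustrative assumptions, not taken from `SqlBase.g4` or from the eventual patch.

```scala
object KeywordExtractor {
  // Hypothetical marker lines; this thread does not show the real ones.
  val StartMarker = "//--KEYWORD-LIST-START"
  val EndMarker = "//--KEYWORD-LIST-END"

  // A lexer rule line such as `SELECT: 'SELECT';` (simplified assumption).
  private val RuleLine = """^[A-Z_]+\s*:.*;$""".r
  // A quoted literal inside a rule, e.g. 'SELECT'.
  private val Literal = """'([^']+)'""".r

  // Returns the lower-cased literals found between the two marker lines.
  def extractKeywords(grammarLines: Seq[String]): Set[String] = {
    grammarLines
      .map(_.trim)
      .dropWhile(_ != StartMarker) // skip everything before the start marker
      .drop(1)                     // skip the start marker itself
      .takeWhile(_ != EndMarker)   // stop at the end marker
      .filter(line => RuleLine.pattern.matcher(line).matches())
      .flatMap(line => Literal.findAllMatchIn(line).map(_.group(1).toLowerCase))
      .toSet
  }

  // Hypothetical grammar excerpt used only to exercise the function.
  val sampleGrammar: Seq[String] = Seq(
    "grammar Demo;",
    "//--KEYWORD-LIST-START",
    "SELECT: 'SELECT';",
    "TEMPORARY: 'TEMPORARY' | 'TEMP';",
    "//--KEYWORD-LIST-END",
    "EQ: '=' | '==';")
}
```

With the sample above, `KeywordExtractor.extractKeywords(KeywordExtractor.sampleGrammar)` yields `Set("select", "temporary", "temp")`; the `EQ` rule is ignored because it sits outside the marked section. A second pair of markers could delimit the ANSI reserved or non-reserved sections in the same way.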

@maropu
Member Author

maropu commented Jun 10, 2020

Ah, I see. I'll update based on the suggestion tomorrow.

@viirya
Member

viirya commented Jun 11, 2020

Text-based solution sounds good.

@maropu
Member Author

maropu commented Jun 11, 2020

#28802

@maropu maropu closed this Jun 11, 2020
dbtsai pushed a commit to dbtsai/spark that referenced this pull request Jun 12, 2020
### What changes were proposed in this pull request?

This PR intends to extract SQL reserved/non-reserved keywords from the ANTLR grammar file (`SqlBase.g4`) directly.

This approach is based on the cloud-fan suggestion: apache#28779 (comment)

### Why are the changes needed?

It is hard to maintain a full set of the keywords in `TableIdentifierParserSuite`, so it would be nice if we could extract them from the `SqlBase.g4` file directly.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing tests.

Closes apache#28802 from maropu/SPARK-31950-2.

Authored-by: Takeshi Yamamuro <[email protected]>
Signed-off-by: Takeshi Yamamuro <[email protected]>
maropu added a commit that referenced this pull request Jun 15, 2020
holdenk pushed a commit to holdenk/spark that referenced this pull request Jun 25, 2020

4 participants