[SPARK-12689][SQL] Migrate DDL parsing to the newly absorbed parser #10723
viirya wants to merge 20 commits into apache:master
Test build #49237 has finished for PR 10723 at commit
cc @hvanhovell. @viirya, can you rename the title to "migrate describe table parsing to ..."?
Because SQLContext still uses DDLParser, it looks like I can't simply remove
Test build #49317 has finished for PR 10723 at commit
@cloud-fan Can you also take a look? It is related to the work of adding DDL support for creating bucketed tables.
Why change this? You didn't touch the describe stuff in SparkSqlParser.g, right?
Yes. I think it was incorrect from the beginning, but it went untested because we never reached this code path before. I've tested it locally. Once all three commands are migrated, we should see the tests pass.
If we parse the following SQL using the parse driver, org.apache.spark.sql.catalyst.parser.ParseDriver.parsePlan("DESCRIBE EXTENDED tbl.a", null), we end up with the following AST:
TOK_DESCTABLE 1, 0, 6, 18
:- TOK_TABTYPE 1, 4, 6, 18
:  +- TOK_TABNAME 1, 4, 6, 18
:     :- tbl 1, 4, 4, 18
:     +- a 1, 6, 6, 22
+- EXTENDED 1, 2, 2, 9
This change would pick this up, and the old code didn't (I am sure I tested this, though :S). You can disable this in the DDL parser to see if it works now.
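For readers following the AST discussion, here is a minimal Scala sketch of how a tree of that shape could be matched to recover the table name parts and the EXTENDED flag. The `Node` and `Describe` case classes and the `parseDescribe` function are hypothetical stand-ins written for this illustration; they are not Spark's ASTNode class or the code changed in this PR.

```scala
object DescribeAstSketch {
  // Hypothetical stand-in for a parser AST node; not Spark's ASTNode class.
  final case class Node(text: String, children: List[Node] = Nil)

  // What we want to recover from a DESCRIBE statement.
  final case class Describe(tableParts: List[String], isExtended: Boolean)

  def parseDescribe(root: Node): Option[Describe] = root match {
    // The first child carries the table name; EXTENDED, if present, is a later sibling.
    case Node("TOK_DESCTABLE", Node("TOK_TABTYPE", Node("TOK_TABNAME", parts) :: _) :: rest) =>
      Some(Describe(parts.map(_.text), rest.exists(_.text.equalsIgnoreCase("EXTENDED"))))
    case _ => None
  }

  // The AST quoted above, rebuilt by hand (position info omitted).
  val ast: Node = Node("TOK_DESCTABLE", List(
    Node("TOK_TABTYPE", List(
      Node("TOK_TABNAME", List(Node("tbl"), Node("a"))))),
    Node("EXTENDED")))

  def main(args: Array[String]): Unit =
    println(parseDescribe(ast))
}
```

The point of the sketch is only that EXTENDED arrives as a sibling of TOK_TABTYPE, so a matcher that looks at the first child alone would miss it.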
Could we add a test for this? The Hive test suite apparently misses this one. I could also address this in a different PR.
Actually, we have a test for the describe table command in HiveQuerySuite. Do we need another test?
Conflicts: sql/catalyst/src/main/antlr3/org/apache/spark/sql/catalyst/parser/SparkSqlLexer.g
Test build #49585 has finished for PR 10723 at commit
How about gradually moving functionality from the DDL parser to SparkQl? That would allow us to test this in the meantime.
DDLParser is still used in SQLContext. Do we want to remove it completely? Since I have already migrated the three commands, I think we can test them all together.
Why not use unquoteString? It does the same thing and is easier to read.
I didn't know there was an unquoteString. Thanks.
Test build #49591 has finished for PR 10723 at commit
Conflicts: sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DDLParser.scala
We are allowed to use From and To in the CreateTableUsing command's options (actually, it seems we can use any string as the option key). But we can't simply add them to nonReserved, because doing so would break other existing rules. So we create looseIdentifier and looseNonReserved here.
Why not add this to the option rule directly?
Because I don't know if we will add other reserved words later. If so, the option rule might become too long. I haven't counted how many keywords are not included in nonReserved.
Either approach (the current one or adding it to the option rule) is okay with me.
Could you add your initial line comment as a comment in the code?
Thanks for the reminder. I've added it.
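The looseIdentifier trade-off in this thread can be illustrated outside the ANTLR grammar. The sketch below is a hedged Scala model, not the grammar rule itself: the keyword sets are made up for illustration (the real nonReserved list is much longer), and `isIdentifier` and `isLooseIdentifier` are hypothetical names. It only shows why a separate, looser rule lets From and To serve as option keys without widening the regular identifier rule.

```scala
object LooseIdentifierSketch {
  // Illustrative keyword sets only; membership is an assumption, not the grammar's real lists.
  val reserved: Set[String] = Set("SELECT", "FROM", "TO", "WHERE", "TEMPORARY")
  val nonReserved: Set[String] = Set("TEMPORARY")
  val looseNonReserved: Set[String] = nonReserved ++ Set("FROM", "TO")

  private def isPlainName(s: String): Boolean =
    s.nonEmpty && (s.head.isLetter || s.head == '_') &&
      s.forall(c => c.isLetterOrDigit || c == '_')

  // Regular identifier rule: keywords are rejected unless listed in nonReserved.
  def isIdentifier(token: String): Boolean = {
    val upper = token.toUpperCase
    isPlainName(token) && (!reserved.contains(upper) || nonReserved.contains(upper))
  }

  // Loose rule, meant only for option keys: additionally admits FROM and TO.
  def isLooseIdentifier(token: String): Boolean =
    isIdentifier(token) || looseNonReserved.contains(token.toUpperCase)
}
```

With these assumed sets, `from` is rejected as a regular identifier but accepted as an option key, while `select` is rejected by both rules.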
Test build #49673 has finished for PR 10723 at commit
Test build #49669 has finished for PR 10723 at commit
Test build #49675 has finished for PR 10723 at commit
@viirya I have done another round. Most things are minor, but I would like to know why you want to change the treatment of quoted identifiers.
…Parser commands to new Parser
This PR moves all the functionality provided by the SparkSQLParser/ExtendedHiveQlParser to the new Parser hierarchy (SparkQl/HiveQl). This also improves the current SET command parsing: the current implementation swallows `set role ...` and `set autocommit ...` commands, whereas this PR respects these commands (and passes them on to Hive).
This PR and #10723 end the use of Parser-Combinator parsers for SQL parsing. As a result we can also remove the `AbstractSQLParser` in Catalyst.
The PR is marked WIP as long as it doesn't pass all tests.
cc rxin viirya winningsix (this touches #10144)
Author: Herman van Hovell <hvanhovell@questtec.nl>
Closes #10905 from hvanhovell/SPARK-12866.
Conflicts: sql/catalyst/src/main/antlr3/org/apache/spark/sql/catalyst/parser/SparkSqlParser.g sql/core/src/main/scala/org/apache/spark/sql/execution/SparkQl.scala sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala
@hvanhovell Thanks for reviewing this. I've updated the PR to address your comments. Please see if it looks good to you.
Test build #50258 has finished for PR 10723 at commit
Test build #50276 has finished for PR 10723 at commit
LGTM
Conflicts: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkQl.scala
Test build #50355 has finished for PR 10723 at commit
retest this please.
Test build #50366 has finished for PR 10723 at commit
It's weird.
retest this please.
It is, can't make sense of this either. Are tests passing locally?
Yeah, I think so. And I haven't updated the code since the last successful test.
Let's see what another round of tests shows.
Many unrelated failures, like not being able to find the Hive jar file.
Test build #50377 has finished for PR 10723 at commit
ping @rxin
@viirya I am gonna trigger another test to make sure things keep working.
retest this please
@hvanhovell ok, thanks.
Test build #50382 has finished for PR 10723 at commit
cc @rxin
Thanks - merging this in master.
JIRA: https://issues.apache.org/jira/browse/SPARK-12689
DDLParser processes three commands: createTable, describeTable and refreshTable.
This patch migrates the three commands to the newly absorbed parser.
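To make the three commands concrete, the following Scala sketch classifies statements by their leading keywords. It is a rough illustration under stated assumptions, not how DDLParser or the new parser works: `DdlCommandSketch`, `DdlCommand`, and `classify` are hypothetical names, and keyword sniffing is no substitute for a real grammar.

```scala
object DdlCommandSketch {
  // The three command kinds named in the description above.
  sealed trait DdlCommand
  case object CreateTable extends DdlCommand
  case object DescribeTable extends DdlCommand
  case object RefreshTable extends DdlCommand

  // Hypothetical classifier: looks only at whitespace-separated keywords.
  def classify(sql: String): Option[DdlCommand] = {
    val words = sql.trim.toUpperCase.split("\\s+").toList
    words match {
      // Assumes the data source form CREATE ... TABLE ... USING ...
      case "CREATE" :: rest if rest.contains("USING") => Some(CreateTable)
      case "DESCRIBE" :: _                            => Some(DescribeTable)
      case "REFRESH" :: "TABLE" :: _                  => Some(RefreshTable)
      case _                                          => None
    }
  }
}
```

For example, `DdlCommandSketch.classify("DESCRIBE EXTENDED tbl.a")` yields `Some(DescribeTable)`, while an ordinary query such as `SELECT 1` yields `None` and would be left to the regular SQL parser.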