Conversation

@wangyum
Member

@wangyum wangyum commented Jul 11, 2019

What changes were proposed in this pull request?

How to reproduce this issue:

spark-sql> set spark.sql.parser.ansi.enabled=true;
spark.sql.parser.ansi.enabled	true
spark-sql> select 1 as false;
Error in query:
no viable alternative at input 'false'(line 1, pos 12)

== SQL ==
select 1 as false
------------^^^

spark-sql> select 1 as minus;
Error in query:
no viable alternative at input 'minus'(line 1, pos 12)

== SQL ==
select 1 as minus
------------^^^

Other databases' behaviour:
PostgreSQL:

postgres=# select 1 as false, 1 as minus;
 false | minus
-------+-------
     1 |     1
(1 row)

Vertica:

dbadmin=> select 1 as false, 1 as minus;
 false | minus
-------+-------
     1 |     1
(1 row)

SQL Server:

1> select 1 as false, 1 as minus
2> go
false       minus
----------- -----------
          1           1

DB2:
[screenshot of DB2 output]

Oracle:

SQL> select 1 as false, 1 as minus from dual;
select 1 as false, 1 as minus from dual
                        *
ERROR at line 1:
ORA-00923: FROM keyword not found where expected

Teradata:
[screenshot of Teradata output]

This PR adds FALSE and SETMINUS to ansiNonReserved to fix this issue.
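As a side note, the error only affects unquoted identifiers; a backtick-quoted alias should still parse through the quoted-identifier rule even in ANSI mode. A sketch of that workaround (output assumed, not taken from an actual run):

spark-sql> set spark.sql.parser.ansi.enabled=true;
spark.sql.parser.ansi.enabled	true
spark-sql> select 1 as `false`, 1 as `minus`;
1	1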

How was this patch tested?

unit tests and manual tests:

spark-sql> select 1 as false, 1 as minus;
1	1
spark-sql>

@SparkQA

SparkQA commented Jul 11, 2019

Test build #107530 has finished for PR 25114 at commit 3efbda6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wangyum
Member Author

wangyum commented Jul 11, 2019

cc @maropu @dongjoon-hyun

@maropu
Member

maropu commented Jul 11, 2019

Is this correct? I haven't checked carefully yet, but I think the 'reserved' meaning is essentially different between Spark and the other systems. For example, the keywords below are reserved in PostgreSQL, but they can still be used as alias names:

postgres=# select 1 as select;
 select 
--------
      1
(1 row)

postgres=# select 1 as from;
 from 
------
    1
(1 row)

@gatorsmile
Member

First, FALSE is a reserved word.

Second, see https://www.postgresql.org/docs/7.3/sql-keywords-appendix.html

In the PostgreSQL parser life is a bit more complicated. There are several different classes of tokens ranging from those that can never be used as an identifier to those that have absolutely no special status in the parser as compared to an ordinary identifier. (The latter is usually the case for functions specified by SQL.) Even reserved key words are not completely reserved in PostgreSQL, but can be used as column labels (for example, SELECT 55 AS CHECK, even though CHECK is a reserved key word).

We do not need to follow PostgreSQL to support reserved words in column aliases.
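For reference, a quick PostgreSQL session illustrating the column-label case the docs describe might look like this (a sketch; output assumed):

postgres=# select 55 as check;
 check
-------
    55
(1 row)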

@wangyum
Member Author

wangyum commented Jul 13, 2019

Thank you @gatorsmile

@wangyum wangyum closed this Jul 13, 2019
@wangyum wangyum deleted the SPARK-28349 branch July 13, 2019 00:44
@wangyum
Member Author

wangyum commented Jul 17, 2019

spark-sql> set spark.sql.parser.ansi.enabled=true;
spark.sql.parser.ansi.enabled	true
spark-sql> select extract(year from timestamp '2001-02-16 20:38:40')  ;
Error in query:
no viable alternative at input 'year'(line 1, pos 15)

== SQL ==
select extract(year from timestamp '2001-02-16 20:38:40')
---------------^^^

spark-sql> set spark.sql.parser.ansi.enabled=false;
spark.sql.parser.ansi.enabled	false
spark-sql> select extract(year from timestamp '2001-02-16 20:38:40')  ;
2001

Do we need to add YEAR, MONTH, DAY, HOUR, MINUTE and SECOND to ansiNonReserved? These are all non-reserved in PostgreSQL and Oracle.

https://github.com/antlr/grammars-v4/blob/b7f5b6d8c4da4d45d3dcb8146df310b89cd58114/plsql/PlSqlParser.g4#L4959-L6703
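For reference, these words can be used as column labels in PostgreSQL; a sketch of what that looks like (output assumed):

postgres=# select 1 as year, 1 as month, 1 as day;
 year | month | day
------+-------+-----
    1 |     1 |   1
(1 row)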

@maropu
Member

maropu commented Jul 19, 2019

I think we don't need to do so because we basically follow SQL:2011 now.

@wangyum
Member Author

wangyum commented Jul 19, 2019

Or should we add a regular_id rule? Some functions are also not available in ANSI mode:

spark-sql> select left('12345', 2);
12
spark-sql> set spark.sql.parser.ansi.enabled=true;
spark.sql.parser.ansi.enabled	true
spark-sql> select left('12345', 2);
Error in query:
no viable alternative at input 'left'(line 1, pos 7)

== SQL ==
select left('12345', 2)
-------^^^

@maropu
Member

maropu commented Jul 19, 2019

Rather, wouldn't it be better to list these functions and rename them if necessary? IMO, making these keywords non-reserved worsens parser error messages...
