[SPARK-30724][SQL] Support 'LIKE ANY' and 'LIKE ALL' operators #27477
wangyum wants to merge 10 commits into apache:master from wangyum:SPARK-30724
| NOT? kind=IN '(' expression (',' expression)* ')'
| NOT? kind=IN '(' query ')'
| NOT? kind=RLIKE pattern=valueExpression
| NOT? kind=LIKE quantifier=(ANY | ALL) '(' expression (',' expression)* ')'
So, we don't need to support ESCAPE. Did I understand correctly?
}.getOrElse('\\')
invertIfNotDefined(Like(e, expression(ctx.pattern), Literal(escapeChar)))
Option(ctx.quantifier).map(_.getType) match {
  case Some(SqlBaseParser.ANY) if !ctx.expression.isEmpty =>
Do we need this ctx.expression.isEmpty check? It seems that the parser rule guarantees at least one expression.
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
def getLikeQuantifierExps(expressions: java.util.List[ExpressionContext]): Seq[Expression] = {
  if (expressions.isEmpty) {
    throw new ParseException("Syntax error: expected something between '(' and ')'.", ctx)
I think we should remove "Syntax error: " from the message, because ParseException already conveys it.
| NOT? kind=IN '(' expression (',' expression)* ')'
| NOT? kind=IN '(' query ')'
| NOT? kind=RLIKE pattern=valueExpression
| NOT? kind=LIKE quantifier=(ANY | ALL) ('('')' | '(' expression (',' expression)* ')')
What happened previously when we didn't have '('')' | here? I guess it was also a ParseException.
Without it, the query throws an AnalysisException:
-- !query
select company from like_any_table where company like any ()
-- !query schema
struct<>
-- !query output
org.apache.spark.sql.AnalysisException
Invalid number of arguments for function any. Expected: 1; Found: 0; line 1 pos 54
Oh, it's treated as a function. I got it.
# Conflicts: # sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
cc @cloud-fan

This would be good to have since both Teradata and Snowflake support it.

retest this please

Looks fine to me
| NOT? kind=IN '(' expression (',' expression)* ')'
| NOT? kind=IN '(' query ')'
| NOT? kind=RLIKE pattern=valueExpression
| NOT? kind=LIKE quantifier=(ANY | ALL) ('('')' | '(' expression (',' expression)* ')')
Shall we support SOME as well? The BoolOr aggregate function can be called as both any and some.
val escapeChar = Option(ctx.escapeChar).map(string).map { str =>
  if (str.length != 1) {
    throw new ParseException("Invalid escape string." +
      "Escape string must contains only one character.", ctx)
Nit: contains -> contain ?
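For reference, the escape-character rule in the snippet above can be sketched in isolation. This is only an illustration: `EscapeCharSketch` and `escapeCharOf` are hypothetical names, and `IllegalArgumentException` stands in for Spark's `ParseException` so the example runs without Spark on the classpath. The message uses the suggested "contain" and puts a space between the two sentences, which the concatenation in the diff lacks.

```scala
object EscapeCharSketch {
  /** Returns the escape character for a LIKE pattern, defaulting to backslash. */
  def escapeCharOf(escape: Option[String]): Char =
    escape.map { str =>
      // Same rule as in the diff: the ESCAPE string must be exactly one character.
      if (str.length != 1) {
        // Stand-in for ParseException (an assumption made to keep this self-contained).
        throw new IllegalArgumentException(
          "Invalid escape string. Escape string must contain only one character.")
      }
      str.charAt(0)
    }.getOrElse('\\')
}
```

With this sketch, `EscapeCharSketch.escapeCharOf(None)` falls back to the backslash, while an empty or multi-character string is rejected.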
case Some(SqlBaseParser.ANY) =>
  getLikeQuantifierExps(ctx.expression).reduceLeft(Or)
case Some(SqlBaseParser.ALL) =>
  getLikeQuantifierExps(ctx.expression).reduceLeft(And)
Nit: getLikeQuantifierExps -> getLikeQuantifierExprs ?
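As a rough illustration of the rewrite discussed in this thread, the sketch below expands a quantified LIKE into a chain of Or/And nodes. `Expr`, `Col`, `Like`, `Not`, `And`, `Or`, `AnyQ`, `AllQ`, and `desugar` are hypothetical stand-ins defined only for this example; they are not Catalyst's classes or the actual AstBuilder code. Applying NOT to each pattern follows the `a not like all` expectation in the PR's parser test; doing the same for ANY is an assumption.

```scala
object LikeQuantifierSketch {
  // Hypothetical stand-ins for Catalyst expressions; not Spark's real classes.
  sealed trait Expr
  final case class Col(name: String) extends Expr
  final case class Like(child: Expr, pattern: String) extends Expr
  final case class Not(child: Expr) extends Expr
  final case class And(left: Expr, right: Expr) extends Expr
  final case class Or(left: Expr, right: Expr) extends Expr

  sealed trait Quantifier
  case object AnyQ extends Quantifier
  case object AllQ extends Quantifier

  /** Expands `child [NOT] LIKE ANY|ALL (patterns...)` into binary predicates. */
  def desugar(child: Expr, negated: Boolean, q: Quantifier, patterns: Seq[String]): Expr = {
    // An empty list leaves nothing to reduce over, so reject it up front.
    require(patterns.nonEmpty, "Expected something between '(' and ')'.")
    // Build one LIKE per pattern, negating each one individually when NOT is present.
    val preds: Seq[Expr] = patterns.map { p =>
      val like = Like(child, p)
      if (negated) Not(like) else like
    }
    q match {
      case AnyQ => preds.reduceLeft[Expr](Or(_, _))  // ANY: at least one predicate holds
      case AllQ => preds.reduceLeft[Expr](And(_, _)) // ALL: every predicate holds
    }
  }
}
```

Because `reduceLeft` throws on an empty collection, the explicit emptiness check is what turns `LIKE ANY ()` into a readable error instead of an internal failure.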
assertEqual("not (a like any ('foo%', 'bar%'))", !(('a like "foo%") || ('a like "bar%")))
assertEqual("a like all ('foo%', 'bar%')", ('a like "foo%") && ('a like "bar%"))
assertEqual("a not like all ('foo%', 'bar%')", !('a like "foo%") && !('a like "bar%"))
assertEqual("not (a like all ('foo%', 'bar%'))", !(('a like "foo%") && ('a like "bar%")))
Could you add two more tests for error handling for L1396 and L1422 in AstBuilder.scala?
-- Automatically generated by SQLQueryTestSuite
-- Number of queries: 14
Note: I've checked that the output matches PostgreSQL's output: https://gist.github.com/maropu/fa4bd6491e21751d6bbc44c545390b0c
Looks fine except for the existing comments.

retest this please

@wangyum By the way, we need to update the SQL documentation for this new syntax. A follow-up PR is fine, though. cc: @huaxingao

Thanks, all! Merged to master.

@maropu Since this is for 3.1, I will not include this new syntax in the 3.0 SQL reference.

Yeah, we need to update it only in master.
SELECT company FROM like_all_table WHERE company NOT LIKE ALL (NULL, NULL);

-- negative case
SELECT company FROM like_any_table WHERE company LIKE ALL ();
Is using a non-existent table intentional? I guess the purpose was to check LIKE ALL ().


What changes were proposed in this pull request?
LIKE ANY/SOME and LIKE ALL operators are mostly used when matching a text field against a number of patterns. For example:
Teradata / Hive 3.0 / Snowflake:
PostgreSQL:
This PR adds support for these two operators.
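For readers unfamiliar with the operators, here is a minimal sketch of the intended semantics on plain strings. It is an illustration, not Spark's implementation: `LikeSemanticsSketch`, `likeToRegex`, `like`, `likeAny`, and `likeAll` are hypothetical names, and the sketch ignores NULL handling and the ESCAPE clause. It does not reject an empty pattern list either, whereas the parser in this PR does.

```scala
import java.util.regex.Pattern

object LikeSemanticsSketch {
  /** Translates a LIKE pattern to a regex: % matches any run of characters, _ matches one. */
  def likeToRegex(pattern: String): String =
    pattern.toList.map {
      case '%' => ".*"
      case '_' => "."
      case c   => Pattern.quote(c.toString) // every other character matches literally
    }.mkString

  /** Plain LIKE: the whole value must match the pattern. */
  def like(value: String, pattern: String): Boolean =
    Pattern.compile(likeToRegex(pattern), Pattern.DOTALL).matcher(value).matches()

  /** LIKE ANY: true when at least one pattern matches. */
  def likeAny(value: String, patterns: Seq[String]): Boolean =
    patterns.exists(p => like(value, p))

  /** LIKE ALL: true when every pattern matches. */
  def likeAll(value: String, patterns: Seq[String]): Boolean =
    patterns.forall(p => like(value, p))
}
```

Under these assumptions, `likeAny("foo", Seq("%foo%", "%bar%"))` is true because the first pattern matches, while `likeAll("foo", Seq("%foo%", "%bar%"))` is false because the second does not.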
More details:
https://docs.teradata.com/reader/756LNiPSFdY~4JcCCcR5Cw/4~AyrPNmDN0Xk4SALLo6aQ
https://issues.apache.org/jira/browse/HIVE-15229
https://docs.snowflake.net/manuals/sql-reference/functions/like_any.html
Why are the changes needed?
To make it easier to migrate SQL queries to Spark SQL.
Does this PR introduce any user-facing change?
No
How was this patch tested?
Unit test.