Add all_match, any_match and none_match functions for arrays #1045
lxynov wants to merge 3 commits into trinodb:master
Empty array should make
We have two other options to consider
I would assume we want this so we can write |
e02e12b to e130e6d
The PR has been updated. Now the behaviors are:
electrum left a comment
This is looking good. Please add documentation in presto-docs/src/main/sphinx/functions/array.rst as part of the same commit.
650b0fe to 22e6f16
@electrum Thanks for the review. I've updated the PR. Please note that I also changed the implementation because the previous one doesn't take into account the case when these functions return null.
I expect null handling for any_match(ARRAY[a, b], x -> f(x)) to behave the same as f(a) OR f(b). So for example, any_match(ARRAY[NULL, true], x -> x) should be true. Analogously, all_match(ARRAY[NULL, false], x -> x) ought to be false.
In addition to consistency with non-array behavior, this has the benefit of regaining the ability to exit loops early.
The SQL standard defines the set functions EVERY, ANY, and SOME. @electrum do you know if there's a standardized correspondence of set functions and functions on array?
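The expectation above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the PR's implementation: the class ArrayMatchSketch and its two methods are hypothetical names, and a null Boolean stands in for SQL NULL.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper, not code from the PR. A null Boolean models SQL NULL.
final class ArrayMatchSketch {
    private ArrayMatchSketch() {}

    // Mirrors f(a) OR f(b) OR ...: any true wins, otherwise NULL if one was seen.
    static <T> Boolean anyMatch(List<T> array, Function<T, Boolean> f) {
        boolean sawNull = false;
        for (T element : array) {
            Boolean result = f.apply(element);
            if (result == null) {
                sawNull = true;
            } else if (result) {
                return Boolean.TRUE; // early exit: the answer cannot change
            }
        }
        return sawNull ? null : Boolean.FALSE;
    }

    // Mirrors f(a) AND f(b) AND ...: any false wins, otherwise NULL if one was seen.
    static <T> Boolean allMatch(List<T> array, Function<T, Boolean> f) {
        boolean sawNull = false;
        for (T element : array) {
            Boolean result = f.apply(element);
            if (result == null) {
                sawNull = true;
            } else if (!result) {
                return Boolean.FALSE; // early exit: the answer cannot change
            }
        }
        return sawNull ? null : Boolean.TRUE;
    }

    public static void main(String[] args) {
        List<Boolean> nullAndTrue = Arrays.asList(null, true);
        List<Boolean> nullAndFalse = Arrays.asList(null, false);
        System.out.println(anyMatch(nullAndTrue, x -> x));  // true
        System.out.println(allMatch(nullAndFalse, x -> x)); // false
        System.out.println(anyMatch(nullAndFalse, x -> x)); // null
    }
}
```

The early return is what restores the short-circuit: once a true (for any_match) or a false (for all_match) is seen, a later NULL cannot change the result.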
@wagnermarkd this indeed is a reasonable expectation.
There are not many functions defined for arrays in the SQL standard, but there are ways to convert from array to table and vice versa. These functions should mimic the semantics of operations over tables as much as possible. EVERY, ANY and SOME are quantifiers for comparison operations, so they are not quite the same. The closest to that would be an expression like: true = any (select f(x) from unnest(a) as t(x))
@wagnermarkd Thanks for the review. I think this makes sense from the user's perspective. And if the lambda returns |
That should return the same as |
@wagnermarkd This is quite interesting. To summarize here, in Presto:
I'm not sure whether these are standard SQL or Presto-specific behaviors, but I will make the match functions consistent with them.
Yes, that’s standard SQL behavior. In boolean expressions, NULL behaves like the UNKNOWN value. You can easily reason about those outcomes if you take that into account. |
We even have documentation for that :) https://prestosql.io/docs/current/functions/logical.html |
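For readers who want to check the outcomes by hand, the NULL-as-UNKNOWN rule for boolean expressions can be written out as a small sketch. The class ThreeValuedLogic and its method names are hypothetical and only illustrate the rule; a null Boolean models UNKNOWN.

```java
// Hypothetical illustration of three-valued logic; null models SQL NULL (UNKNOWN).
final class ThreeValuedLogic {
    private ThreeValuedLogic() {}

    // OR: true dominates, then UNKNOWN, then false.
    static Boolean or(Boolean a, Boolean b) {
        if (Boolean.TRUE.equals(a) || Boolean.TRUE.equals(b)) {
            return Boolean.TRUE;
        }
        if (a == null || b == null) {
            return null;
        }
        return Boolean.FALSE;
    }

    // AND: false dominates, then UNKNOWN, then true.
    static Boolean and(Boolean a, Boolean b) {
        if (Boolean.FALSE.equals(a) || Boolean.FALSE.equals(b)) {
            return Boolean.FALSE;
        }
        if (a == null || b == null) {
            return null;
        }
        return Boolean.TRUE;
    }

    // NOT: UNKNOWN stays UNKNOWN.
    static Boolean not(Boolean a) {
        if (a == null) {
            return null;
        }
        return Boolean.valueOf(!a);
    }

    public static void main(String[] args) {
        System.out.println(or(null, true));   // true
        System.out.println(or(null, false));  // null
        System.out.println(and(null, false)); // false
        System.out.println(and(null, true));  // null
        System.out.println(not(null));        // null
    }
}
```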
22e6f16 to b647ec1
@wagnermarkd @findepi @dain @electrum @martint |
b647ec1 to 5b4761c
martint left a comment
One minor comment, but otherwise it looks great.
Instead of duplicating these interfaces across the multiple functions, pull them out and reuse them. There's nothing that ties them to each of the functions -- all that matters is that they are functional interfaces for XXX -> YYY, for some XXX and YYY types.
Also, I'd rename them to SliceToBooleanFunction, LongToBooleanFunction, DoubleToBooleanFunction and BooleanToBooleanFunction (the "lambda" is the syntactic construct of what gets passed in, but the methods that use them take a "function")
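A rough sketch of what the extracted interfaces could look like follows. Only the interface names come from the comment above; the method name apply, the boxed parameter types, and the SharedInterfacesSketch class are assumptions made for illustration. SliceToBooleanFunction is omitted so the example compiles without the Slice library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Assumed shapes: one functional interface per element type, shared by all callers.
@FunctionalInterface
interface LongToBooleanFunction {
    Boolean apply(Long value);
}

@FunctionalInterface
interface DoubleToBooleanFunction {
    Boolean apply(Double value);
}

@FunctionalInterface
interface BooleanToBooleanFunction {
    Boolean apply(Boolean value);
}

// Hypothetical callers showing that a filter and a match can share one interface.
final class SharedInterfacesSketch {
    private SharedInterfacesSketch() {}

    // Keeps only the elements for which the function returns true.
    static List<Long> filter(List<Long> values, LongToBooleanFunction function) {
        List<Long> kept = new ArrayList<>();
        for (Long value : values) {
            if (Boolean.TRUE.equals(function.apply(value))) {
                kept.add(value);
            }
        }
        return kept;
    }

    // Reports whether the function returns true for at least one element.
    static boolean anyTrue(List<Long> values, LongToBooleanFunction function) {
        for (Long value : values) {
            if (Boolean.TRUE.equals(function.apply(value))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Long> values = Arrays.asList(1L, null, 3L);
        LongToBooleanFunction greaterThanOne = x -> x != null && x > 1;
        System.out.println(filter(values, greaterThanOne));  // [3]
        System.out.println(anyTrue(values, greaterThanOne)); // true
    }
}
```

Because nothing in these interfaces refers to filtering or matching, the same type serves both callers, which is the point of the suggestion.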
@martint
I'd also pull out the interfaces in ArrayFilterFunction because they are the same as those in ArrayAllMatchFunction and ArrayAnyMatchFunction.
Two questions:
- Should I put these interfaces in package io.prestosql.operator.scalar?
- Should I separate the changes into two commits? One for extracting and renaming the interfaces from ArrayFilterFunction, the other for adding the match functions.
5b4761c to 7bc5b2b
Merged, thanks!
Looks like this was merged.
Cherry-picked from trinodb/trino#1045 Co-authored-by: Xingyuan Lin <xinlin@linkedin.com>
#1036
Test done:
- Added TestArrayMatchFunctions.
- mvn clean install runs successfully in presto-main directory.
Notes:
- NULL elements in an array are considered as "not matching". E.g., select all_match(array[1,null], x -> x = 1) returns false.
- false in all_match and any_match, but a true in none_match.
- ArrayAllMatchFunction.AllMatchBlockLambda, ArrayAnyMatchFunction.AnyMatchBlockLambda and ArrayFilterFunction.FilterBlockLambda are identical, and they can be replaced by a new PredicateLambdaBlock interface. I think this can be done in a separate commit if it's necessary.