Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(snql) Support higher order functions in SnQL #2333

Merged
merged 6 commits into from
Jan 12, 2022

Conversation

evanh
Copy link
Member

@evanh evanh commented Jan 8, 2022

SnQL now supports using lambdas in its clauses. Lambdas are defined as a tuple
of identifiers, followed by an arrow, followed by a function that may or may not
use those identifiers.

(`x`, `y`) -> foo(`x`, `y`)

An identifier in SnQL is a set of characters enclosed in
backticks. The distinction to use backticks was to make parsing and validating
easier, since the parser doesn't have to try and determine if a string is a
column or an identifier.

This feature set is slightly more limited than what is available in Clickhouse.
The major difference is that a lambda can only be used as a parameter in a
function, and the transformation of a lambda must always be a function itself.

-- Not valid since lambdas must be inside functions
SELECT (`x`) -> foo(`x`) ... 

-- Not valid since lambdas must have functions
SELECT arrayMap((`x`) -> `x`, ...

This change makes using higher order functions such as arrayMap possible, since
those functions require a lambda to be defined to use them.

SnQL now supports using lambdas in its clauses. Lambdas are defined as a tuple
of identifiers, followed by an arrow, followed by a function that may or may not
use those identifiers. An identifier in SnQL is a set of characters enclosed in
backticks. The distinction to use backticks was to make parsing and validating
easier, since the parser doesn't have to try and determine if a string is a
column or an identifier.

This feature set is slightly more limited than what is available in Clickhouse.
The major difference is that a lambda can only be used as a parameter in a
function, and the transformation of a lambda must always be a function itself.
So (`x`) -> `x` is not a valid lambda.

This change makes using higher order functions such as arrayMap possible, since
those functions required a lambda to be defined to use them.
@evanh evanh requested a review from a team as a code owner January 8, 2022 00:52
@codecov-commenter
Copy link

codecov-commenter commented Jan 8, 2022

Codecov Report

Merging #2333 (fb5a7e8) into master (0b53064) will increase coverage by 0.03%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2333      +/-   ##
==========================================
+ Coverage   92.76%   92.80%   +0.03%     
==========================================
  Files         561      561              
  Lines       25790    25834      +44     
==========================================
+ Hits        23924    23974      +50     
+ Misses       1866     1860       -6     
Impacted Files Coverage Δ
tests/fixtures.py 100.00% <ø> (ø)
tests/query/snql/test_invalid_queries.py 100.00% <ø> (ø)
tests/query/snql/test_query.py 100.00% <ø> (ø)
snuba/query/snql/parser.py 96.33% <100.00%> (+0.21%) ⬆️
tests/query/test_query_validation.py 100.00% <100.00%> (ø)
tests/test_snql_api.py 100.00% <100.00%> (ø)
snuba/datasets/transactions_processor.py 90.67% <0.00%> (+0.51%) ⬆️
snuba/query/parser/__init__.py 98.71% <0.00%> (+1.28%) ⬆️
...cessors/type_converters/hexint_column_processor.py 85.18% <0.00%> (+14.81%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0b53064...fb5a7e8. Read the comment docs.

@volokluev
Copy link
Member

Just an initial comment from looking at the PR description. This would be much clearer with an example:

Lambdas are defined as a tuple of identifiers, followed by an arrow, followed by a function that may or may not
use those identifiers. An identifier in SnQL is a set of characters enclosed in backticks.

This as well:

This feature set is slightly more limited than what is available in Clickhouse.
The major difference is that a lambda can only be used as a parameter in a
function, and the transformation of a lambda must always be a function itself.
So (x) -> x is not a valid lambda.

It's not totally clear what is a valid lambda where. To make it more clear you could say:

The following is valid in clickhouse sql:

>Insert example here, preferably with a complete query

It is not valid in snql because a lambda can only be used as a parameter in a
function. This prevents usage of a high order function like arrayMap:

>insert example here, preferable with a complete query

@evanh
Copy link
Member Author

evanh commented Jan 10, 2022

@vladkluev I added examples, hope that makes it clearer.

Copy link
Member

@volokluev volokluev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 questions but not blocking. well done!

tests/query/snql/test_invalid_queries.py Show resolved Hide resolved
tests/fixtures.py Show resolved Hide resolved
@@ -1119,6 +1161,33 @@ def mangle_column_value(exp: Expression) -> Expression:
query.transform_expressions(mangle_column_value)


def validate_identifiers_in_lambda(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one thing I remembered before you merge. You don' test the recursive case of this function, you just have the one test. You may want to do that

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion, I added a new test and tweaked the code as a result.

@evanh evanh merged commit ec1901d into master Jan 12, 2022
@evanh evanh deleted the evanh/feat/support-higher-order-functions branch January 12, 2022 18:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants