JEP 9 - Boolean expressions #16
================ | ||
Improved Filters | ||
================ | ||
|
||
:JEP: 9 | ||
:Author: James Saryerwinnie | ||
:Status: proposed | ||
:Created: 07-July-2014 | ||
|
||
|
||
Abstract | ||
======== | ||
|
||
JEP 7 introduced filter expressions, a mechanism that allows | ||
list elements to be selected by matching an expression against | ||
each list element. While this concept is useful, the actual comparator | ||
expressions were not sufficiently capable to accommodate a number of common | ||
queries. This JEP expands on filter expressions by proposing support for | ||
``and-expressions``, ``not-expressions``, ``paren-expressions``, and | ||
``unary-expressions``. With these additions, filter expressions become | ||
powerful enough to handle the majority of queries. | ||
|
||
|
||
Motivation | ||
========== | ||
|
||
JEP 7 introduced filter queries, which essentially look like this:: | ||
|
||
foo[?lhs comparator rhs] | ||
|
||
where the left hand side (lhs) and the right hand side (rhs) | ||
are both an ``expression``, and comparator is one of | ||
``==, !=, <, <=, >, >=``. | ||
|
||
This added a useful feature to JMESPath: the ability to filter | ||
a list based on evaluating an expression against each element in a list. | ||
|
||
In the time since JEP 7 has been part of JMESPath, a number of cases have been | ||
pointed out that filter expressions cannot solve. Below are examples of | ||
each type of missing feature. | ||
|
||
|
||
Or Expressions | ||
-------------- | ||
|
||
First, users want the ability to filter based on matching one or more | ||
expressions. For example, given:: | ||
|
||
{ | ||
"cities": [ | ||
{"name": "Seattle", "state": "WA"}, | ||
{"name": "Los Angeles", "state": "CA"}, | ||
{"name": "Bellevue", "state": "WA"}, | ||
{"name": "New York", "state": "NY"}, | ||
{"name": "San Antonio", "state": "TX"}, | ||
{"name": "Portland", "state": "OR"} | ||
] | ||
} | ||
|
||
a user might want to select locations on the west coast, which in | ||
this specific example means cities in either ``WA``, ``OR``, or | ||
``CA``. It's not possible to express this as a filter expression | ||
given the grammar of ``expression comparator expression``. Ideally | ||
a user should be able to use:: | ||
|
||
cities[?state == `WA` || state == `OR` || state == `CA`] | ||
|
||
JMESPath already supports Or expressions, just not in the context | ||
of filter expressions. | ||
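For comparison, the intended result of the proposed query can be sketched in
plain Python. This only illustrates what the filter is meant to select; it is
not how a JMESPath implementation evaluates the expression, and the variable
names are assumptions made for this sketch.

```python
# Sample data from the example above.
data = {
    "cities": [
        {"name": "Seattle", "state": "WA"},
        {"name": "Los Angeles", "state": "CA"},
        {"name": "Bellevue", "state": "WA"},
        {"name": "New York", "state": "NY"},
        {"name": "San Antonio", "state": "TX"},
        {"name": "Portland", "state": "OR"},
    ]
}

# Plain-Python reading of:
#   cities[?state == `WA` || state == `OR` || state == `CA`]
west_coast = [
    city
    for city in data["cities"]
    if city["state"] == "WA" or city["state"] == "OR" or city["state"] == "CA"
]

# Names of the selected cities, in their original order.
west_coast_names = [city["name"] for city in west_coast]
```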
|
||
And Expressions | ||
--------------- | ||
|
||
The next missing feature of filter expressions is support for And | ||
expressions. It's actually somewhat odd that JMESPath has support | ||
for Or expressions, but not for And expressions. For example, | ||
given a list of user accounts with permissions:: | ||
|
||
{ | ||
"users": [ | ||
{"name": "user1", "type": "normal", "allowed_hosts": ["a", "b"]}, | ||
{"name": "user2", "type": "admin", "allowed_hosts": ["a", "b"]}, | ||
{"name": "user3", "type": "normal", "allowed_hosts": ["c", "d"]}, | ||
{"name": "user4", "type": "admin", "allowed_hosts": ["c", "d"]}, | ||
{"name": "user5", "type": "normal", "allowed_hosts": ["c", "d"]}, | ||
{"name": "user6", "type": "normal", "allowed_hosts": ["c", "d"]} | ||
] | ||
} | ||
|
||
We'd like to find admin users that have permissions to the host named | ||
``c``. Ideally, the filter expression would be:: | ||
|
||
users[?type == `admin` && contains(allowed_hosts, `c`)] | ||
|
||
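As a rough illustration of what this And filter should select, the same
condition can be written in plain Python. The ``in`` operator stands in for
the JMESPath ``contains`` function here; that substitution, and the variable
names, are assumptions of this sketch.

```python
# User accounts from the example above.
users = [
    {"name": "user1", "type": "normal", "allowed_hosts": ["a", "b"]},
    {"name": "user2", "type": "admin", "allowed_hosts": ["a", "b"]},
    {"name": "user3", "type": "normal", "allowed_hosts": ["c", "d"]},
    {"name": "user4", "type": "admin", "allowed_hosts": ["c", "d"]},
    {"name": "user5", "type": "normal", "allowed_hosts": ["c", "d"]},
    {"name": "user6", "type": "normal", "allowed_hosts": ["c", "d"]},
]

# Plain-Python reading of:
#   users[?type == `admin` && contains(allowed_hosts, `c`)]
admins_on_c = [
    user
    for user in users
    if user["type"] == "admin" and "c" in user["allowed_hosts"]
]

admin_names = [user["name"] for user in admins_on_c]
```

With this data only ``user4`` satisfies both conditions: ``user2`` is an admin
but has no access to host ``c``.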
|
||
Unary Expressions | ||
----------------- | ||
|
||
Think of an if statement in a language such as C or Java. While you can write | ||
an if statement that looks like:: | ||
|
||
if (foo == bar) { ... } | ||
|
||
You can also use a unary expression such as:: | ||
|
||
if (allowed_access) { ... } | ||
|
||
or:: | ||
|
||
if (!allowed_access) { ... } | ||
|
||
Adding support for unary expressions brings a natural syntax when filtering | ||
against boolean values. Instead of:: | ||
|
||
foo[?boolean_var == `true`] | ||
|
||
a user could instead use:: | ||
|
||
foo[?boolean_var] | ||
|
||
As a more realistic example, given a slightly different structure | ||
for the ``users`` data above:: | ||
|
||
{ | ||
"users": [ | ||
{"name": "user1", "is_admin": false, "disabled": false}, | ||
{"name": "user2", "is_admin": true, "disabled": true}, | ||
{"name": "user3", "is_admin": false, "disabled": false}, | ||
{"name": "user4", "is_admin": true, "disabled": false}, | ||
{"name": "user5", "is_admin": false, "disabled": true}, | ||
{"name": "user6", "is_admin": false, "disabled": false} | ||
] | ||
} | ||
|
||
If we want to get the names of all admin users whose account is enabled, we | ||
could say:: | ||
|
||
users[?is_admin == `true` && disabled == `false`] | ||
|
||
but it's more natural and succinct to instead say:: | ||
|
||
users[?is_admin && !disabled] | ||
|
||
A case can be made that this syntax is not strictly necessary. This is true. | ||
However, the main reason for adding support for unary expressions in a filter | ||
expression is that users expect this syntax and are surprised when it is not | ||
supported. Especially now that this JEP anchors filtering to | ||
a C-like syntax, users will expect unary expressions | ||
even more. | ||
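The two filter forms for the ``users`` data can be compared with a small
Python sketch. For boolean fields Python's ``and``/``not`` behave the same way
as the proposed operators, although Python truthiness differs from JMESPath
for other values (for example ``0``). The variable names are assumptions.

```python
users = [
    {"name": "user1", "is_admin": False, "disabled": False},
    {"name": "user2", "is_admin": True, "disabled": True},
    {"name": "user3", "is_admin": False, "disabled": False},
    {"name": "user4", "is_admin": True, "disabled": False},
    {"name": "user5", "is_admin": False, "disabled": True},
    {"name": "user6", "is_admin": False, "disabled": False},
]

# Explicit comparisons: users[?is_admin == `true` && disabled == `false`]
explicit = [
    u["name"] for u in users
    if u["is_admin"] == True and u["disabled"] == False  # noqa: E712
]

# Unary form: users[?is_admin && !disabled]
unary = [u["name"] for u in users if u["is_admin"] and not u["disabled"]]
```

Both lists contain the same single enabled admin, which is the point of the
shorter syntax: it expresses the same selection with less noise.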
|
||
Paren Expressions | ||
----------------- | ||
|
||
Once ``||`` and ``&&`` expressions have been introduced, there will be times | ||
when you want to override the precedence of these operators. | ||
|
||
A ``paren-expression`` allows a user to override the precedence order of | ||
an expression, e.g. ``(a || b) && c``, instead of the default precedence | ||
of ``a || (b && c)`` for the expression ``a || b && c``. | ||
|
||
|
||
|
||
Specification | ||
============= | ||
|
||
There are several updates to the grammar:: | ||
|
||
and-expression = expression "&&" expression | ||
not-expression = "!" expression | ||
paren-expression = "(" expression ")" | ||
|
||
|
||
Additionally, the ``filter-expression`` rule is updated | ||
to be more general:: | ||
|
||
bracket-specifier =/ "[?" expression "]" | ||
|
||
The ``list-filter-expr`` is now a more general | ||
``comparator-expression``:: | ||
|
||
comparator-expression = expression comparator expression | ||
|
||
which is now just an expression:: | ||
|
||
expression =/ comparator-expression | ||
|
||
And finally, the ``current-node`` is now allowed as a generic | ||
expression:: | ||
|
||
expression =/ current-node | ||
|
||
Operator Precedence | ||
------------------- | ||
|
||
This JEP introduces And expressions, which would normally be defined as:: | ||
|
||
expression = or-expression / and-expression / not-expression | ||
or-expression = expression "||" expression | ||
and-expression = expression "&&" expression | ||
not-expression = "!" expression | ||
|
||
However, if this current pattern is followed, it makes it impossible to parse | ||
an expression with the correct precedence. A more standard way of expressing | ||
this would be:: | ||
|
||
expression = or-expression | ||
or-expression = and-expression "||" and-expression | ||
and-expression = not-expression "&&" not-expression | ||
not-expression = "!" expression | ||
|
||
|
||
The precedence for the new boolean expressions matches how most | ||
other languages define boolean expressions. That is, from weakest | ||
binding to tightest binding: | ||
|
||
* Or - ``||`` | ||
* And - ``&&`` | ||
* Unary not - ``!`` | ||
|
||
So for example, ``a || b && c`` is parsed as ``a || (b && c)`` and | ||
not ``(a || b) && c``. | ||
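One common way to obtain this precedence is a recursive-descent parser with
one function per level, from weakest to tightest binding. The following is a
minimal sketch under simplifying assumptions: operands are bare identifiers
only, the AST is a nested tuple, and the names ``tokenize`` and ``parse`` are
hypothetical. It is not the parser of any JMESPath implementation.

```python
import re

# Operators, parentheses, or a bare identifier, with optional leading spaces.
TOKEN_RE = re.compile(r"\s*(\|\||&&|!|\(|\)|[A-Za-z_][A-Za-z0-9_]*)")


def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected input at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(text):
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def parse_or():  # weakest binding
        node = parse_and()
        while peek() == "||":
            advance()
            node = ("or", node, parse_and())
        return node

    def parse_and():
        node = parse_not()
        while peek() == "&&":
            advance()
            node = ("and", node, parse_not())
        return node

    def parse_not():  # tightest binding
        if peek() == "!":
            advance()
            return ("not", parse_not())
        return parse_primary()

    def parse_primary():
        token = peek()
        if token == "(":
            advance()
            node = parse_or()  # parentheses restart at the weakest level
            if peek() != ")":
                raise ValueError("expected ')'")
            advance()
            return node
        if token is None or token in ("||", "&&", ")"):
            raise ValueError("expected an identifier or '('")
        return ("field", advance())

    tree = parse_or()
    if peek() is not None:
        raise ValueError(f"unexpected token {peek()!r}")
    return tree
```

In this sketch, ``parse("a || b && c")`` nests the ``and`` node inside the
``or`` node, while ``parse("(a || b) && c")`` produces the opposite nesting.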
|
||
The operator precedence list in the specification will now read: | ||
|
||
* Pipe - ``|`` | ||
* Or - ``||`` | ||
* And - ``&&`` | ||
* Unary not - ``!`` | ||
* Rbracket - ``]`` | ||
|
||
|
||
Now that these expressions are allowed as general ``expressions``, their | ||
semantics outside of their original contexts must be defined. | ||
|
||
|
||
And Expressions | ||
--------------- | ||
|
||
For reference, the JMESPath spec already defines the following values | ||
as "false-like" values: | ||
|
||
* Empty list: ``[]`` | ||
* Empty object: ``{}`` | ||
* Empty string: ``""`` | ||
* False boolean: ``false`` | ||
* Null value: ``null`` | ||
|
||
And any value that is not a false-like value is a truth-like value. | ||
|
||
An ``and-expression`` has similar semantics to and expressions in other | ||
languages. If the expression on the left hand side is a truth-like value, then | ||
the value on the right hand side is returned. Otherwise the result of the | ||
expression on the left hand side is returned. This also reduces to the | ||
expected truth table: | ||
|
||
.. list-table:: Truth table for and expressions | ||
:header-rows: 1 | ||
|
||
* - LHS | ||
- RHS | ||
- Result | ||
* - True | ||
- True | ||
- True | ||
* - True | ||
- False | ||
- False | ||
* - False | ||
- True | ||
- False | ||
* - False | ||
- False | ||
- False | ||
|
||
This is the standard truth table for a | ||
`logical conjunction (AND) <https://en.wikipedia.org/wiki/Truth_table#Logical_conjunction_.28AND.29>`__. | ||
|
||
|
||
Below are a few examples of and expressions: | ||
|
||
|
||
Examples | ||
~~~~~~~~ | ||
|
||
:: | ||
|
||
search(True && False, {"True": true, "False": false}) -> false | ||
search(Number && EmptyList, {"Number": 5, "EmptyList": []}) -> [] | ||
search(foo[?a == `1` && b == `2`], | ||
{"foo": [{"a": 1, "b": 2}, {"a": 1, "b": 3}]}) -> [{"a": 1, "b": 2}] | ||
|
||
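The And semantics described in this section can be sketched in a few lines of
Python. The helper names ``is_false_like`` and ``and_expression`` are
hypothetical, and the sketch takes already-evaluated operands, so it does not
model any short-circuit behavior. Note that Python's own truthiness is not
used, because it treats ``0`` as false while the false-like list above does
not include it.

```python
def is_false_like(value):
    """Return True for the false-like values: [], {}, "", false, null."""
    if value is None or value is False:
        return True
    return isinstance(value, (list, dict, str)) and len(value) == 0


def and_expression(lhs, rhs):
    """Return rhs when lhs is truth-like, otherwise return lhs."""
    return lhs if is_false_like(lhs) else rhs


# The filter example above, applied element by element.
foo = [{"a": 1, "b": 2}, {"a": 1, "b": 3}]
matches = [
    item
    for item in foo
    if not is_false_like(and_expression(item["a"] == 1, item["b"] == 2))
]
```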
|
||
Not Expressions | ||
--------------- | ||
|
||
A ``not-expression`` negates the result of an expression. If the expression | ||
results in a truth-like value, a ``not-expression`` will change this value to | ||
``false``. If the expression results in a false-like value, a | ||
``not-expression`` will change this value to ``true``. | ||
|
||
Examples | ||
~~~~~~~~ | ||
|
||
:: | ||
|
||
search(!True, {"True": true}) -> false | ||
search(!False, {"False": false}) -> true | ||
search(!Number, {"Number": 5}) -> false | ||
search(!EmptyList, {"EmptyList": []}) -> true | ||
|
||
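The Not semantics can be sketched the same way. The helper names are
hypothetical; the result is always a boolean, regardless of the operand type.

```python
def is_false_like(value):
    """Return True for the false-like values: [], {}, "", false, null."""
    if value is None or value is False:
        return True
    return isinstance(value, (list, dict, str)) and len(value) == 0


def not_expression(value):
    """Return True for a false-like operand and False for a truth-like one."""
    return is_false_like(value)
```

Under these rules ``not_expression(0)`` is ``False``, because ``0`` is a
truth-like value here, unlike Python's built-in ``not 0``.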
|
||
Paren Expressions | ||
----------------- | ||
|
||
A ``paren-expression`` allows a user to override the precedence order of | ||
an expression, e.g. ``(a || b) && c``. | ||
|
||
Examples | ||
~~~~~~~~ | ||
|
||
:: | ||
|
||
search(foo[?(a == `1` || b == `2`) && c == `5`], | ||
{"foo": [{"a": 1, "b": 2, "c": 3}, {"a": 3, "b": 4}]}) -> [] | ||
|
||
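The effect of the parentheses in this example can be checked with a plain
Python sketch, since Python's ``and`` also binds tighter than ``or``. Using
``dict.get`` to stand in for a missing field evaluating to ``null`` is an
assumption of this sketch, as are the variable names.

```python
foo = [{"a": 1, "b": 2, "c": 3}, {"a": 3, "b": 4}]

# With parentheses: (a == 1 || b == 2) && c == 5
with_parens = [
    item
    for item in foo
    if (item.get("a") == 1 or item.get("b") == 2) and item.get("c") == 5
]

# Without parentheses the default precedence applies:
# a == 1 || (b == 2 && c == 5)
without_parens = [
    item
    for item in foo
    if item.get("a") == 1 or item.get("b") == 2 and item.get("c") == 5
]
```

No element has ``c`` equal to ``5``, so the parenthesized filter selects
nothing, while the unparenthesized form still selects the first element
because ``a == 1`` alone is enough.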
|
||
Rationale | ||
========= | ||
|
||
This JEP brings several tokens that were only allowed in specific constructs | ||
into the more general ``expression`` rule. Specifically: | ||
|
||
* The ``current-node`` (``@``) was previously only allowed in function | ||
expressions, but is now allowed as a general ``expression``. | ||
* The ``filter-expression`` now accepts any arbitrary ``expression``. | ||
* The ``list-filter-expr`` is now just a generic ``comparator-expression``, | ||
which again is just a general ``expression``. | ||
|
||
There are several reasons the previous grammar rules were minimally scoped. | ||
One of the main reasons, as stated in JEP 7, which introduced filter | ||
expressions, was to keep the spec "purposefully minimal." The end | ||
of JEP 7 states that there "are several extensions that can be added in | ||
future." This JEP proposes exactly those extensions: the recommendations | ||
from JEP 7. |
?
Yeah, it's been a while since this was first written.