JEP: | 9 |
---|---|
Author: | James Saryerwinnie |
Status: | accepted |
Created: | 07-July-2014 |
JEP 7 introduced filter expressions, a mechanism for selecting list
elements by matching an expression against each list element. While this
concept is useful, the comparator expressions were not sufficiently capable
to accommodate a number of common queries. This JEP expands on filter
expressions by proposing support for and-expressions, not-expressions,
paren-expressions, and unary-expressions. With these additions, filter
expressions become powerful enough to handle the majority of queries.
JEP 7 introduced filter queries, which essentially look like this:
foo[?lhs comparator rhs]
where the left hand side (lhs) and the right hand side (rhs) are both an
expression, and comparator is one of ==, !=, <, <=, >, >=.
This added a useful feature to JMESPath: the ability to filter a list based on evaluating an expression against each element in a list.
In the time since JEP 7 became part of JMESPath, a number of cases have been pointed out that filter expressions cannot solve. Below are examples of each type of missing feature.
First, users want the ability to filter based on matching one or more expressions. For example, given:
{
  "cities": [
    {"name": "Seattle", "state": "WA"},
    {"name": "Los Angeles", "state": "CA"},
    {"name": "Bellevue", "state": "WA"},
    {"name": "New York", "state": "NY"},
    {"name": "San Antonio", "state": "TX"},
    {"name": "Portland", "state": "OR"}
  ]
}
a user might want to select locations on the west coast, which in
this specific example means cities in either WA, OR, or CA. It's not
possible to express this as a filter expression given the grammar of
expression comparator expression. Ideally a user should be able to use:
cities[?state == `WA` || state == `OR` || state == `CA`]
JMESPath already supports Or expressions, just not in the context of filter expressions.
The next missing feature of filter expressions is support for And expressions. It's actually somewhat odd that JMESPath has support for Or expressions, but not for And expressions. For example, given a list of user accounts with permissions:
{
  "users": [
    {"name": "user1", "type": "normal", "allowed_hosts": ["a", "b"]},
    {"name": "user2", "type": "admin", "allowed_hosts": ["a", "b"]},
    {"name": "user3", "type": "normal", "allowed_hosts": ["c", "d"]},
    {"name": "user4", "type": "admin", "allowed_hosts": ["c", "d"]},
    {"name": "user5", "type": "normal", "allowed_hosts": ["c", "d"]},
    {"name": "user6", "type": "normal", "allowed_hosts": ["c", "d"]}
  ]
}
We'd like to find admin users that have permissions to the host named c.
Ideally, the filter expression would be:
users[?type == `admin` && contains(allowed_hosts, `c`)]
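To make the intended result concrete, here is a minimal Python sketch of what the proposed filter would select from the data above. It does not use a JMESPath library; the list comprehension and the `in` test merely stand in for the `&&` operator and the `contains` function, and the variable names are illustrative.

```python
# The "users" document from above, as a Python structure.
data = {
    "users": [
        {"name": "user1", "type": "normal", "allowed_hosts": ["a", "b"]},
        {"name": "user2", "type": "admin", "allowed_hosts": ["a", "b"]},
        {"name": "user3", "type": "normal", "allowed_hosts": ["c", "d"]},
        {"name": "user4", "type": "admin", "allowed_hosts": ["c", "d"]},
        {"name": "user5", "type": "normal", "allowed_hosts": ["c", "d"]},
        {"name": "user6", "type": "normal", "allowed_hosts": ["c", "d"]},
    ]
}

# Python stand-in for: users[?type == `admin` && contains(allowed_hosts, `c`)]
admins_on_c = [
    user
    for user in data["users"]
    if user["type"] == "admin" and "c" in user["allowed_hosts"]
]

admin_names = [user["name"] for user in admins_on_c]
print(admin_names)
```

Only user4 satisfies both conditions: user2 is an admin without access to host c, and the remaining users with access to c are not admins.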
Think of an if statement in a language such as C or Java. While you can write an if statement that looks like:
if (foo == bar) { ... }
You can also use a unary expression such as:
if (allowed_access) { ... }
or:
if (!allowed_access) { ... }
Adding support for unary expressions brings a natural syntax when filtering against boolean values. Instead of:
foo[?boolean_var == `true`]
a user could instead use:
foo[?boolean_var]
As a more realistic example, given a slightly different structure
for the users data above:
{
  "users": [
    {"name": "user1", "is_admin": false, "disabled": false},
    {"name": "user2", "is_admin": true, "disabled": true},
    {"name": "user3", "is_admin": false, "disabled": false},
    {"name": "user4", "is_admin": true, "disabled": false},
    {"name": "user5", "is_admin": false, "disabled": true},
    {"name": "user6", "is_admin": false, "disabled": false}
  ]
}
If we want to get the names of all admin users whose account is enabled, we could either say:
users[?is_admin == `true` && disabled == `false`]
but it's more natural and succinct to instead say:
users[?is_admin && !disabled]
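As a rough check that the two forms select the same elements, the following Python sketch evaluates both against the data above. It is an illustration in plain Python rather than a JMESPath implementation, and it assumes `is_admin` and `disabled` always hold JSON booleans, as they do in this example.

```python
users = [
    {"name": "user1", "is_admin": False, "disabled": False},
    {"name": "user2", "is_admin": True, "disabled": True},
    {"name": "user3", "is_admin": False, "disabled": False},
    {"name": "user4", "is_admin": True, "disabled": False},
    {"name": "user5", "is_admin": False, "disabled": True},
    {"name": "user6", "is_admin": False, "disabled": False},
]

# Stand-in for: users[?is_admin == `true` && disabled == `false`]
explicit_form = [
    u["name"] for u in users
    if u["is_admin"] is True and u["disabled"] is False
]

# Stand-in for: users[?is_admin && !disabled]
unary_form = [
    u["name"] for u in users
    if u["is_admin"] and not u["disabled"]
]

print(explicit_form, unary_form)
```

For boolean fields both forms agree. They can diverge once a field holds a non-boolean value, because the unary form tests truthiness rather than equality with `true`.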
A case can be made that this syntax is not strictly necessary. This is true. However, the main reason for adding support for unary expressions in a filter expression is that users expect this syntax and are surprised when it is not supported. Now that this JEP anchors filtering to a C-like syntax, users will expect unary expressions even more.
Once || and && expressions have been introduced, there will be times
when you want to override the precedence of these operators.
A paren-expression allows a user to override the precedence order of
an expression, e.g. (a || b) && c, instead of the default precedence
of a || (b && c) for the expression a || b && c.
There are several updates to the grammar:
and-expression = expression "&&" expression
not-expression = "!" expression
paren-expression = "(" expression ")"
Additionally, the filter-expression rule is updated to be more general:
bracket-specifier =/ "[?" expression "]"
The list-filter-expr is now a more general comparator-expression:
comparator-expression = expression comparator expression
which is now just an expression:
expression =/ comparator-expression
And finally, the current-node is now allowed as a generic expression:
expression =/ current-node
This JEP introduces and-expressions, which would normally be defined as:
expression = or-expression / and-expression / not-expression
or-expression = expression "||" expression
and-expression = expression "&&" expression
not-expression = "!" expression
However, if this current pattern is followed, it makes it impossible to parse an expression with the correct precedence. A more standard way of expressing this would be:
expression = or-expression
or-expression = and-expression "||" and-expression
and-expression = not-expression "&&" not-expression
not-expression = "!" expression
The precedence for the new boolean expressions matches how most other languages define boolean expressions. That is, from weakest binding to tightest binding:
- Or - ||
- And - &&
- Unary not - !
So for example, a || b && c is parsed as a || (b && c) and
not (a || b) && c.
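The precedence rules can be illustrated with a small recursive-descent parser. This is a sketch under stated assumptions, not the parser of any JMESPath implementation: it handles only bare identifiers, `||`, `&&`, `!`, and parentheses, it lets `!` bind tightest as the precedence list states, and the names `tokenize` and `parse` are hypothetical.

```python
import re

# Tokens: the two binary operators, unary not, parentheses, and identifiers.
TOKEN = re.compile(r"\s*(\|\||&&|!|\(|\)|[A-Za-z_][A-Za-z0-9_]*)")


def tokenize(text):
    """Split an expression string into a list of tokens."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected input at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(text):
    """Parse into nested tuples such as ("or", "a", ("and", "b", "c"))."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def parse_or():
        # Weakest binding: a sequence of and-level operands joined by ||.
        node = parse_and()
        while peek() == "||":
            advance()
            node = ("or", node, parse_and())
        return node

    def parse_and():
        # Middle binding: a sequence of not-level operands joined by &&.
        node = parse_not()
        while peek() == "&&":
            advance()
            node = ("and", node, parse_not())
        return node

    def parse_not():
        # Tightest binding: ! applies to the operand that follows it.
        if peek() == "!":
            advance()
            return ("not", parse_not())
        return parse_primary()

    def parse_primary():
        token = peek()
        if token is None or token in ("||", "&&", ")"):
            raise ValueError(f"unexpected token: {token!r}")
        advance()
        if token == "(":
            # A paren-expression restarts at the weakest binding level.
            node = parse_or()
            if peek() != ")":
                raise ValueError("expected ')'")
            advance()
            return node
        return token

    tree = parse_or()
    if peek() is not None:
        raise ValueError(f"unexpected trailing token: {peek()!r}")
    return tree


print(parse("a || b && c"))
print(parse("(a || b) && c"))
```

Each precedence level gets its own function, and each function calls the next tighter level for its operands. That layering is what makes `a || b && c` group as `a || (b && c)` without any explicit precedence table.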
The operator precedence list in the specification will now read:
- Pipe - |
- Or - ||
- And - &&
- Unary not - !
- Rbracket - ]
Now that these expressions are allowed as general expressions, their
semantics outside of their original contexts must be defined.
For reference, the JMESPath spec already defines the following values as "false-like" values:
- Empty list: []
- Empty object: {}
- Empty string: ""
- False boolean: false
- Null value: null
And any value that is not a false-like value is a truth-like value.
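The false-like rule can be sketched in Python as follows, with JSON values mapped to their usual Python counterparts (`null` to `None`, objects to `dict`). The function names `is_false_like` and `is_truth_like` are illustrative and not part of the specification.

```python
def is_false_like(value):
    """Return True only for the five false-like values listed above."""
    # Identity checks are deliberate: in Python, 0 == False is True,
    # but the number 0 is not in the list of false-like values.
    if value is None or value is False:
        return True
    # Empty list, empty object, and empty string.
    if isinstance(value, (list, dict, str)) and len(value) == 0:
        return True
    return False


def is_truth_like(value):
    """Any value that is not false-like is truth-like."""
    return not is_false_like(value)


print(is_false_like([]), is_false_like(0), is_truth_like("a"))
```

Note the difference from Python's own truthiness: under the list above the number 0 is a truth-like value, because only the five listed values are false-like.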
An and-expression has similar semantics to and expressions in other
languages. If the expression on the left hand side is a truth-like value, then
the value on the right hand side is returned. Otherwise the result of the
expression on the left hand side is returned. This also reduces to the
expected truth table:
LHS | RHS | Result |
---|---|---|
True | True | True |
True | False | False |
False | True | False |
False | False | False |
This is the standard truth table for a logical conjunction (AND).
Below are a few examples of and expressions:
search(True && False, {"True": true, "False": false}) -> false
search(Number && EmptyList, {"Number": 5, "EmptyList": []}) -> []
search(foo[?a == `1` && b == `2`], {"foo": [{"a": 1, "b": 2}, {"a": 1, "b": 3}]}) -> [{"a": 1, "b": 2}]
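The value-returning behaviour of an and-expression can be sketched in Python. This is an illustration of the rule stated above, not code from a JMESPath implementation: `and_expression` is a hypothetical helper that receives both operands already evaluated.

```python
def is_truth_like(value):
    """Truth-like means: not one of [], {}, "", false, null."""
    if value is None or value is False:
        return False
    if isinstance(value, (list, dict, str)) and len(value) == 0:
        return False
    return True


def and_expression(lhs, rhs):
    """Return rhs when lhs is truth-like, otherwise return lhs."""
    return rhs if is_truth_like(lhs) else lhs


# Mirrors the first two examples above.
print(and_expression(True, False))
print(and_expression(5, []))
```

The result is one of the operands rather than a coerced boolean, which is why `Number && EmptyList` yields `[]` and not `false`.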
A not-expression negates the result of an expression. If the expression
results in a truth-like value, a not-expression will change this value to
false. If the expression results in a false-like value, a not-expression
will change this value to true.
search(!True, {"True": true}) -> false
search(!False, {"False": false}) -> true
search(!Number, {"Number": 5}) -> false
search(!EmptyList, {"EmptyList": []}) -> true
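A not-expression, unlike an and-expression, always produces a boolean. A minimal Python sketch of that rule, using a hypothetical `not_expression` helper that receives the already evaluated operand:

```python
def not_expression(value):
    """Return True for false-like values and False for everything else."""
    if value is None or value is False:
        return True
    # Empty list, empty object, and empty string are also false-like.
    if isinstance(value, (list, dict, str)) and len(value) == 0:
        return True
    return False


# Mirrors the four examples above.
results = [not_expression(v) for v in (True, False, 5, [])]
print(results)
```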
A paren-expression allows a user to override the precedence order of
an expression, e.g. (a || b) && c.
search(foo[?(a == `1` || b == `2`) && c == `5`], {"foo": [{"a": 1, "b": 2, "c": 3}, {"a": 3, "b": 4}]}) -> []
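The effect of the parentheses in this example can be checked with a short Python sketch. It relies on Python's `or` and `and` having the same relative precedence as `||` and `&&`, and it uses `dict.get` so that a missing key evaluates to `None`, standing in for a JMESPath null. This is an illustration, not a JMESPath implementation.

```python
foo = [{"a": 1, "b": 2, "c": 3}, {"a": 3, "b": 4}]

# Stand-in for: foo[?(a == `1` || b == `2`) && c == `5`]
with_parens = [
    item for item in foo
    if (item.get("a") == 1 or item.get("b") == 2) and item.get("c") == 5
]

# Stand-in for the same filter without parentheses, which groups as
# a == `1` || (b == `2` && c == `5`) under the default precedence.
without_parens = [
    item for item in foo
    if item.get("a") == 1 or item.get("b") == 2 and item.get("c") == 5
]

print(with_parens)
print(without_parens)
```

With parentheses no element matches, because neither element has c equal to 5. Without them the first element matches on `a == 1` alone.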
This JEP brings several tokens that were only allowed in specific constructs
into the more general expression rule. Specifically:
- The current-node (@) was previously only allowed in function expressions, but is now allowed as a general expression.
- The filter-expression now accepts any arbitrary expression.
- The list-filter-expr is now just a generic comparator-expression, which again is just a general expression.
There are several reasons the previous grammar rules were minimally scoped. One of the main reasons, as stated in JEP 7 which introduced filter expressions, was to keep the spec "purposefully minimal." In fact the end of JEP 7 states that there "are several extensions that can be added in future." That is exactly what this JEP proposes: the extensions recommended in JEP 7.