A simple sql to sql rewrite for cardinality() #17198
Force-pushed from 5014403 to c905243.
Might be better to avoid creating a new object when there is no change to the expression.
For example, we can check if the returned pointer of the rewrite is the same as the argument pointer and keep track if any argument is changed.
Thanks! We're no longer creating a new object every time
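The identity check suggested above can be sketched roughly as follows. This is only an illustration, not the PR's code: `Call` is a hypothetical stand-in for an expression node, and the class and method names are made up for the example. The rewrite returns the same instance whenever neither the node nor any of its arguments changed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class IdentityPreservingRewrite
{
    // Hypothetical minimal expression node (an assumption, not Presto's FunctionCall).
    static final class Call
    {
        final String name;
        final List<Call> arguments;

        Call(String name, List<Call> arguments)
        {
            this.name = name;
            this.arguments = Collections.unmodifiableList(new ArrayList<>(arguments));
        }
    }

    static boolean isMapKeysOrValues(Call call)
    {
        return (call.name.equalsIgnoreCase("map_keys") || call.name.equalsIgnoreCase("map_values"))
                && call.arguments.size() == 1;
    }

    // Returns the input instance itself when nothing below it was rewritten.
    static Call rewrite(Call node)
    {
        List<Call> rewrittenArguments = new ArrayList<>(node.arguments.size());
        boolean changed = false;
        for (Call argument : node.arguments) {
            Call rewritten = rewrite(argument);
            // Reference comparison: a different pointer means the child was rewritten.
            changed |= rewritten != argument;
            rewrittenArguments.add(rewritten);
        }

        if (node.name.equalsIgnoreCase("cardinality")
                && rewrittenArguments.size() == 1
                && isMapKeysOrValues(rewrittenArguments.get(0))) {
            // cardinality(map_keys(m)) becomes cardinality(m)
            return new Call(node.name, rewrittenArguments.get(0).arguments);
        }
        return changed ? new Call(node.name, rewrittenArguments) : node;
    }
}
```

Note that this sketch still allocates a temporary argument list on every call; only the node object itself is reused when nothing changed.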
highker left a comment:
For the commit message, we will need to cap each line within 72 characters (inclusive). So something like
Simplify cardinality on map keys and values functions
A new optimizer rule is added to simplify expressions like
`cardinality(map_keys(m))` into `cardinality((m))`. Same for
`map_values` function.
Use ImmutableList.Builder<Expression> builder = ImmutableList.builder();
The first condition of if should be outside of the for loop.
done -- although it makes the code a little bit longer
Assign `(FunctionCall) argument` to a variable so it can be reused in the following 3 lines
map_keys and map_values each take exactly one argument, right?
Define a constant at the beginning of the class:
private static final Set<QualifiedName> MAP_FUNCTIONS = ImmutableSet.of(QualifiedName.of("map_values"), QualifiedName.of("map_keys"));
and use it here:
MAP_FUNCTIONS.contains(((FunctionCall) argument).getName())
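A rough, self-contained sketch of the same idea using only `java.util`. It substitutes plain strings for `QualifiedName` and a `HashSet` for `ImmutableSet`, and it assumes names should be lower-cased before the lookup; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class MapFunctionNames
{
    // One shared constant instead of repeating the two names at each call site.
    private static final Set<String> MAP_FUNCTIONS =
            Collections.unmodifiableSet(new HashSet<>(Arrays.asList("map_values", "map_keys")));

    // Assumption for this sketch: function names are matched case-insensitively.
    static boolean isMapFunction(String functionName)
    {
        return MAP_FUNCTIONS.contains(functionName.toLowerCase(Locale.ENGLISH));
    }
}
```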
This will always create a new object. We should return a new object only when the criteria above are met; I assume most queries will not be rewritten by this rule.
one argument per line and leave the first line empty
assertRewritten(
        "cardinality(map(ARRAY[cardinality(map_values(m_1)),3], ARRAY[2,cardinality(map_values(m_2))]))",
        "cardinality(map(ARRAY[cardinality(m_1),3], ARRAY[2,cardinality(m_2)]))");
Can we add upper-case and mixed-case tests, like CaRdInAlItY(...)?
Force-pushed from f2e3cb4 to b72aede.
Force-pushed from 60d1e3e to c7cdf9b.
s/tmpFn/functionCall
In the Presto codebase, we don't usually use abbreviations or acronyms.
s/newFnIfRewritten/newFunctionIfRewritten
lol, this part is tricky. It's recursive, so we have to build a lot of new objects before we can tell whether a rewrite is needed. cc: @yuanzhanhku
Yes, that is a valid concern. The current implementation recursively copies all function argument pointers even if nothing needs to change. It may cause regressions for queries with lots of expressions. That being said, I don't have an easy way to address this. One idea is to have a tree-node-level bitmap that encodes which types of nodes the tree contains, so that an O(1) function could tell whether a tree contains a given node type. But this requires a large refactor.
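The bitmap idea could look roughly like the sketch below. Everything here is hypothetical (the `NodeType` enum, the `Node` class, and the field names are assumptions, not existing Presto types): each immutable node computes, at construction time, a bit mask of the node types found anywhere in its subtree, so a rewriter could skip a whole tree with a single bit test.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class NodeTypeBitmap
{
    // Hypothetical node kinds; one bit per kind.
    enum NodeType { IDENTIFIER, LITERAL, FUNCTION_CALL, CARDINALITY_CALL }

    static final class Node
    {
        final NodeType type;
        final List<Node> children;
        // Bit i is set when some node in this subtree has the type with ordinal i.
        final int containedTypes;

        Node(NodeType type, Node... children)
        {
            this.type = type;
            this.children = Collections.unmodifiableList(Arrays.asList(children.clone()));
            int mask = 1 << type.ordinal();
            for (Node child : children) {
                // Children are already built, so their masks are final.
                mask |= child.containedTypes;
            }
            this.containedTypes = mask;
        }

        // O(1): a single bit test, no tree traversal.
        boolean contains(NodeType wanted)
        {
            return (containedTypes & (1 << wanted.ordinal())) != 0;
        }
    }
}
```

With such a mask, a rule like this one could return its input immediately when `contains(NodeType.CARDINALITY_CALL)` is false. The cost is an extra field on every node and a constructor change for every node class, which is the large refactor mentioned above.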
Force-pushed from c7cdf9b to 61b2955.
A new optimizer rule is added to simplify expressions like `cardinality(map_keys(m))` into `cardinality((m))`. Same for `map_values` function.
Force-pushed from 61b2955 to 9874bde.
Simplify cardinality on map keys and values functions
A new optimizer rule is added to simplify expressions like
`cardinality(map_keys(m))` into `cardinality((m))`. Same for
`map_values` function.