Fix analyzer for lambda in aggregation#22539
Conversation
7f4bf51 to
a8dd1dc
Compare
| tempExpression = ((Cast) tempExpression).getExpression(); | ||
| } | ||
|
|
||
| if (tempExpression instanceof Literal || tempExpression instanceof ArrayConstructor) { |
There was a problem hiding this comment.
The elements in array constructor should also be Literals
There was a problem hiding this comment.
Can you add a test for this?
There was a problem hiding this comment.
Added one more test for this in abstracttestqueries
There was a problem hiding this comment.
This util is used for the reduce_agg function, and it will change the behaviour for the reduce_agg function.
For example, currently the query
SELECT id, reduce_agg(value, array[id, value], (a, b) -> a || b, (a, b) -> a || b) FROM ( VALUES (1, 2), (1, 3), (1, 4), (2, 20), (2, 30), (2, 40) ) AS t(id, value) GROUP BY id will succeed as it considers expression array[id, value] as constant.
However, after this fix, it will fail with error REDUCE_AGG only supports non-NULL literal as the initial value as array[id, value] is not considered as constant now.
cc @kaikalur
| @Override | ||
| public Expression rewriteLambdaExpression(LambdaExpression node, Boolean context, ExpressionTreeRewriter<Boolean> treeRewriter) | ||
| { | ||
| Expression result = super.rewriteLambdaExpression(node, true, treeRewriter); |
There was a problem hiding this comment.
Context set to true for lambda expression
| } | ||
|
|
||
| @Test | ||
| public void testLambdaInAggregation() |
There was a problem hiding this comment.
These two will fail without fix here
a8dd1dc to
a6294e9
Compare
| assertQuery("SELECT id, reduce_agg(value, 0, (a, b) -> a + b+0, (a, b) -> a + b) FROM ( VALUES (1, 2), (1, 3), (1, 4), (2, 20), (2, 30), (2, 40) ) AS t(id, value) GROUP BY id", "values (1, 9), (2, 90)"); | ||
| assertQuery("SELECT id, reduce_agg(value, 's', (a, b) -> concat(a, b, 's'), (a, b) -> concat(a, b, 's')) FROM ( VALUES (1, '2'), (1, '3'), (1, '4'), (2, '20'), (2, '30'), (2, '40') ) AS t(id, value) GROUP BY id", | ||
| "values (1, 's2s3s4s'), (2, 's20s30s40s')"); | ||
| assertQueryFails("SELECT id, reduce_agg(value, array[id, value], (a, b) -> a || b, (a, b) -> a || b) FROM ( VALUES (1, 2), (1, 3), (1, 4), (2, 20), (2, 30), (2, 40) ) AS t(id, value) GROUP BY id", |
There was a problem hiding this comment.
This is for the change of isConstant(Expression expression) function. Before change "array[id, value]" is considered constant and this query pass. Now it will throw exception.
There was a problem hiding this comment.
@feilong-liu explained to me that the reason we want this to fail is because semantically it doesn't make sense for the initial state of the lambda function to depend on the value of a column (because which row of the column is it even talking about? It wouldn't even be consistent within a query because on any given worker it would depend on which row it happened to read first)
Description
Fix issue #22558
Fix query compilation error for lambda function in aggregation.
Sample query
It will throw error
(GENERIC_INTERNAL_ERROR) no type found for symbol 'expr_10'This is because the 0 inside the lambda was rewritten to symbol 'expr_10'. The fix is to skip rewrite for constants within lambda function.
Motivation and Context
Fix bug.
Impact
Fix compilation error for query
Test Plan
Unit tests
Contributor checklist
Release Notes
Please follow release notes guidelines and fill in the release notes below.