v3: fix regression in aggregate handling#3129
v3: fix regression in aggregate handling#3129sougou merged 2 commits intovitessio:masterfrom sougou:aggregates
Conversation
The recent v3 scatter aggregates feature introduced a regression: If there is a complex aggregate expression, but the group by references a unique vindex, then it fails the query as unsupported. The fix was tricky because the newer code creates the OrderedAggregate primitive upfront, and then it gets dissolved later if a group by was found referencing a unique vindex. But this was possible only for non-complex expressions where the OA primitive could represent them. The newer code instead looks ahead into the group by first. If a unique vindex is found, then no OA is created. So, complex expressions can be freely pushed down into the route. This complicates the look-ahead code because it has to do it without disturbing the symbol table. But it also simplifies the OA code a bit because there's no need to dissolve it in the future. The execution tree is essentially fixed upfront.
demmer
left a comment
There was a problem hiding this comment.
I don't feel like I fully understand all the depths of the aggregation query planner, but from what I can glean this change looks good to me and the approach seems sound given the tests.
A few more test cases would be beneficial as I mentioned in the review comments, but overall this looks good.
| return hasAggregates | ||
| } | ||
|
|
||
| // groupbyHaUniqueVindex looks ahead at the group by expression to see if |
data/test/vtgate/aggr_cases.txt
Outdated
| } | ||
| } | ||
|
|
||
| # group by a unique vindex should revert to simple route, even if aggr is complex |
There was a problem hiding this comment.
For completeness, you should add a couple more unsupported test cases:
One to show that complex aggregates are unsupported when grouping by a field that is not in a vindex at all, and one when grouping by a field that is in a non-unique vindex.
| } | ||
| } | ||
|
|
||
| # group by a unique vindex where expression is qualified (alias should be ignored) |
There was a problem hiding this comment.
Also I would like to see a test case showing that this is supported where the unique vindex is aliased, e.g. select id as x, 1+count(*) from user group by x
There was a problem hiding this comment.
Also (again for completeness) it would be good to add a test case that this is supported when grouping by a unique field AND additional fields.
data/test/vtgate/aggr_cases.txt
Outdated
| } | ||
| } | ||
|
|
||
| # group by a unique vindex where it should skip non-aliasex expressions. |
|
Those are good test suggestions. I've added them all. |
@rnavarro @demmer @bbeaudreault @mberlin-bot This needs urgent review because it's blocking StitchLabs rollout. Using a private test, I've verified that the previously failing query from StitchLabs passes.
The recent v3 scatter aggregates feature introduced a regression:
If there is a complex aggregate expression, but the group by
references a unique vindex, then it fails the query as unsupported.
The fix was tricky because the newer code creates the OrderedAggregate
primitive upfront, and then it gets dissolved later if a group
by was found referencing a unique vindex. But this was possible
only for non-complex expressions where the OA primitive could
represent them.
The newer code instead looks ahead into the group by first.
If a unique vindex is found, then no OA is created. So, complex
expressions can be freely pushed down into the route. This complicates
the look-ahead code because it has to do it without disturbing
the symbol table. But it also simplifies the OA code a bit because
there's no need to dissolve it in the future. The execution tree
is essentially fixed upfront.