Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
149 commits
Select commit Hold shift + click to select a range
bb1ecd3
Wrap with interface, composition over inheritance
GalLalouche Dec 24, 2025
133f1af
more extractions
GalLalouche Dec 25, 2025
cd2f369
more extractions
GalLalouche Dec 25, 2025
ee765e1
More extractions, some code for Grouped
GalLalouche Dec 25, 2025
e1080c7
TEMP
GalLalouche Dec 26, 2025
2a67151
Before cursor
GalLalouche Dec 26, 2025
9a6b49a
cursor test 1
GalLalouche Dec 26, 2025
25691af
some test refactor
GalLalouche Dec 26, 2025
ae94d75
Before cursor
GalLalouche Dec 26, 2025
4e2efcf
More refactors and stuff
GalLalouche Dec 26, 2025
41ead05
Before cursor test
GalLalouche Dec 26, 2025
5fdb733
Before cursor
GalLalouche Dec 26, 2025
4fd333d
TEMP
GalLalouche Dec 26, 2025
14d2687
TEMP
GalLalouche Dec 27, 2025
c57fb6c
Adds GroupedTopN to the parser and the logical plan
ncordon Dec 24, 2025
3388918
fix tests
GalLalouche Dec 27, 2025
cfa5dc5
Before cursor
GalLalouche Dec 29, 2025
788b984
More Before cursor
GalLalouche Dec 29, 2025
23d2c43
Refactor GroupedRow and GroupedRowFiller to include group key handlin…
GalLalouche Dec 29, 2025
df7f4d0
Cursor
GalLalouche Dec 29, 2025
0d37ca0
Refactor GroupedRowFiller to support multiple group key extractors an…
GalLalouche Dec 29, 2025
cb1475a
Add tests for row eviction and size management in GroupedQueue
GalLalouche Dec 29, 2025
54e8c3e
Add RAM usage tests for GroupedQueue to verify memory consumption in …
GalLalouche Dec 29, 2025
eb9a3da
lost in overcounting, trying cursor
GalLalouche Dec 30, 2025
dce75b4
Temp
GalLalouche Dec 30, 2025
95ee4c0
Thanks cursor!
GalLalouche Dec 30, 2025
05f5812
Yey
GalLalouche Dec 30, 2025
7224fc0
Small refactors
GalLalouche Dec 31, 2025
ad61558
Refactors
GalLalouche Dec 31, 2025
8af289e
refactors
GalLalouche Dec 31, 2025
84b5255
Fix leak
GalLalouche Jan 5, 2026
01546fc
Some more tests, fixed some bugs
GalLalouche Jan 5, 2026
d5d06fc
Stuff
GalLalouche Jan 6, 2026
9c4311d
More stuff
GalLalouche Jan 6, 2026
de5dfd5
More tests (shard contexts)
GalLalouche Jan 7, 2026
b5ed2e8
Uses the 🐔
ncordon Jan 7, 2026
2064816
Large random
GalLalouche Jan 7, 2026
e3e9694
Uses a groupings field instead of GroupedTopN separate AST
ncordon Jan 19, 2026
ad57526
Merge remote-tracking branch 'upstream/main' into grouped_topn
ncordon Jan 20, 2026
2ea9a5c
Fixes serialization tests
ncordon Jan 20, 2026
0a09abd
Adds groupings to TopNExec
ncordon Jan 21, 2026
190934e
Removes unneccesary new line from antlr file
ncordon Jan 21, 2026
d6bbc17
Adds tests for physical and local physical part
ncordon Jan 21, 2026
78e9d36
Merge branch 'main' into feature/grouped_top_n
GalLalouche Jan 22, 2026
0828b18
Merge branch 'feature/grouped_top_n' into grouped_topn
GalLalouche Jan 23, 2026
a5f373c
Some small fixes after merge
GalLalouche Jan 23, 2026
fd2ae5f
[CI] Auto commit changes from spotless
Jan 23, 2026
dad886d
Before agent
GalLalouche Jan 26, 2026
5ca3345
After agent, before another
GalLalouche Jan 26, 2026
ef90153
Tests pass!
GalLalouche Jan 26, 2026
f854cc2
Refactor
GalLalouche Jan 26, 2026
22025cc
Refactor tests
GalLalouche Jan 26, 2026
14f8c11
Handle group key size preallocation
GalLalouche Jan 27, 2026
f4b6d0d
Reuse inheritance
GalLalouche Jan 27, 2026
fa4ec5f
Some small clean ups
ncordon Jan 28, 2026
f0f11d6
[CI] Auto commit changes from spotless
Jan 28, 2026
6255afc
Fix another TODO
GalLalouche Jan 28, 2026
d73df48
Replace List with array
GalLalouche Jan 28, 2026
4cd58b4
Merge remote-tracking branch 'upstream/main' into grouped_topn
ncordon Jan 28, 2026
ae6c126
Adds transport version gating
ncordon Jan 28, 2026
89addfc
[CI] Update transport version definitions
Jan 28, 2026
9a5e3fe
[CI] Update transport version definitions
Jan 28, 2026
8668f99
Fixes transport versions and csv test
ncordon Jan 28, 2026
79d710b
Fixes transport versions and csv test, adds a couple more
ncordon Jan 28, 2026
1ae738a
Fixes compilation
ncordon Jan 29, 2026
ada3bbe
Adds more tests
ncordon Jan 29, 2026
e9f5b3a
Merge remote-tracking branch 'upstream/main' into grouped_topn
ncordon Jan 29, 2026
d0ccc61
Test fix
GalLalouche Jan 28, 2026
73dfb22
Test refactoring
GalLalouche Jan 28, 2026
7978408
Slightly less ugly memory releasing
GalLalouche Jan 28, 2026
eea2f7c
More test refactors, mostly around RAM usages
GalLalouche Jan 29, 2026
9bbdcfb
Remove TODOs and other misc.
GalLalouche Jan 29, 2026
5d9f099
Fixes serialization
ncordon Jan 30, 2026
f28bcf8
Adds more tests
ncordon Jan 30, 2026
43e7f50
Fixes local physical plan test
ncordon Jan 30, 2026
529572d
Fixes tests (hopefully)
ncordon Feb 1, 2026
72fa42f
Adds some golden tests
ncordon Feb 1, 2026
6bf4061
Merge remote-tracking branch 'upstream/main' into grouped_topn
ncordon Feb 2, 2026
bd91a2d
This golden test seems flaky in how the output is formatted
ncordon Feb 2, 2026
1f1a503
Fixes another serialization that is failing
ncordon Feb 2, 2026
c0e2f85
Merge remote-tracking branch 'upstream/main' into grouped_topn
ncordon Feb 2, 2026
8accc92
Merge commit 'b27c97b8f1c28749678b4c7ebb47c93b8d453a86' into grouped_…
ivancea Feb 11, 2026
5de5f0e
Merge commit 'acd9e98' into grouped_topn
ivancea Feb 11, 2026
d20e37e
Fix GroupedTopNOperatorTests
ivancea Feb 11, 2026
8009667
Merge branch 'main' into grouped_topn
ivancea Feb 11, 2026
23c8caf
Merge branch 'main' into grouped_topn
ivancea Feb 11, 2026
0302dc7
Fixes more merge conflicts
ncordon Feb 12, 2026
1d11414
Merge branch 'main' into grouped_topn
ncordon Feb 12, 2026
4288248
Updated golden tests
ivancea Feb 12, 2026
b786a67
Rename LIMIT PER to LIMIT BY
ivancea Feb 12, 2026
c7efa7c
Added tests without an extra sort at the end
ivancea Feb 12, 2026
e253371
Fix bug in grouped topN when short-circuiting condition is true
ivancea Feb 12, 2026
dfeb094
Adds ReplaceLimitByExpressionWithEval
ncordon Feb 12, 2026
9628e57
Adds more tests
ncordon Feb 12, 2026
006c41f
Temp
ncordon Feb 12, 2026
9a4d7a8
Addresses the case where we have a constant grouping
ncordon Feb 12, 2026
dbdc34e
Merge remote-tracking branch 'upstream/main' into grouped_topn
ncordon Feb 12, 2026
ee6ab54
Add new test cases for sort_by_limit capability in topN.csv-spec
ncordon Feb 12, 2026
9b26b43
Merge remote-tracking branch 'upstream/main' into grouped_topn
ncordon Feb 13, 2026
04e4f43
Fixes compilation
ncordon Feb 13, 2026
58b4793
Merge remote-tracking branch 'upstream/main' into grouped_topn
ncordon Feb 13, 2026
63e0e16
Fixes more compilation errors
ncordon Feb 13, 2026
3979759
Adds golden tests
ncordon Feb 13, 2026
e69566d
Small nit
ncordon Feb 13, 2026
e7a7291
Migrated HashMap to BlockHash in Grouped TopN
ivancea Feb 13, 2026
ac43fc8
Remove extra0 files
ivancea Feb 13, 2026
859d8f4
Fixed ToString tests
ivancea Feb 13, 2026
4687029
[CI] Auto commit changes from spotless
Feb 13, 2026
e8f3590
Fixed multivalue behaviour to reproduce that of stats by
ivancea Feb 16, 2026
4c5bbba
Merge branch 'main' into grouped_topn
ivancea Feb 16, 2026
fc92e54
Multivalue fix for pages with many groups
ivancea Feb 18, 2026
44688a8
Merge branch 'main' into grouped_topn
ivancea Feb 18, 2026
b6ddcf8
[CI] Auto commit changes from spotless
Feb 18, 2026
b3e4ea2
Merge branch 'main' into grouped_topn
ivancea Feb 24, 2026
3d7dc08
Merge branch 'main' into grouped_topn
ivancea Feb 24, 2026
1985436
Format
ivancea Feb 24, 2026
1ef812b
Fixed multivalue propagation through LIMIT BY groupings
ivancea Feb 25, 2026
06e20c5
Separated TopNOperator into TopN and GroupedTopN operators
ivancea Feb 25, 2026
d7c20bf
Added logical plan tests for multivalues
ivancea Feb 25, 2026
06586c2
Enables LIMIT BY
ncordon Feb 25, 2026
aec8a15
Adds tests for limit by
ncordon Feb 25, 2026
0799d95
Fixes bug with LIMIT pushed past EVAL
ncordon Feb 25, 2026
12aa13f
Simplifies check in PushDownAndCombineLimits
ncordon Feb 26, 2026
0a39dac
Removes non needed capability
ncordon Feb 26, 2026
030f24d
Add memory accounting test to Grouped too
ivancea Feb 26, 2026
5917fc5
Merge branch 'grouped-topn-separated-operators' into grouped_topn
ivancea Feb 26, 2026
eee3368
Small nits
ncordon Feb 26, 2026
67de1b8
Uses BytesRefHashTable instead of Map
ncordon Feb 26, 2026
0e51868
Merge remote-tracking branch 'upstream/main' into grouped_topn
ncordon Feb 26, 2026
1e8c873
Merge branch 'main' into grouped_topn
ivancea Mar 2, 2026
7f39325
Reduce duplicated test code
ivancea Mar 2, 2026
2ea23c0
Updated multivalues handling to treat them as lists, and update plans…
ivancea Mar 2, 2026
20b6005
Make groups longs
ivancea Mar 2, 2026
88aa079
Merge branch 'main' into grouped_topn
ivancea Mar 3, 2026
84b24ce
Added separated operator status for grouped
ivancea Mar 3, 2026
0034e99
[CI] Auto commit changes from spotless
Mar 3, 2026
4fb118b
Add default limits ignoring LIMIT BY
ivancea Mar 4, 2026
b3bd9aa
Merge branch 'main' into grouped_topn
ivancea Mar 4, 2026
2d4bfe6
Updated GoldenTests and added LimitBy without sort test
ivancea Mar 4, 2026
cbe8a51
Changed equals to semanticEquals and added tests for it
ivancea Mar 4, 2026
2e85d86
Fix compilation issue and checkstyle
ivancea Mar 5, 2026
b835488
Merge remote-tracking branch 'upstream/main' into grouped_topn
ncordon Mar 5, 2026
a6a6a36
Avoids pushing the LimitExec to source when there are groupings
ncordon Mar 5, 2026
0bfda27
Merge branch 'main' into grouped_topn
ivancea Mar 11, 2026
58a6fe6
Remove post-merge remains
ivancea Mar 11, 2026
a495c36
Merge branch 'main' into grouped_topn
ivancea Mar 12, 2026
cfa4cb8
Merge branch 'main' into grouped_topn
ivancea Mar 13, 2026
82792cb
Remove LIMIT BY verifier check
ivancea Mar 13, 2026
d3286a4
[CI] Auto commit changes from spotless
Mar 13, 2026
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 8 additions & 0 deletions .idea/inspectionProfiles/Project_Default.xml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

5 changes: 5 additions & 0 deletions docs/changelog/140019.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
pr: 140019
summary: Adds `GroupedTopN` ESQL parsing and planning
area: ES|QL
type: feature
issues: []
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
9314000
2 changes: 1 addition & 1 deletion server/src/main/resources/transport/upper_bounds/9.4.csv
Original file line number Diff line number Diff line change
@@ -1 +1 @@
sql_optional_allow_partial_search_results,9313000
esql_limit_by,9314000
976 changes: 976 additions & 0 deletions x-pack/plugin/esql/qa/testFixtures/src/main/resources/topN.csv-spec

Large diffs are not rendered by default.

Original file line number Diff line number Diff line change
Expand Up @@ -2205,6 +2205,12 @@ public enum Cap {
*/
EXTERNAL_COMMAND(Build.current().isSnapshot()),

/**

* Enables LIMIT N BY in the LIMIT command, both with and without a preceding SORT.
*/
LIMIT_BY,

/**
* https://github.com/elastic/elasticsearch/issues/142219
*/
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -1911,7 +1911,8 @@ private InferenceFunction<?> resolveInferenceFunction(InferenceFunction<?> infer
private static class AddImplicitLimit extends ParameterizedRule<LogicalPlan, LogicalPlan, AnalyzerContext> {
@Override
public LogicalPlan apply(LogicalPlan logicalPlan, AnalyzerContext context) {
List<LogicalPlan> limits = logicalPlan.collectFirstChildren(Limit.class::isInstance);
// Find existing LIMITs without groups. Groups are unbounded, and they still require another default LIMIT
List<LogicalPlan> limits = logicalPlan.collectFirstChildren(lp -> lp instanceof Limit l && l.groupings().isEmpty());
// We check whether the query contains a TimeSeriesAggregate to determine if we should apply
// the default limit for TS queries or for non-TS queries.
// NOTE: PromqlCommand is translated to TimeSeriesAggregate during optimization.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -34,7 +34,6 @@
import org.elasticsearch.xpack.esql.plan.logical.InlineStats;
import org.elasticsearch.xpack.esql.plan.logical.Insist;
import org.elasticsearch.xpack.esql.plan.logical.Limit;
import org.elasticsearch.xpack.esql.plan.logical.LimitBy;
import org.elasticsearch.xpack.esql.plan.logical.LogicalPlan;
import org.elasticsearch.xpack.esql.plan.logical.Lookup;
import org.elasticsearch.xpack.esql.plan.logical.Project;
Expand Down Expand Up @@ -120,7 +119,6 @@ Collection<Failure> verify(LogicalPlan plan, BitSet partialMetrics) {
checkUnsupportedAttributeRenaming(p, failures);
checkInsist(p, failures);
checkLimitBeforeInlineStats(p, failures);
checkLimitBy(p, failures);
});

if (failures.hasFailures() == false) {
Expand Down Expand Up @@ -334,12 +332,6 @@ private static void checkLimitBeforeInlineStats(LogicalPlan plan, Failures failu
}
}

private static void checkLimitBy(LogicalPlan plan, Failures failures) {
if (plan instanceof LimitBy) {
failures.add(fail(plan, "LIMIT BY is not yet supported"));
}
}

private void licenseCheck(LogicalPlan plan, Failures failures) {
Consumer<Node<?>> licenseCheck = n -> {
if (n instanceof LicenseAware la && la.licenseCheck(licenseState) == false) {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -188,6 +188,21 @@ public static boolean equalsAsAttribute(Expression left, Expression right) {
return true;
}

public static boolean listSemanticEquals(List<Expression> leftList, List<Expression> rightList) {
if (leftList.size() != rightList.size()) {
return false;
}
for (int i = 0; i < leftList.size(); i++) {
Expression left = leftList.get(i);
Expression right = rightList.get(i);
assert left != null && right != null;
if (left.semanticEquals(right) == false) {
return false;
}
}
return true;
}

public static List<Tuple<Attribute, Expression>> aliases(List<? extends NamedExpression> named) {
// an alias of same name and data type can be reused (by mistake): need to use a list to collect all refs (and later report them)
List<Tuple<Attribute, Expression>> aliases = new ArrayList<>();
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -60,6 +60,7 @@
import org.elasticsearch.xpack.esql.optimizer.rules.logical.ReplaceAggregateNestedExpressionWithEval;
import org.elasticsearch.xpack.esql.optimizer.rules.logical.ReplaceAliasingEvalWithProject;
import org.elasticsearch.xpack.esql.optimizer.rules.logical.ReplaceLimitAndSortAsTopN;
import org.elasticsearch.xpack.esql.optimizer.rules.logical.ReplaceLimitByExpressionWithEval;
import org.elasticsearch.xpack.esql.optimizer.rules.logical.ReplaceOrderByExpressionWithEval;
import org.elasticsearch.xpack.esql.optimizer.rules.logical.ReplaceRegexMatch;
import org.elasticsearch.xpack.esql.optimizer.rules.logical.ReplaceRowAsLocalRelation;
Expand Down Expand Up @@ -181,6 +182,7 @@ protected static Batch<LogicalPlan> substitutions() {
// check for a trivial conversion introduced by a surrogate
new ReplaceTrivialTypeConversions(),
new ReplaceOrderByExpressionWithEval(),
new ReplaceLimitByExpressionWithEval(),
// new NormalizeAggregate(), - waits on https://github.com/elastic/elasticsearch/issues/100634
new SubstituteApproximationPlan()
);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,8 @@
import org.elasticsearch.xpack.esql.plan.logical.Project;
import org.elasticsearch.xpack.esql.plan.logical.TopN;

import static org.elasticsearch.xpack.esql.core.expression.Expressions.listSemanticEquals;

/**
* Combines a Limit immediately followed by a TopN into a single TopN.
* This is needed because {@link HoistRemoteEnrichTopN} can create new TopN nodes that are not covered by the previous rules.
Expand All @@ -25,13 +27,13 @@ public CombineLimitTopN() {

@Override
public LogicalPlan rule(Limit limit) {
if (limit.child() instanceof TopN topn) {
if (limit.child() instanceof TopN topn && listSemanticEquals(limit.groupings(), topn.groupings())) {
int thisLimitValue = Foldables.limitValue(limit.limit(), limit.sourceText());
int topNValue = Foldables.limitValue(topn.limit(), topn.sourceText());
if (topNValue <= thisLimitValue) {
return topn;
} else {
return new TopN(topn.source(), topn.child(), topn.order(), limit.limit(), topn.local());
return new TopN(topn.source(), topn.child(), topn.order(), limit.limit(), topn.groupings(), topn.local());
}
}
if (limit.child() instanceof Project proj) {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -60,7 +60,7 @@ protected LogicalPlan rule(Enrich en) {
if (missingAttributes.isEmpty()) {
// No need for aliasing - then it's simple, just copy the TopN on top of Enrich and mark the original as local
LogicalPlan transformedEnrich = en.transformDown(TopN.class, t -> t == topN ? topN.withLocal(true) : t);
return new TopN(topN.source(), transformedEnrich, topN.order(), topN.limit(), false);
return new TopN(topN.source(), transformedEnrich, topN.order(), topN.limit(), topN.groupings(), false);
} else {
Set<String> missingNames = new HashSet<>(Expressions.names(missingAttributes));
AttributeMap.Builder<Alias> aliasesForReplacedAttributesBuilder = AttributeMap.builder();
Expand Down Expand Up @@ -127,7 +127,7 @@ protected LogicalPlan rule(Enrich en) {
}

// Create a copy of TopN with the rewritten attributes on top on Enrich
var copyTop = new TopN(topN.source(), transformedEnrich, newOrder, topN.limit(), false);
var copyTop = new TopN(topN.source(), transformedEnrich, newOrder, topN.limit(), topN.groupings(), false);
// And use the Project to remove the fields that we don't need anymore
return new Project(en.source(), copyTop, outputs);
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -39,9 +39,10 @@ public LogicalPlan apply(LogicalPlan plan, LogicalOptimizerContext ctx) {
AttributeMap.Builder<Expression> builder = AttributeMap.builder();
builder.putAll(collectRefs);

// Exclude multi-value (List) literals used as GROUP BY keys from propagation.
// Exclude multi-value (List) literals used as grouping keys from propagation.
//
// Rationale: GROUP BY explodes multi-value fields into single values. For example:
// Rationale: both GROUP BY and LIMIT BY explode multi-value fields into single values.
// For example:
Comment on lines +44 to +45
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not true anymore, we need to change this

// ROW a = [1, 2] | STATS x = a + SUM(a) BY a
// Before aggregation, `a` is multi-valued [1, 2]. After GROUP BY, `a` becomes single-valued
// (either 1 or 2 per group). If we propagate the original multi-value literal [1, 2] into
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -69,7 +69,7 @@ private static LogicalPlan pruneColumns(LogicalPlan plan, AttributeSet.Builder u

// TODO: revisit with every new command
// skip nodes that simply pass the input through and use no references
if (p instanceof Limit || p instanceof Sample) {
if (p instanceof Limit limit && limit.groupings().isEmpty() || p instanceof Sample) {
return p;
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,10 @@
package org.elasticsearch.xpack.esql.optimizer.rules.logical;

import org.elasticsearch.xpack.esql.core.expression.Alias;
import org.elasticsearch.xpack.esql.core.expression.Attribute;
import org.elasticsearch.xpack.esql.core.expression.Expression;
import org.elasticsearch.xpack.esql.core.expression.FoldContext;
import org.elasticsearch.xpack.esql.core.expression.NameId;
import org.elasticsearch.xpack.esql.core.util.Holder;
import org.elasticsearch.xpack.esql.expression.function.fulltext.Score;
import org.elasticsearch.xpack.esql.optimizer.LogicalOptimizerContext;
Expand All @@ -31,7 +34,11 @@
import org.elasticsearch.xpack.esql.rule.Rule;

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

import static org.elasticsearch.xpack.esql.core.expression.Expressions.listSemanticEquals;

public final class PushDownAndCombineLimits extends OptimizerRules.ParameterizedOptimizerRule<Limit, LogicalOptimizerContext>
implements
Expand All @@ -50,7 +57,7 @@ private PushDownAndCombineLimits(boolean local) {

@Override
public LogicalPlan rule(Limit limit, LogicalOptimizerContext ctx) {
if (limit.child() instanceof Limit childLimit) {
if (limit.child() instanceof Limit childLimit && listSemanticEquals(childLimit.groupings(), limit.groupings())) {
return combineLimits(limit, childLimit, ctx.foldCtx());
} else if (limit.child() instanceof UnaryPlan unary) {
if (unary instanceof Eval
Expand All @@ -61,6 +68,8 @@ public LogicalPlan rule(Limit limit, LogicalOptimizerContext ctx) {
if (false == local && unary instanceof Eval && evalAliasNeedsData((Eval) unary)) {
// do not push down the limit through an eval that needs data (e.g. a score function) during initial planning
return limit;
} else if (groupingsReferenceAttributeDefinedByChild(limit, unary)) {
return limit;
} else {
return unary.replaceChild(limit.replaceChild(unary.child()));
}
Expand All @@ -82,7 +91,7 @@ public LogicalPlan rule(Limit limit, LogicalOptimizerContext ctx) {
// this applies for cases such as | limit 1 | sort field | limit 10
else {
Limit descendantLimit = descendantLimit(unary);
if (descendantLimit != null) {
if (descendantLimit != null && descendantLimit.groupings().isEmpty()) {
var l1 = (int) limit.limit().fold(ctx.foldCtx());
var l2 = (int) descendantLimit.limit().fold(ctx.foldCtx());
if (l2 <= l1) {
Expand Down Expand Up @@ -163,7 +172,7 @@ private static Limit combineLimits(Limit upper, Limit lower, FoldContext ctx) {
// If any of them is local, we want the local limit
return lower.local() ? lower : lower.withLocal(upper.local());
} else {
return new Limit(upper.source(), upper.limit(), lower.child(), upper.duplicated(), upper.local());
return new Limit(upper.source(), upper.limit(), lower.child(), upper.groupings(), upper.duplicated(), upper.local());
}
}

Expand All @@ -183,14 +192,35 @@ private boolean evalAliasNeedsData(Eval eval) {
}

/**
* Checks the existence of another 'visible' Limit, that exists behind an operation that doesn't produce output more data than
* its input (that is not a relation/source nor aggregation).
* Returns {@code true} if any attribute referenced by the limit's groupings is defined by the child node
* (i.e. present in the child's output but absent from the grandchild's output). Pushing a grouped limit
* past such a child would leave the grouping attribute unresolved.
*/
private static boolean groupingsReferenceAttributeDefinedByChild(Limit limit, UnaryPlan child) {
if (limit.groupings().isEmpty()) {
return false;
}
Set<NameId> grandchildIds = new HashSet<>();
for (Attribute a : child.child().output()) {
grandchildIds.add(a.id());
}
for (Expression g : limit.groupings()) {
if (g instanceof Attribute a && grandchildIds.contains(a.id()) == false) {
return true;
}
}
return false;
}

/**
* Checks the existence of another 'visible' non-grouping Limit, that exists behind an operation that doesn't produce more output data
* than its input (that is not a relation/source nor aggregation).
* P.S. Typically an aggregation produces less data than the input.
*/
private static Limit descendantLimit(UnaryPlan unary) {
UnaryPlan plan = unary;
while (plan instanceof Aggregate == false) {
if (plan instanceof Limit limit) {
if (plan instanceof Limit limit && limit.groupings().isEmpty()) {
return limit;
} else if (plan instanceof MvExpand) {
// the limit that applies to mv_expand shouldn't be changed
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -15,11 +15,13 @@
public final class ReplaceLimitAndSortAsTopN extends OptimizerRules.OptimizerRule<Limit> {

@Override
protected LogicalPlan rule(Limit plan) {
LogicalPlan p = plan;
if (plan.child() instanceof OrderBy o) {
p = new TopN(o.source(), o.child(), o.order(), plan.limit(), false);
protected LogicalPlan rule(Limit limit) {
LogicalPlan p = limit;

if (limit.child() instanceof OrderBy o) {
p = new TopN(o.source(), o.child(), o.order(), limit.limit(), limit.groupings(), false);
}

return p;
}
}
Loading
Loading