[ES|QL] Remove implicit limit appended to each subquery branch #139058
Merged
fang-xing-esql merged 41 commits into elastic:main on Jan 14, 2026
Collaborator
Hi @fang-xing-esql, I've created a changelog YAML for you.
Member
Author
There are some unrelated failures in release tests, remove the …
Collaborator
Pinging @elastic/es-analytical-engine (Team:Analytics)
astefan
approved these changes
Jan 12, 2026
Contributor
astefan
left a comment
LGTM.
Left only some minor comments.
...ack/src/javaRestTest/java/org/elasticsearch/xpack/esql/heap_attack/HeapAttackSubqueryIT.java
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java
...org/elasticsearch/xpack/esql/optimizer/rules/logical/PushDownFilterAndLimitIntoUnionAll.java
...org/elasticsearch/xpack/esql/optimizer/rules/logical/PushDownFilterAndLimitIntoUnionAll.java
Member
Talked with @fang-xing-esql, all of the changes this PR powers are currently only available in SNAPSHOT builds. So it's quite safe to merge without fixing the OOMs. So long as we lock the removal of the SNAPSHOT behind fixing the OOMs, we're safe. I'll look at the PR soon with that in mind.
nik9000
approved these changes
Jan 13, 2026
Member
nik9000
left a comment
Found an out of date comment, otherwise looks good to me.
...heap-attack/src/javaRestTest/java/org/elasticsearch/xpack/esql/heap_attack/HeapAttackIT.java
spinscale
pushed a commit
to spinscale/elasticsearch
that referenced
this pull request
Jan 21, 2026
…stic#139058) * remove implicit limit appended to each subquery
alex-spies
added a commit
to alex-spies/elasticsearch
that referenced
this pull request
Feb 2, 2026
9.3 does not have elastic#139058, so the implicit limits at the top of subquery branches are still in place. Adjust the expectations accordingly.
elasticsearchmachine
pushed a commit
that referenced
this pull request
Feb 2, 2026
…) (#141675)

* ESQL: Fix injected attributes's IDs in UnionAll branches (#141262)

This fixes the generation of name IDs for the attributes that correspond to the unmapped fields and are pushed to different branches in `UnionAll`. So far, one set of IDs was generated and reused for all subplans. This is now updated to an individual set per subplan.

Along the change, the handling of `Fork` in `ResolveUnmapped` has been somewhat simplified. Also, more unit tests have been completed (where the plans are simple enough) and the plan comments updated to replace the `EsqlProject` with the now merged `Project`.

A minor collateral proposed change: the CSV spec-based tests skipped due to missing capabilities are now logged.

(cherry picked from commit 8e3113c)

* Fix tests

9.3 does not have #139058, so the implicit limits at the top of subquery branches are still in place. Adjust the expectations accordingly.

* Checkstyle

---------

Co-authored-by: Bogdan Pintea <bogdan.pintea@elastic.co>
Resolves: #138106
The implicit `limit` appended to each subquery restricts what subqueries can do, especially when working with `inline stats`. It may also cause non-deterministic results from subqueries when there is no `sort` in the subquery to fix the order of the intermediate results.

This PR removes the implicit `limit` appended to each subquery branch, which enables more subquery capabilities. The risk is that the intermediate results returned by a subquery can be huge: they need to be processed in batches, or trip the circuit breaker (CB) if they are too big to process.

The major changes are listed below. There is NO change to `Fork`. More follow-ups on heap attack tests are needed. This PR focuses on functionality, making sure correct results are returned without the implicit `limit` appended to each subquery. The subquery feature is still behind snapshot.

* `HeapAttackSubqueryIT` is added to catch potential OOMs or circuit breaking exceptions (CBEs) caused by large intermediate results from subqueries. Some tests with 8 subqueries hit an OOM instead of a CBE. PRs and new issues have been created to address those OOMs as follow-ups. Currently most of the subquery heap attack tests run with 2 subqueries. The root causes of the OOMs and the `TODO`s are commented in each test.
* `PushDownFilterAndLimitIntoUnionAll` is updated accordingly, now that `Analyzer` no longer adds the implicit `limit` to each subquery branch.
* `knn` function. When a subquery contains a `knn`, it still requires a `limit` to appear after the `knn`, because `knn` has an `implicitK` that is not serialized or sent to the remote nodes. This was not an issue previously, as an implicit `limit` was appended to each branch. The `PushDownFilterAndLimitIntoUnionAll` rule has been updated to handle this `knn` case. Without a `limit` following the `knn`, the query either fails `LogicalVerifier` or produces incorrect results.
* The `sort` operator is in a similar situation, as unbounded sort is not supported. If a subquery contains a `sort` without a `limit`, removing the implicit `limit` previously appended to each branch causes `LogicalVerifier` to fail. In this PR, we do not append an implicit `limit` for this case. Instead, `LogicalVerifier` fails and informs users of the unbounded `sort`. If support for this query pattern is required in the future, an implicit `limit` could be appended, using an approach similar to the one applied for `knn`.

Some details on the OOMs hit by the new subquery heap attack tests
The new subquery heap attack tests uncovered some OOMs. They exposed untracked memory used while reading (unsorted or sorted) data from Lucene. The related issues and PRs are listed below and will be fixed as follow-ups. This PR focuses on functionality rather than on memory-intensive queries.
* … `PackedValuesBlockHash.bytes` to `BreakingBytesRefBuilder` for better memory tracking.
* … `sort`s (with many large fields) to Lucene may take a lot of untracked memory
* … `keyword`/`text` fields may take a lot of untracked memory
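
To make the OOM versus CBE distinction above concrete, here is a minimal Java sketch of the general idea behind tracked memory: a buffer reserves bytes with a breaker before it grows, so an oversized request fails with a controlled exception instead of exhausting the heap. `ToyBreaker` and `TrackedBytesBuilder` are hypothetical names invented for this illustration. They are not the Elasticsearch circuit breaker or `BreakingBytesRefBuilder` APIs, and the real classes may work differently.

```java
import java.util.Arrays;

public class TrackedBufferSketch {

    /** Hypothetical stand-in for a circuit breaker: counts reserved bytes against a fixed budget. */
    static final class ToyBreaker {
        private final long limitBytes;
        private long usedBytes;

        ToyBreaker(long limitBytes) {
            this.limitBytes = limitBytes;
        }

        /** Reserves bytes, or throws if the budget would be exceeded. */
        void reserve(long bytes) {
            if (usedBytes + bytes > limitBytes) {
                throw new IllegalStateException(
                    "toy circuit breaker tripped: " + (usedBytes + bytes) + " > " + limitBytes);
            }
            usedBytes += bytes;
        }

        void release(long bytes) {
            usedBytes -= bytes;
        }

        long used() {
            return usedBytes;
        }
    }

    /** Hypothetical growable byte buffer that reserves memory with the breaker before it grows. */
    static final class TrackedBytesBuilder implements AutoCloseable {
        private final ToyBreaker breaker;
        private byte[] bytes;
        private int length;

        TrackedBytesBuilder(ToyBreaker breaker, int initialCapacity) {
            this.breaker = breaker;
            breaker.reserve(initialCapacity);
            this.bytes = new byte[initialCapacity];
        }

        void append(byte[] src) {
            int needed = length + src.length;
            if (needed > bytes.length) {
                int newCapacity = Math.max(needed, bytes.length * 2);
                // Reserve the extra capacity first, so an oversized request fails before allocating.
                breaker.reserve(newCapacity - bytes.length);
                bytes = Arrays.copyOf(bytes, newCapacity);
            }
            System.arraycopy(src, 0, bytes, length, src.length);
            length += src.length;
        }

        int length() {
            return length;
        }

        @Override
        public void close() {
            // Give the reserved bytes back once the buffer is no longer needed.
            breaker.release(bytes.length);
            bytes = new byte[0];
        }
    }

    public static void main(String[] args) {
        ToyBreaker breaker = new ToyBreaker(1024);
        try (TrackedBytesBuilder builder = new TrackedBytesBuilder(breaker, 16)) {
            builder.append(new byte[512]);
            try {
                builder.append(new byte[4096]);
            } catch (IllegalStateException e) {
                System.out.println(e.getMessage());
            }
        }
        System.out.println("bytes still reserved: " + breaker.used());
    }
}
```

In this toy model the second `append` is rejected before any allocation happens, which is the behavior the heap attack tests expect from tracked memory. Memory that never goes through such accounting cannot be rejected this way, which matches the OOMs described above.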