Add circuit breaker for query construction to prevent OOM from automaton-based queries #142150
Merged
drempapis merged 89 commits into elastic:main on Mar 3, 2026
spinscale reviewed Feb 12, 2026: ...ternalClusterTest/java/org/elasticsearch/indices/memory/breaker/CircuitBreakerServiceIT.java (outdated)
spinscale reviewed Feb 12, 2026: server/src/main/java/org/elasticsearch/index/query/FuzzyQueryBuilder.java (outdated)
…is/elasticsearch into fix/use_cb_for_lucene_automatons
spinscale reviewed Feb 12, 2026: server/src/main/java/org/elasticsearch/index/query/WildcardQueryBuilder.java (outdated)
drempapis added a commit to drempapis/elasticsearch that referenced this pull request, Mar 3, 2026: …ton-based queries (elastic#142150) (cherry picked from commit c575665)
elasticsearchmachine pushed a commit that referenced this pull request, Mar 3, 2026: …automaton-based queries (#142150) (#143456)
elasticsearchmachine pushed a commit that referenced this pull request, Mar 4, 2026: … automaton-based queries (#142150) (#143469)
Solves: #87024
Problem
Automaton-based queries can cause OOMs. Example: a boolean query with 2,000 wildcard
subqueries matching long strings resulted in 6GB+ of in-memory automata, which killed a node.
This PR integrates circuit breaker accounting at query construction time to catch aggregate memory growth from many automaton-based sub-queries before it leads to OOM.
The accounting is placed in StringFieldType, the base class where keyword and text fields construct their Lucene queries (WildcardQuery, RegexpQuery, PrefixQuery, TermRangeQuery, FuzzyQuery). This is the convergence point for all callers, so a single instrumentation point covers:
- Direct query builders: WildcardQueryBuilder, RegexpQueryBuilder, PrefixQueryBuilder, RangeQueryBuilder, FuzzyQueryBuilder.
- Composite query parsers: query_string
- Other callers
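The "single instrumentation point" idea can be sketched in plain Java. This is a simplified illustration, not the code from this PR: FieldTypeSketch and FakeQuery are hypothetical stand-ins for StringFieldType and the Lucene query classes, and the byte estimate is made up. The only thing carried over from the description is that every construction method funnels through one accounting hook.

```java
import java.util.function.LongConsumer;

public class InstrumentationPointSketch {

    /** Hypothetical stand-in for a Lucene query that can report its heap footprint. */
    static final class FakeQuery {
        final String kind;
        final String pattern;

        FakeQuery(String kind, String pattern) {
            this.kind = kind;
            this.pattern = pattern;
        }

        long ramBytesUsed() {
            // Made-up estimate: fixed overhead plus a cost per pattern character.
            return 64L + 16L * pattern.length();
        }
    }

    /** Hypothetical stand-in for the base field type where all queries are built. */
    static class FieldTypeSketch {
        private final LongConsumer accounting;

        FieldTypeSketch(LongConsumer accounting) {
            this.accounting = accounting;
        }

        FakeQuery wildcardQuery(String pattern) {
            return account(new FakeQuery("wildcard", pattern));
        }

        FakeQuery prefixQuery(String prefix) {
            return account(new FakeQuery("prefix", prefix));
        }

        FakeQuery regexpQuery(String regex) {
            return account(new FakeQuery("regexp", regex));
        }

        // The single hook: every construction method above passes through here,
        // so callers (builders, parsers, others) need no accounting code of their own.
        private FakeQuery account(FakeQuery query) {
            accounting.accept(query.ramBytesUsed());
            return query;
        }
    }

    public static void main(String[] args) {
        long[] total = {0};
        FieldTypeSketch fieldType = new FieldTypeSketch(bytes -> total[0] += bytes);
        fieldType.wildcardQuery("foo*bar");
        fieldType.prefixQuery("foo");
        fieldType.regexpQuery("fo+");
        System.out.println("accounted bytes: " + total[0]);
    }
}
```

Because the hook lives in the shared base class of the sketch, adding a new caller does not require touching the accounting logic, which is the property the description relies on.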
After each Lucene query is constructed, its ramBytesUsed() (via the Accountable interface) is reported to the request circuit breaker. In a composite query (e.g. BoolQuery with 1,000 wildcard clauses), the breaker trips as soon as the memory accumulated by the first N sub-queries exceeds the limit, before the remaining clauses are built and all memory is allocated.
- SearchExecutionContext tracks accumulated query construction memory.
- A Releasable registered on SearchContext calls releaseQueryConstructionMemory() on close, returning all accounted memory to the circuit breaker after query execution completes.
Note
This PR does not protect against a single huge automaton, e.g. ".*a.*b.*c.*d.*e.*f.*g.*h.*", where the OOM can happen during construction. In this implementation, the circuit breaker only fires after the automaton is built. I'll try to mitigate this in a follow-up PR by accounting during the construction phase and building the automaton incrementally.
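The accumulate, trip, and release lifecycle described above can be illustrated with a toy model. This is a sketch under stated assumptions, not the Elasticsearch implementation: ToyBreaker, ToyContext, reserve, and buildClauses are hypothetical names, and the byte sizes are arbitrary. Only releaseQueryConstructionMemory() mirrors a name from the description.

```java
public class QueryConstructionAccountingSketch {

    /** Thrown by the toy breaker when a reservation would exceed its limit. */
    static class BreakerTrippedException extends RuntimeException {
        BreakerTrippedException(String message) {
            super(message);
        }
    }

    /** Toy breaker (hypothetical): tracks reserved bytes against a fixed limit. */
    static class ToyBreaker {
        private final long limitBytes;
        private long usedBytes;

        ToyBreaker(long limitBytes) {
            this.limitBytes = limitBytes;
        }

        void reserve(long bytes) {
            if (usedBytes + bytes > limitBytes) {
                throw new BreakerTrippedException(
                    "would use " + (usedBytes + bytes) + " bytes, limit is " + limitBytes);
            }
            usedBytes += bytes;
        }

        void release(long bytes) {
            usedBytes -= bytes;
        }

        long used() {
            return usedBytes;
        }
    }

    /** Toy per-request context (hypothetical): remembers what it reserved and gives it back on close. */
    static class ToyContext implements AutoCloseable {
        private final ToyBreaker breaker;
        private long accounted;

        ToyContext(ToyBreaker breaker) {
            this.breaker = breaker;
        }

        void addQueryConstructionMemory(long bytes) {
            breaker.reserve(bytes);   // throws if the aggregate would exceed the limit
            accounted += bytes;
        }

        void releaseQueryConstructionMemory() {
            breaker.release(accounted);
            accounted = 0;
        }

        @Override
        public void close() {
            releaseQueryConstructionMemory();
        }
    }

    /** Accounts clauses one by one; returns how many fit before the breaker tripped. */
    static int buildClauses(ToyContext context, int clauseCount, long bytesPerClause) {
        int built = 0;
        try {
            for (int i = 0; i < clauseCount; i++) {
                // As in the note above, each clause is accounted only after it exists,
                // so this scheme bounds the aggregate, not a single oversized clause.
                context.addQueryConstructionMemory(bytesPerClause);
                built++;
            }
        } catch (BreakerTrippedException e) {
            System.out.println("breaker tripped after " + built + " clauses: " + e.getMessage());
        }
        return built;
    }

    public static void main(String[] args) {
        ToyBreaker breaker = new ToyBreaker(100);
        try (ToyContext context = new ToyContext(breaker)) {
            int built = buildClauses(context, 1000, 3);
            System.out.println("clauses built: " + built + ", bytes held: " + breaker.used());
        }
        System.out.println("bytes held after close: " + breaker.used());
    }
}
```

In the sketch, try-with-resources plays the role of the Releasable: the reservation is returned whether or not the breaker tripped, so a rejected request does not leave memory permanently accounted.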