
Add circuit breaker for query construction to prevent OOM from automaton-based queries #142150

Merged
drempapis merged 89 commits into elastic:main from drempapis:fix/use_cb_for_lucene_automatons
Mar 3, 2026


@drempapis
Contributor

@drempapis drempapis commented Feb 9, 2026

Solves: #87024

Problem

Automaton-based queries can cause OOMs. Example: a boolean query with 2,000 wildcard
subqueries matching long strings resulted in 6GB+ of in-memory automata, which killed a node.

This PR integrates circuit breaker accounting at query construction time to catch aggregate memory growth from many automaton-based sub-queries before it leads to OOM.

  1. Memory Accounting at MappedFieldType Level:
    The accounting is placed in StringFieldType, the base class where keyword and text fields construct their Lucene queries (WildcardQuery, RegexpQuery, PrefixQuery, TermRangeQuery, FuzzyQuery). This is the convergence point for all callers, so a single instrumentation point covers:
  • Direct query builders: WildcardQueryBuilder, RegexpQueryBuilder, PrefixQueryBuilder, RangeQueryBuilder, FuzzyQueryBuilder.
  • Composite query parsers: query_string.
  • Other callers.

    After each Lucene query is constructed, its ramBytesUsed() (via the Accountable interface) is reported to the request circuit breaker. In a composite query (e.g. a BoolQuery with 1,000 wildcard clauses), the breaker trips as soon as the accumulated size of the sub-queries built so far exceeds the limit, before the memory for the remaining clauses is allocated.

  2. Memory Release
  • SearchExecutionContext tracks accumulated query construction memory.
  • A Releasable registered on SearchContext calls releaseQueryConstructionMemory() on close, returning all accounted memory to the circuit breaker after query execution completes.
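
The account-then-release flow described in the two steps above can be sketched as follows. This is a minimal, self-contained illustration and not the code from this PR: `SimpleBreaker`, `ConstructionContext`, and `buildClauses` are hypothetical stand-ins for the request circuit breaker, the per-request tracking in SearchExecutionContext, and the loop that builds sub-queries. The per-clause size is passed in as a number instead of being read from a real Lucene query's ramBytesUsed().

```java
// Hypothetical sketch: none of these classes exist in Elasticsearch under these names.
final class QueryConstructionSketch {

    /** Stand-in for a request circuit breaker with a fixed byte limit. */
    static final class SimpleBreaker {
        private final long limitBytes;
        private long usedBytes;

        SimpleBreaker(long limitBytes) {
            this.limitBytes = limitBytes;
        }

        /** Reserves bytes, or throws without reserving if the limit would be exceeded. */
        void addOrBreak(long bytes, String label) {
            long newUsed = usedBytes + bytes;
            if (newUsed > limitBytes) {
                throw new IllegalStateException(
                    "[" + label + "] would use " + newUsed + " bytes, limit is " + limitBytes);
            }
            usedBytes = newUsed;
        }

        void release(long bytes) {
            usedBytes -= bytes;
        }

        long used() {
            return usedBytes;
        }
    }

    /** Stand-in for the per-request context that remembers what it has accounted. */
    static final class ConstructionContext implements AutoCloseable {
        private final SimpleBreaker breaker;
        private long accountedBytes;

        ConstructionContext(SimpleBreaker breaker) {
            this.breaker = breaker;
        }

        /** Called after a sub-query is built, with the size that query reports. */
        void account(long ramBytesUsed) {
            breaker.addOrBreak(ramBytesUsed, "query_construction");
            accountedBytes += ramBytesUsed;
        }

        /** Returns everything accounted by this request to the breaker. */
        @Override
        public void close() {
            breaker.release(accountedBytes);
            accountedBytes = 0;
        }
    }

    /** Builds clauses one at a time; returns how many were accounted before the breaker tripped. */
    static int buildClauses(ConstructionContext ctx, int clauseCount, long bytesPerClause) {
        int built = 0;
        try {
            for (int i = 0; i < clauseCount; i++) {
                // A real implementation would construct the query here, then report its size.
                ctx.account(bytesPerClause);
                built++;
            }
        } catch (IllegalStateException tripped) {
            // Construction stops here; the remaining clauses are never built.
        }
        return built;
    }

    public static void main(String[] args) {
        SimpleBreaker breaker = new SimpleBreaker(10L * 1024 * 1024); // 10 MiB limit
        try (ConstructionContext ctx = new ConstructionContext(breaker)) {
            int built = buildClauses(ctx, 1000, 3L * 1024 * 1024);    // 3 MiB per clause
            System.out.println("clauses accounted before trip: " + built);
            System.out.println("bytes held during execution: " + breaker.used());
        }
        System.out.println("bytes held after close: " + breaker.used());
    }
}
```

With a 10 MiB limit and 3 MiB per clause, the sketch stops at the fourth clause, so only three of the 1,000 requested clauses are ever accounted, and closing the context returns the breaker to zero. Note that, as in the PR, each clause is accounted only after it has been built.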

Note

This PR does not protect against a single huge automaton, e.g. ".*a.*b.*c.*d.*e.*f.*g.*h.*", where the OOM could happen during construction. In this implementation, the circuit breaker only fires after the automaton is built. I'll try to mitigate this in a follow-up PR by accounting during the construction phase and building the automaton incrementally.

@drempapis added the >bug, Team:Search Foundations, :Search Foundations/Search, and v9.4.0 labels on Feb 9, 2026
@elasticsearchmachine
Collaborator

Hi @drempapis, I've created a changelog YAML for you.

@drempapis drempapis requested a review from martijnvg February 10, 2026 10:09
@drempapis drempapis marked this pull request as ready for review February 10, 2026 10:11
@drempapis drempapis requested a review from a team as a code owner February 10, 2026 10:11
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-foundations (Team:Search Foundations)

@drempapis added the auto-backport, v8.19.0, v9.2.0, and v9.3.0 labels on Mar 3, 2026
@drempapis drempapis merged commit c575665 into elastic:main Mar 3, 2026
35 checks passed
@elasticsearchmachine
Collaborator

💔 Backport failed

Status Branch Result
8.19 Commit could not be cherrypicked due to conflicts
9.2 Commit could not be cherrypicked due to conflicts
9.3 Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 142150

drempapis added a commit to drempapis/elasticsearch that referenced this pull request Mar 3, 2026
drempapis added a commit to drempapis/elasticsearch that referenced this pull request Mar 3, 2026
elasticsearchmachine pushed a commit that referenced this pull request Mar 3, 2026
…automaton-based queries (#142150) (#143456)

* Add circuit breaker for query construction to prevent OOM from automaton-based queries (#142150)

(cherry picked from commit c575665)

* [CI] Auto commit changes from spotless

---------

Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
elasticsearchmachine pushed a commit that referenced this pull request Mar 3, 2026
drempapis added a commit to drempapis/elasticsearch that referenced this pull request Mar 3, 2026
drempapis added a commit to drempapis/elasticsearch that referenced this pull request Mar 3, 2026
@drempapis
Contributor Author

💚 All backports created successfully

Status Branch Result
9.2
8.19

Questions?

Please refer to the Backport tool documentation

szybia added a commit to szybia/elasticsearch that referenced this pull request Mar 3, 2026
…cations

* upstream/main: (56 commits)
  Mute org.elasticsearch.compute.lucene.read.ValueSourceReaderTypeConversionTests testLoadAll elastic#143471
  [DOCS] Fix ES|QL function and commands lists versioning metadata (elastic#143402)
  Fix MMROperatorTests (elastic#143453)
  Fix CSV-escaped quotes in generated docs examples (elastic#143449)
  Fix SQL client parsing of array header values (elastic#143408)
  ESQL: Add extended distribution tests and fault injection for external sources (elastic#143420)
  ESQL: Fix datasource test failures on Windows and FIPS (elastic#143417)
  Add circuit breaker for query construction to prevent OOM from automaton-based queries (elastic#142150)
  Cleanup SpecIT logging configuration (elastic#143365)
  ESQL: Prune unused regex extract nodes in optimizer (elastic#140982)
  Ensure supported locale outside of Entitlements check (elastic#143405)
  feat(es|ql): add dense_vector support in coalesce (elastic#142974)
  [Test] Unmute SnapshotStressTestsIT (elastic#143359)
  Mute org.elasticsearch.xpack.esql.CsvIT test {csv-spec:lookup-join.LookupJoinWithCoalesceFilterOnRight} elastic#143443
  Mute org.elasticsearch.xpack.esql.CsvIT test {csv-spec:lookup-join.MvJoinKeyOnTheLookupIndex} elastic#143442
  ESQL: Fix CCS exchange sink cleanup (elastic#143325)
  Mute org.elasticsearch.xpack.esql.CsvIT test {csv-spec:lookup-join.MvJoinKeyOnTheLookupIndexAfterStats} elastic#143434
  Mute org.elasticsearch.xpack.esql.CsvIT test {csv-spec:lookup-join.MvJoinKeyFromRow} elastic#143432
  Mute org.elasticsearch.xpack.esql.qa.mixed.MixedClusterEsqlSpecIT test {csv-spec:k8s-timeseries.Datenanos_derivative_compared_to_rate} elastic#143431
  Mute org.elasticsearch.multiproject.test.CoreWithMultipleProjectsClientYamlTestSuiteIT test {yaml=search.retrievers/result-diversification/10_mmr_result_diversification_retriever/Test MMR result diversification single index float type} elastic#143430
  ...
tballison pushed a commit to tballison/elasticsearch that referenced this pull request Mar 3, 2026
GalLalouche pushed a commit to GalLalouche/elasticsearch that referenced this pull request Mar 3, 2026
elasticsearchmachine pushed a commit that referenced this pull request Mar 4, 2026
… automaton-based queries (#142150) (#143469)

* Add circuit breaker for query construction to prevent OOM from automaton-based queries (#142150)

(cherry picked from commit c575665)

* Add circuit breaker for query construction to prevent OOM from automaton-based queries (#142150)

* [CI] Auto commit changes from spotless

* update after review

---------

Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
shmuelhanoch pushed a commit to shmuelhanoch/elasticsearch that referenced this pull request Mar 4, 2026

Labels

auto-backport Automatically create backport pull requests when merged >bug :Search Foundations/Search Catch all for Search Foundations Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch v8.19.0 v9.2.0 v9.3.0 v9.4.0


3 participants