
Use a copy of the SearchExecutionContext for each Percolator execution #142765

Merged
davidkyle merged 17 commits into elastic:main from davidkyle:auto-scope
Mar 10, 2026

Conversation

@davidkyle
Member

@davidkyle davidkyle commented Feb 20, 2026

The AutoPrefilteringScope member of SearchExecutionContext is not thread-safe, but Percolate accesses it from multiple threads, resulting in the stack trace below. SearchExecutionContext::NestedScope can also be used in a non-thread-safe manner in the Percolator query.

This PR adds a copyForConcurrentUse() method to SearchExecutionContext, which creates a shallow copy of the context with new instances of the mutable members. These members are no longer shared between threads, which removes the concurrent access issues.

Caused by: java.util.NoSuchElementException
	at java.util.LinkedList.removeFirst(LinkedList.java:282) ~[?:?]
	at java.util.LinkedList.pop(LinkedList.java:813) ~[?:?]
	at org.elasticsearch.index.query.support.AutoPrefilteringScope.pop(AutoPrefilteringScope.java:47) ~[elasticsearch-9.4.0-SNAPSHOT.jar:9.4.0-SNAPSHOT]
	at org.elasticsearch.index.query.support.AutoPrefilteringScope.close(AutoPrefilteringScope.java:59) ~[elasticsearch-9.4.0-SNAPSHOT.jar:9.4.0-SNAPSHOT]
	at org.elasticsearch.index.query.BoolQueryBuilder.addBooleanClauses(BoolQueryBuilder.java:333) ~[elasticsearch-9.4.0-SNAPSHOT.jar:9.4.0-SNAPSHOT]
	at org.elasticsearch.index.query.BoolQueryBuilder.doToQuery(BoolQueryBuilder.java:303) ~[elasticsearch-9.4.0-SNAPSHOT.jar:9.4.0-SNAPSHOT]
	at org.elasticsearch.index.query.AbstractQueryBuilder.toQuery(AbstractQueryBuilder.java:119) ~[elasticsearch-9.4.0-SNAPSHOT.jar:9.4.0-SNAPSHOT]
	at org.elasticsearch.percolator.PercolateQueryBuilder.lambda$createStore$9(PercolateQueryBuilder.java:577) ~[main/:?]
	at org.elasticsearch.percolator.PercolateQuery$1$1$1.matchDocId(PercolateQuery.java:138) ~[main/:?]
	at org.elasticsearch.percolator.PercolateQuery$BaseScorer$1.matches(PercolateQuery.java:301) ~[main/:?]
	at org.apache.lucene.search.Weight$DefaultBulkScorer.scoreTwoPhaseIterator(Weight.java:311) ~[lucene-core-10.3.2.jar:10.3.2 dadfd90b4401947f4d0387669dc94999fbb2c830 - 2025-11-13 10:41:29]
	at org.apache.lucene.search.Weight$DefaultBulkScorer.score(Weight.java:282) ~[lucene-core-10.3.2.jar:10.3.2 dadfd90b4401947f4d0387669dc94999fbb2c830 - 2025-11-13 10:41:29]
	at org.elasticsearch.search.internal.ContextIndexSearcher.searchLeaf(ContextIndexSearcher.java:483) ~[elasticsearch-9.4.0-SNAPSHOT.jar:9.4.0-SNAPSHOT]
	at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:809) ~[lucene-core-10.3.2.jar:10.3.2 dadfd90b4401947f4d0387669dc94999fbb2c830 - 2025-11-13 10:41:29]
	at org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:407) ~[elasticsearch-9.4.0-SNAPSHOT.jar:9.4.0-SNAPSHOT]
	at org.elasticsearch.search.internal.ContextIndexSearcher.lambda$search$3(ContextIndexSearcher.java:385) ~[elasticsearch-9.4.0-SNAPSHOT.jar:9.4.0-SNAPSHOT]
	at java.util.concurrent.FutureTask.run(FutureTask.java:328) ~[?:?]
	at org.apache.lucene.search.TaskExecutor$Task.run(TaskExecutor.java:173) ~[lucene-core-10.3.2.jar:10.3.2 dadfd90b4401947f4d0387669dc94999fbb2c830 - 2025-11-13 10:41:29]
	at org.apache.lucene.search.TaskExecutor.lambda$invokeAll$1(TaskExecutor.java:98) ~[lucene-core-10.3.2.jar:10.3.2 dadfd90b4401947f4d0387669dc94999fbb2c830 - 2025-11-13 10:41:29]

Closes #141489
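
A minimal sketch of the copy-per-thread idea in the description above. ScopeStack, QueryContext, and ScopeCopyDemo are hypothetical stand-ins, not the real AutoPrefilteringScope or SearchExecutionContext classes; only the method name copyForConcurrentUse() is taken from the description, and its real implementation may differ.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a mutable scope such as AutoPrefilteringScope or NestedScope.
final class ScopeStack {
    private final Deque<String> levels = new ArrayDeque<>();

    void push(String level) {
        levels.push(level);
    }

    // Throws NoSuchElementException when the stack is empty, e.g. because
    // another thread sharing this instance already popped the entry.
    String pop() {
        return levels.pop();
    }
}

// Hypothetical stand-in for SearchExecutionContext.
final class QueryContext {
    private final String indexName;  // immutable, safe to share between copies
    private final ScopeStack scope;  // mutable, must be private to one thread

    QueryContext(String indexName) {
        this.indexName = indexName;
        this.scope = new ScopeStack();
    }

    // Shallow copy: shares the immutable state but gets a fresh mutable scope.
    QueryContext copyForConcurrentUse() {
        return new QueryContext(indexName);
    }

    String indexName() {
        return indexName;
    }

    ScopeStack scope() {
        return scope;
    }
}

final class ScopeCopyDemo {
    // Runs push/pop pairs on several threads, each with its own copy, and
    // returns how many threads failed with a runtime exception.
    static int runWithCopies(int threads, int iterations) {
        QueryContext shared = new QueryContext("my-index");
        AtomicInteger failures = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                QueryContext local = shared.copyForConcurrentUse();
                try {
                    for (int i = 0; i < iterations; i++) {
                        local.scope().push("bool");
                        local.scope().pop();
                    }
                } catch (RuntimeException e) {
                    failures.incrementAndGet();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return failures.get();
    }
}
```

If every thread instead pushed and popped on shared.scope(), the unsynchronized deque could be corrupted or emptied by another thread, which is the kind of failure the stack trace shows.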

Member

@benwtrent benwtrent left a comment

@dimitris-athanasiou @Mikep86

y'all might find this interesting.

// PercolateQuery.QueryStore function from multiple threads. Here the
// solution is to create an AutoPrefilteringScope for each invocation
// of PercolateQuery.QueryStore
var safeContext = new FilteredSearchExecutionContext(context) {
Member

I am not sure that Percolator should be reading/writing from the auto-prefiltering scope at all.

I really dislike that this was just added for all boolean queries, when it's only needed in very specific cases for knn searches.

Member

I think this autoPrefilteringScope should be null and not available at all unless the very specific query asks for it. Otherwise we end up with weird latent bugs for code paths that aren't actually necessary or expected.

Contributor

@benwtrent autoPrefilteringScope uses the same design pattern as nestedScope, which is one of the reasons we went with it. This leads to an immediate follow-up question: Are there similar potential issues with nested percolated queries?

Contributor

Indeed, this issue seems to happen with nestedScope as well: #141489

I've been trying to reproduce this with percolated nested queries locally, but haven't had much luck yet. Such is the nature of concurrency bugs :/

Member Author

I added a copyForConcurrentUse() method to SearchExecutionContext that addresses both the AutoPrefilteringScope and NestedScope issues.

Member Author

I really dislike that this was just added for all boolean queries, when it's only needed in very specific cases for knn searches.

I explored walking the query tree to find a knn query and only enabling auto pre-filtering if one is present, and ended up with logic not dissimilar to this: https://github.com/elastic/elasticsearch/blob/main/server/src/main/java/org/elasticsearch/index/query/support/AutoPrefilteringUtils.java#L45

There doesn't seem to be a blessed way to walk the query tree, as getting the child queries has to be handled differently for each query type. It feels out of scope for this bug fix; we can explore the best way to handle this in another PR, possibly by adding a List<QueryBuilder> subQueries() function to QueryBuilder.
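
A rough sketch of what a uniform child accessor could look like. QueryNode, LeafQuery, CompoundQuery, and QueryTreeWalker are hypothetical types invented for illustration; the real QueryBuilder interface has no subQueries() method today, and an eventual design might look quite different.

```java
import java.util.List;

// Hypothetical, simplified query model; not the real Elasticsearch QueryBuilder.
interface QueryNode {
    String name();

    // The accessor suggested above: leaf queries have no children.
    default List<QueryNode> subQueries() {
        return List.of();
    }
}

final class LeafQuery implements QueryNode {
    private final String name;

    LeafQuery(String name) {
        this.name = name;
    }

    @Override
    public String name() {
        return name;
    }
}

final class CompoundQuery implements QueryNode {
    private final String name;
    private final List<QueryNode> children;

    CompoundQuery(String name, List<QueryNode> children) {
        this.name = name;
        this.children = List.copyOf(children);
    }

    @Override
    public String name() {
        return name;
    }

    @Override
    public List<QueryNode> subQueries() {
        return children;
    }
}

final class QueryTreeWalker {
    // Depth-first search that needs no per-query-type handling.
    static boolean containsQuery(QueryNode root, String name) {
        if (root.name().equals(name)) {
            return true;
        }
        for (QueryNode child : root.subQueries()) {
            if (containsQuery(child, name)) {
                return true;
            }
        }
        return false;
    }
}
```

With such an accessor, a check like containsQuery(root, "knn") could decide whether auto pre-filtering needs to be enabled at all.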

Contributor

++ for exploring other ways to walk the query tree in a separate PR. I think this one should focus only on making SearchExecutionContext usable by percolator.

@davidkyle davidkyle changed the title from "Use the delegate's scope" to "Use a copy of the SEC for each percolator execution" Feb 27, 2026
@davidkyle davidkyle changed the title from "Use a copy of the SEC for each percolator execution" to "Use a copy of the SearchExecutionContext for each Percolator execution" Feb 27, 2026
// The context's NestedScope is also vulnerable to concurrent modification.
// Use a cloned SearchExecutionContext for each thread with new instances of
// the mutable fields.
var safeContext = context.copyForConcurrentUse();

Contributor

@Mikep86 Mikep86 left a comment

Partial review, I can finish up tomorrow

Contributor

@Mikep86 Mikep86 left a comment

I took a closer look at this and I think the current solution is both too complicated and fundamentally flawed.

We create the FilteredSearchExecutionContext for the percolator query in wrapAllEmptyTextFields. However, when copying it in createStore, we generate a new FilteredSearchExecutionContext instance that does not persist the anonymously overridden methods. The whole reason for using FilteredSearchExecutionContext is lost by generating a copy.

There is a simpler solution that both generates a copy of SearchExecutionContext and persists the anonymously overridden methods: Configure the context late in createStore. The steps would be:

  1. Refactor configureContext and downstream methods as necessary so they are callable from createStore.
  2. Simplify the implementation of wrapAllEmptyTextFields to:
    static SearchExecutionContext wrapAllEmptyTextFields(SearchExecutionContext searchExecutionContext) {
        return new SearchExecutionContext(searchExecutionContext) {
            @Override
            public boolean fieldExistsInIndex(String fieldname) {
                return true;
            }
        };
    }

This both generates a copy of the context (fixing this bug) and anonymously overrides the desired methods in each copy.

  3. Call configureContext in createStore.
  4. We can completely remove FilteredSearchExecutionContext at this point. It is only used by the percolator query, which no longer needs it.
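
Why copying after wrapping loses the overrides, while wrapping as the last step keeps them, can be sketched with a plain copy constructor. Ctx and CopyOverrideDemo are hypothetical stand-ins; the real SearchExecutionContext has far more state, and the real copy in createStore went through FilteredSearchExecutionContext rather than this simplified constructor.

```java
import java.util.Set;

// Hypothetical stand-in for SearchExecutionContext with a copy constructor.
class Ctx {
    private final Set<String> mappedFields;

    Ctx(Set<String> mappedFields) {
        this.mappedFields = mappedFields;
    }

    // Copy constructor: copies state only, never the overrides of a subclass.
    Ctx(Ctx source) {
        this.mappedFields = source.mappedFields;
    }

    boolean fieldExistsInIndex(String fieldName) {
        return mappedFields.contains(fieldName);
    }
}

final class CopyOverrideDemo {
    // Creating the anonymous subclass is itself a copy, and the override
    // lives on the new instance.
    static Ctx wrapAllFieldsExist(Ctx source) {
        return new Ctx(source) {
            @Override
            boolean fieldExistsInIndex(String fieldName) {
                return true;
            }
        };
    }

    // Copying an already wrapped context yields a plain Ctx again, so the
    // override is silently dropped.
    static Ctx copyAfterWrapping(Ctx source) {
        return new Ctx(wrapAllFieldsExist(source));
    }
}
```

Calling wrapAllFieldsExist once per execution therefore gives each caller both a private copy and the overridden behaviour, which is the point of configuring the context late.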

@benwtrent WDYT?

@davidkyle
Member Author

Thanks for the simplification @Mikep86, I ended up doing some yak shaving on this one.

The combination of FilteredSearchExecutionContext and the anonymous classes is broken and very hard to reason about. The FilteredX pattern should implement an interface so that if a new method is added, the filtered version has to implement it. FilteredSearchExecutionContext does not do this, so a developer could add new methods to SearchExecutionContext that FilteredSearchExecutionContext will not delegate. I started to fix this, but it was much simpler to remove FilteredSearchExecutionContext, as you noted. It also turns out that Percolator is dependent on the implicit copy made when deriving anonymous SearchExecutionContext classes.

I've then pretty much done as suggested and moved the creation of the new context to createStore(), so each calling thread has its own copy.
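
The point about the FilteredX pattern can be illustrated with a small sketch. FieldContext, SimpleFieldContext, FilteredFieldContext, and FilteredDemo are hypothetical names; this shows the interface-based variant of the pattern in general, not code from this PR.

```java
// Hypothetical interface, assumed here only to show the pattern.
interface FieldContext {
    boolean fieldExistsInIndex(String fieldName);

    String indexName();
}

final class SimpleFieldContext implements FieldContext {
    private final String indexName;

    SimpleFieldContext(String indexName) {
        this.indexName = indexName;
    }

    @Override
    public boolean fieldExistsInIndex(String fieldName) {
        return false;
    }

    @Override
    public String indexName() {
        return indexName;
    }
}

// Delegating wrapper. Because it implements FieldContext directly, adding an
// abstract method to the interface breaks compilation here until the new
// method is delegated too.
class FilteredFieldContext implements FieldContext {
    private final FieldContext delegate;

    FilteredFieldContext(FieldContext delegate) {
        this.delegate = delegate;
    }

    @Override
    public boolean fieldExistsInIndex(String fieldName) {
        return delegate.fieldExistsInIndex(fieldName);
    }

    @Override
    public String indexName() {
        return delegate.indexName();
    }
}

final class FilteredDemo {
    // Overrides one method and inherits delegation for everything else.
    static FieldContext allFieldsExist(FieldContext in) {
        return new FilteredFieldContext(in) {
            @Override
            public boolean fieldExistsInIndex(String fieldName) {
                return true;
            }
        };
    }
}
```

When the wrapper extends a concrete class instead, a newly added method is inherited silently and runs against the wrapper's own state rather than the delegate's, which is the gap described above.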

Member

@benwtrent benwtrent left a comment

This ended up in a nice place. Thank you @Mikep86 @davidkyle !!!

Contributor

@Mikep86 Mikep86 left a comment

LGTM, nice clean fix!

}

executionContext = configureContext(executionContext, isMapUnmappedFieldAsText());
executionContext = PercolateQueryBuilder.newPercolateSearchContext(executionContext, isMapUnmappedFieldAsText());
Contributor

It's great that we're using a SearchExecutionContext configured similarly at both document parsing time and at query time. There were probably other bugs we hadn't found yet caused by using differently configured contexts here vs. at query time...

@davidkyle davidkyle enabled auto-merge (squash) March 9, 2026 15:47
@davidkyle davidkyle merged commit 22b4577 into elastic:main Mar 10, 2026
35 checks passed
@davidkyle
Member Author

💚 All backports created successfully

Status Branch Result
9.3

Questions ?

Please refer to the Backport tool documentation

davidkyle added a commit to davidkyle/elasticsearch that referenced this pull request Mar 10, 2026
szybia added a commit to szybia/elasticsearch that referenced this pull request Mar 10, 2026
…locations

elasticsearchmachine pushed a commit that referenced this pull request Mar 10, 2026
…ecution (#142765) (#143942)

* Use a copy of the SearchExecutionContext for each Percolator execution (#142765)

(cherry picked from commit 22b4577)

* compilation fix
@benwtrent
Member

@davidkyle do we know if this bug also exists in 8.19? I wonder if we want to backport there as well.

@davidkyle
Member Author

💚 All backports created successfully

Status Branch Result
8.19


davidkyle added a commit to davidkyle/elasticsearch that referenced this pull request Mar 10, 2026
elastic#142765)

(cherry picked from commit 22b4577)

# Conflicts:
#	modules/percolator/src/main/java/org/elasticsearch/percolator/PercolateQueryBuilder.java
#	modules/percolator/src/test/java/org/elasticsearch/percolator/QueryBuilderStoreTests.java
elasticsearchmachine pushed a commit that referenced this pull request Mar 10, 2026
#142765) (#143951)

(cherry picked from commit 22b4577)

# Conflicts:
#	modules/percolator/src/main/java/org/elasticsearch/percolator/PercolateQueryBuilder.java
#	modules/percolator/src/test/java/org/elasticsearch/percolator/QueryBuilderStoreTests.java
@Mikep86
Contributor

Mikep86 commented Mar 11, 2026

@davidkyle Sorry to pile on, but if we're backporting to 8.19, we should backport to 9.2 as well.

@davidkyle
Member Author

💚 All backports created successfully

Status Branch Result
9.2


davidkyle added a commit to davidkyle/elasticsearch that referenced this pull request Mar 11, 2026
elastic#142765)

(cherry picked from commit 22b4577)

# Conflicts:
#	modules/percolator/src/test/java/org/elasticsearch/percolator/QueryBuilderStoreTests.java
@davidkyle
Member Author

Thanks for the shout @Mikep86. When I saw there are no more branches I wept for there were no more backports to make

@davidkyle davidkyle deleted the auto-scope branch March 12, 2026 10:28
elasticsearchmachine pushed a commit that referenced this pull request Mar 31, 2026
…ecution (#142765) (#144027)

* Use a copy of the SearchExecutionContext for each Percolator execution (#142765)

(cherry picked from commit 22b4577)

# Conflicts:
#	modules/percolator/src/test/java/org/elasticsearch/percolator/QueryBuilderStoreTests.java

* [CI] Auto commit changes from spotless

* fix comp

---------

Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>

Development

Successfully merging this pull request may close these issues.

NoSuchElementException in NestedScope.previousLevel() during percolate query execution with nested queries

4 participants