Reduce LuceneOperator.Status memory consumption with large QueryDSL queries by craigtaverner · Pull Request #143175 · elastic/elasticsearch

craigtaverner · 2026-02-26T18:35:26Z

As reported in #143164, we've seen users writing extremely large QueryDSL queries (on the order of many megabytes). LuceneOperator.Status keeps a set of the toString() output of these queries, which becomes very large. This fix truncates each string to 200 characters.

Fixes #143164
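
A minimal sketch of the approach described above, assuming a plain 200-character cut with no suffix. The class name and the use of Object in place of Lucene's Query are assumptions for illustration; the helper name queryString is borrowed from the diff in this PR, but its real body is not shown there, so the one below is a guess.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

public final class StatusTruncationSketch {
    // 200 is the limit named in the PR description.
    static final int LIMIT = 200;

    // Hypothetical stand-in for Status::queryString; Object replaces Lucene's Query.
    public static String queryString(Object query) {
        String s = query.toString();
        return s.length() <= LIMIT ? s : s.substring(0, LIMIT);
    }

    // Same shape as the stream pipeline in the diff: map each query to a bounded string.
    public static Set<String> toStatusStrings(List<?> queries) {
        return queries.stream()
            .map(StatusTruncationSketch::queryString)
            .collect(Collectors.toCollection(TreeSet::new));
    }

    public static void main(String[] args) {
        Set<String> out = toStatusStrings(List.of("a".repeat(5_000_000), "short"));
        // Each retained string is at most LIMIT characters long.
        out.forEach(s -> System.out.println(s.length()));
    }
}
```

Under this plain cut, two distinct queries that share their first 200 characters collapse into a single set entry.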

elasticsearchmachine · 2026-02-26T18:36:14Z

Hi @craigtaverner, I've created a changelog YAML for you.

nik9000

Looks right to me. I think you can test this with the profile tests.

dnhatn · 2026-02-26T19:35:17Z

...plugin/esql/compute/src/main/java/org/elasticsearch/compute/lucene/query/LuceneOperator.java

        protected Status(LuceneOperator operator) {
            processedSlices = operator.processedSlices;
-            processedQueries = operator.processedQueries.stream().map(Query::toString).collect(Collectors.toCollection(TreeSet::new));
+            processedQueries = operator.processedQueries.stream().map(Status::queryString).collect(Collectors.toCollection(TreeSet::new));


I think the queries themselves are consuming a significant amount of memory, based on the heap dump screenshot. Should we convert processedQueries to use String instead and apply a limit there?

So:

-final Set<Query> processedQueries = new HashSet<>();
+final Set<String> processedQueries = new HashSet<>();

Hey, if we do that and we limit the size, could we make the output:

queryString.substring(0, QUERY_STRING_TRUNCATION) + "...(" + (queryString.length() - QUERY_STRING_TRUNCATION) + " more characters)[" + queryString.hashCode() + "]"

Just some extra paranoia about the hash of the query. If the queries are

Bool[SomeBigShape, SomethingImportant]
Bool[SomeBigShape, SomethingElseImportant]

Then we'll at least see that there were two.
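
The suggestion above can be sketched as a small helper. This is an illustration under stated assumptions, not the merged code: the class name is hypothetical, and the value 200 for QUERY_STRING_TRUNCATION is only the figure from the PR description (the thread says it was later increased without stating the new value).

```java
public final class QueryStringTruncation {
    // Assumed value for illustration; the final limit in the PR is not stated in the thread.
    static final int QUERY_STRING_TRUNCATION = 200;

    // Keeps the head of the string and appends the number of dropped characters
    // plus the hash of the full string, so distinct long queries stay distinguishable.
    public static String truncate(String queryString) {
        if (queryString.length() <= QUERY_STRING_TRUNCATION) {
            return queryString;
        }
        return queryString.substring(0, QUERY_STRING_TRUNCATION)
            + "...(" + (queryString.length() - QUERY_STRING_TRUNCATION) + " more characters)["
            + queryString.hashCode() + "]";
    }

    public static void main(String[] args) {
        String bigShape = "x".repeat(1000);
        // Same length and same first 200 characters; only the tail differs.
        String a = truncate("Bool[" + bigShape + ", fieldA]");
        String b = truncate("Bool[" + bigShape + ", fieldB]");
        System.out.println(a);
        System.out.println(b);
    }
}
```

In this example both inputs have the same length and the same first 200 characters, so only the bracketed hash separates the two outputs.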

Good call on the hash getting big.

Yes, your proposal is good.

While looking at the heap dump myself, I cannot find any excessive memory usage by the processedQueries inside the operator, only the string version in the status. But since it only exists to pass to the status, we might as well truncate early, as suggested.

craigtaverner · 2026-02-27T11:33:08Z

OK. I've made the changes. Let me know what you think.

Convert and truncate earlier (in operator, not status)
Increased the size because I imagine the information is sometimes useful (200 seems too small for many of the cases I've seen, and the issue we're mitigating had 80M strings here)
There were existing tests that needed fixing to cope with truncation
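
A rough sketch of what "convert and truncate earlier" could look like, assuming the operator keeps a set of bounded strings instead of Query objects. Every name here except processedQueries and QUERY_STRING_TRUNCATION is hypothetical, Object stands in for Lucene's Query, and the limit of 1000 is an assumed placeholder because the thread does not state the increased value.

```java
import java.util.HashSet;
import java.util.Set;

public final class EarlyTruncationSketch {
    // Assumed placeholder; the thread only says the limit was raised above 200.
    static final int QUERY_STRING_TRUNCATION = 1000;

    // Bounded strings only, so no full query text is retained for status reporting.
    private final Set<String> processedQueries = new HashSet<>();

    // Called at the point where the operator would otherwise have stored the Query itself.
    public void recordProcessedQuery(Object query) {
        processedQueries.add(truncate(query.toString()));
    }

    // The status can copy these strings directly; no further conversion is needed.
    public Set<String> processedQueries() {
        return Set.copyOf(processedQueries);
    }

    static String truncate(String s) {
        if (s.length() <= QUERY_STRING_TRUNCATION) {
            return s;
        }
        return s.substring(0, QUERY_STRING_TRUNCATION)
            + "...(" + (s.length() - QUERY_STRING_TRUNCATION) + " more characters)["
            + s.hashCode() + "]";
    }
}
```

In this sketch the full toString() output still exists briefly while each query is recorded; only the retained copy is bounded.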

elasticsearchmachine · 2026-03-02T13:46:30Z

Pinging @elastic/es-analytical-engine (Team:Analytics)

craigtaverner · 2026-03-02T13:48:07Z

I checked that PushQueriesIT.testEqualityTooBigToPush covers this case (i.e., its assertion needed to be changed to verify the truncated string). Hopefully this is sufficient testing.
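
The actual assertion in PushQueriesIT is not shown in this thread. As a hedged illustration only, a test could check that a reported query string carries the truncation suffix proposed in the review. The class, method, and pattern below are hypothetical and assume the suffix format "...(<n> more characters)[<hash>]".

```java
import java.util.regex.Pattern;

public final class TruncatedQueryAssertionSketch {
    // Assumed suffix format from the review suggestion: "...(<n> more characters)[<hash>]"
    static final Pattern TRUNCATED_SUFFIX =
        Pattern.compile("\\.\\.\\.\\((\\d+) more characters\\)\\[(-?\\d+)\\]$");

    // Returns true when the string ends with the assumed truncation suffix.
    public static boolean looksTruncated(String statusQuery) {
        return TRUNCATED_SUFFIX.matcher(statusQuery).find();
    }

    public static void main(String[] args) {
        System.out.println(looksTruncated("field:abc...(4000 more characters)[-12345]"));
        System.out.println(looksTruncated("field:abc"));
    }
}
```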

elasticsearchmachine · 2026-03-02T16:18:16Z

💚 Backport successful

Status	Branch	Result
✅	9.3
✅	9.2

…yDSL queries (#143175) (#143403) * Reduce LuceneOperator.Status memory consumption with large QueryDSL queries (#143175) * Backporting to 9.3 required inlining the constant

…yDSL queries (#143175) (#143401) * Reduce LuceneOperator.Status memory consumption with large QueryDSL queries (#143175) * Backporting to 9.3 required inlining the constant

Reduce LuceneOperator.Status memory consumption with large QueryDSL q…

25836ab

…ueries

craigtaverner added the >bug, Team:Analytics, and :Analytics/ES|QL labels Feb 26, 2026

elasticsearchmachine added the v9.4.0 label Feb 26, 2026

Update docs/changelog/143175.yaml

929b8ae

nik9000 reviewed Feb 26, 2026

dnhatn reviewed Feb 26, 2026

craigtaverner added 2 commits February 27, 2026 12:30

Convert to truncated string early, increase size and fix tests

2dc4ae0

Merge branch 'reduce_luceneoperator_status_memory' of github.com:crai…

7103c6c

…gtaverner/elasticsearch into reduce_luceneoperator_status_memory

Merge branch 'main' into reduce_luceneoperator_status_memory

7101a4f

craigtaverner marked this pull request as ready for review March 2, 2026 13:46

nik9000 approved these changes Mar 2, 2026

craigtaverner added the auto-backport, branch:9.2, and branch:9.3 labels Mar 2, 2026

elasticsearchmachine added v9.3.2 v9.2.7 and removed branch:9.2 branch:9.3 labels Mar 2, 2026

craigtaverner merged commit c47480d into elastic:main Mar 2, 2026
34 of 35 checks passed

This was referenced Mar 2, 2026

[9.3] Reduce LuceneOperator.Status memory consumption with large QueryDSL queries (#143175) #143401

Merged

[9.2] Reduce LuceneOperator.Status memory consumption with large QueryDSL queries (#143175) #143403

Merged

craigtaverner added a commit to craigtaverner/elasticsearch that referenced this pull request Mar 2, 2026

Reduce LuceneOperator.Status memory consumption with large QueryDSL q…

d839a7a

…ueries (elastic#143175)

craigtaverner added a commit to craigtaverner/elasticsearch that referenced this pull request Mar 2, 2026

Reduce LuceneOperator.Status memory consumption with large QueryDSL q…

4b0c0f0

…ueries (elastic#143175)

tballison pushed a commit to tballison/elasticsearch that referenced this pull request Mar 3, 2026

Reduce LuceneOperator.Status memory consumption with large QueryDSL q…

818d32a

…ueries (elastic#143175)

This was referenced Mar 6, 2026

[ML] Wait for cluster state in test #143767

Merged

[Transform] Disable PIT for CPS #143876

Closed

craigtaverner merged 5 commits into elastic:main from craigtaverner:reduce_luceneoperator_status_memory
