ESQL: Improve Lookup Join performance with CachedDirectoryReader #139314

Merged: julian-elastic merged 15 commits into elastic:main from julian-elastic:CachedDirectoryReader_v3 on Jan 9, 2026

Conversation

@julian-elastic (Contributor) commented Dec 10, 2025

Improve Lookup Join performance by caching objects needed for Lucene queries.
We cache DocValues to improve query performance. Benchmark results show a 5-15% performance improvement for lookup join with this PR.

This PR is thoroughly tested by all existing lookup join unit tests.

Special thanks to @dnhatn for the POC.

Closes #137268
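
As a rough illustration of the caching idea in the description above, the following is a minimal, self-contained sketch of memoizing an expensive per-field lookup so that repeated queries reuse it. It is not the PR's `CachedDirectoryReader` and does not use Lucene; the class `CachingFieldReader`, its `loader` function, and `loadCount()` are hypothetical names introduced only for this example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/**
 * Hypothetical illustration only: memoizes the result of an expensive
 * per-field lookup. This is NOT the CachedDirectoryReader from the PR.
 * Not thread-safe; a loader that returns null is simply called again.
 */
final class CachingFieldReader<V> {
    // Stands in for an expensive lookup, e.g. opening per-field data.
    private final Function<String, V> loader;
    private final Map<String, V> cache = new HashMap<>();
    private int loads = 0;

    CachingFieldReader(Function<String, V> loader) {
        this.loader = loader;
    }

    /** Returns the value for a field, invoking the loader at most once per field. */
    V get(String field) {
        V value = cache.get(field);
        if (value == null) {
            value = loader.apply(field);
            loads++;
            cache.put(field, value);
        }
        return value;
    }

    /** Number of times the underlying loader was actually invoked. */
    int loadCount() {
        return loads;
    }
}
```

In this sketch, calling `get("id")` twice invokes the loader once. How the real change handles stateful Lucene objects and cache lifetime is not described in this conversation, so none of that is modelled here.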

@elasticsearchmachine (Collaborator)

Hi @julian-elastic, I've created a changelog YAML for you.

@julian-elastic (Contributor, Author)

Buildkite benchmark this with esql-joins please

@julian-elastic julian-elastic marked this pull request as ready for review December 15, 2025 14:01
@elasticsearchmachine added the Team:Analytics label (meta label for analytical engine team: ESQL/Aggs/Geo) Dec 15, 2025
@elasticsearchmachine (Collaborator)

Pinging @elastic/es-analytical-engine (Team:Analytics)

@dnhatn (Member) left a comment

I've left some comments (apart from Nik's). Thanks, Julian!

@julian-elastic (Contributor, Author)

Buildkite benchmark this with esql-joins please

@dnhatn (Member) left a comment

LGTM. Thanks for all the iterations, @julian-elastic

@julian-elastic (Contributor, Author)

Buildkite benchmark this with esql-joins please

@julian-elastic (Contributor, Author)

Buildkite benchmark this with esql-joins please

@elasticmachine (Collaborator) commented Jan 8, 2026

💚 Build Succeeded

This build ran two esql-joins benchmarks to evaluate the performance impact of this PR.

cc @julian-elastic

@julian-elastic julian-elastic merged commit 6b4ba06 into elastic:main Jan 9, 2026
35 checks passed
szybia added a commit to szybia/elasticsearch that referenced this pull request Jan 9, 2026
* upstream/main: (76 commits)
  [Inference API] Get _services skips EIS authorization call if CCM is not configured (elastic#139964)
  Improve TSDB codec benchmarks with full encoder and compression metrics (elastic#140299)
  ESQL: Consolidate test `BlockLoaderContext`s (elastic#140403)
  ESQL: Improve Lookup Join performance with CachedDirectoryReader (elastic#139314)
  ES|QL: Add more examples for the match operator (elastic#139815)
  ESQL: Add timezone to add and sub operators, and ConfigurationAware planning support (elastic#140101)
  ESQL: Updated ToIp tests and generated documentation for map parameters (elastic#139994)
  Disable _delete_by_query and _update_by_query for CCS/stateful (elastic#140301)
  Remove unused method ElasticInferenceService.translateToChunkedResults (elastic#140442)
  logging hot threads on large queue of the management threadpool (elastic#140251)
  Search functions docs cleanup (elastic#140435)
  Unmute 350_point_in_time/point-in-time with index filter (elastic#140443)
  Remove unused methods (elastic#140222)
  Add CPS and `project_routing` support for `_mvt` (elastic#140053)
  Streamline `ShardDeleteResults` collection (elastic#140363)
  Fix Docker build to use --load for single-platform images (elastic#140402)
  Parametrize + test VectorScorerOSQBenchmark (elastic#140354)
  `RecyclerBytesStreamOutput` using absolute offsets (elastic#140303)
  Define bulk float native methods for vector scoring (elastic#139885)
  Make `TimeSeriesAggregate` `TimestampAware` (elastic#140270)
  ...
jimczi pushed a commit to jimczi/elasticsearch that referenced this pull request Jan 12, 2026
…stic#139314)

Improve Lookup Join performance by caching objects needed for Lucene queries.
We cache DocValues to improve query performance.
Benchmark results show a 5-15% performance improvement for lookup join with this PR.

Labels

:Analytics/ES|QL (AKA ESQL), >enhancement, Team:Analytics (meta label for analytical engine team: ESQL/Aggs/Geo), v9.4.0

Development

Successfully merging this pull request may close these issues.

Performance: Improve Lookup Join performance by caching docValues

5 participants