Upgrade Elasticsearch to Apache Lucene 10.4#141882

Merged
benwtrent merged 393 commits into main from lucene_snapshot_10_4
Mar 10, 2026

@benwtrent (Member) commented Feb 4, 2026

This updates Elasticsearch to a new version of Apache Lucene.

It also provides some significant performance improvements for lexical relevance, logs, metrics, and vector search.

elasticsearchmachine and others added 30 commits November 6, 2025 07:14
@github-actions bot (Contributor) commented Mar 4, 2026

Vale Linting Results

Summary: 1 warning found

⚠️ Warnings (1)
| File | Line | Rule | Message |
| --- | --- | --- | --- |
| docs/reference/elasticsearch/mapping-reference/semantic-text-setup-configuration.md | 200 | Elastic.Latinisms | Latin terms and abbreviations are a common source of confusion. Use 'and the reverse' instead of 'vice versa'. |

The Vale linter checks documentation changes against the Elastic Docs style guide.

To use Vale locally or report issues, refer to Elastic style guide for Vale.

@elasticsearchmachine (Collaborator)

Library update summary (dependency bump)

This PR upgrades Elasticsearch’s embedded Apache Lucene dependency set and refreshes the related build verification metadata; most of the code churn is adapting codecs/vectors/tests from lucene103 to lucene104.

Update Summary

| Library | Old | New |
| --- | --- | --- |
| Apache Lucene (org.apache.lucene:*) | 10.3.2 | 10.4.0 |

Version Differences

| Library | Change |
| --- | --- |
| Apache Lucene (org.apache.lucene:*) | 10.3.2 → 10.4.0 |

Paths touched (high level)

  • Version source: build-tools-internal/version.properties
  • Dependency verification: gradle/verification-metadata.xml
  • Docs/changelog: docs/Versions.asciidoc, docs/changelog/141074.yaml, plus a couple of mapping docs updates under docs/reference/...
  • Server + codecs/vectors: broad updates under server/src/main/java/org/elasticsearch/index/codec/** and related tests under server/src/test/** (notably new/updated Lucene104* codec references)
  • SIMD + GPU/vector plumbing: libs/simdvec/**, libs/gpu-codec/**
  • REST/YAML and QA coverage: rest-api-spec/** YAML tests, qa/vector/**, qa/rolling-upgrade/**
  • X-Pack: inference semantic text mapping/tests and some frozen engine bits under x-pack/plugin/**

Key observations

  • Scope: while this is a single library bump, it’s a wide surface-area update (codecs, vector formats, REST tests, QA), consistent with Lucene’s lucene104 package/format changes.
  • Release notes highlights (Lucene 10.4.0) (from https://lucene.apache.org/core/10_4_0/changes/Changes.html):
    • Vectors/codecs: adds/extends Lucene104*ScalarQuantizedVectorsFormat and Lucene104*HnswScalarQuantizedVectorsFormat (incl. asymmetric quantization variants) and related vector scoring/rescoring/bulk-scoring APIs.
    • Performance: multiple optimizations around HNSW indexing/search (bulk scoring, diversity checks, graph building), plus postings/docvalues/query hot-path improvements.
    • APIs: additions like NumericDocValues#longValues, DocIdStream#intoArray, and DirectoryReader.open(...) accepting an ExecutorService.
  • Follow-ups to watch: PR discussion already flags a potential GPU impact with int8/HNSW quantization format shifts; worth keeping an eye on GPU-related tests/benchmarks and any downstream consumers.
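
For readers less familiar with the scalar quantized vector formats mentioned in the release notes highlights, here is a minimal, generic sketch of the underlying idea: each float component is mapped to a small integer within a fixed range, trading precision for storage and scoring speed. This is not Lucene's or Elasticsearch's implementation. The class name `ScalarQuantizationSketch`, its methods, and the fixed `[min, max]` range with 127 buckets are assumptions made for illustration only.

```java
// Hypothetical illustration of scalar quantization; not code from Lucene or Elasticsearch.
final class ScalarQuantizationSketch {
    private ScalarQuantizationSketch() {}

    /** Maps each component, clamped to [min, max], to a byte in [0, 127]. */
    static byte[] quantize(float[] vector, float min, float max) {
        float scale = 127f / (max - min);
        byte[] out = new byte[vector.length];
        for (int i = 0; i < vector.length; i++) {
            // Clamp outliers so they land on the nearest end of the range.
            float clamped = Math.max(min, Math.min(max, vector[i]));
            out[i] = (byte) Math.round((clamped - min) * scale);
        }
        return out;
    }

    /** Approximately reverses quantize; the error is at most half a bucket width for in-range values. */
    static float[] dequantize(byte[] quantized, float min, float max) {
        float step = (max - min) / 127f;
        float[] out = new float[quantized.length];
        for (int i = 0; i < quantized.length; i++) {
            out[i] = min + quantized[i] * step;
        }
        return out;
    }
}
```

A real format would typically derive the range from the data and store correction terms alongside the quantized bytes; this sketch leaves both out to stay short.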

@elastic elastic deleted a comment from elasticsearchmachine Mar 4, 2026
@iverase (Contributor) left a comment

I read through all the changes and everything looks good.

LGTM!

@mayya-sharipova (Contributor) left a comment

I've checked the part about the HNSW threshold (I am not familiar with confidence).

So that part LGTM!

@benwtrent removed the serverless-linked label Mar 7, 2026
@elasticsearchmachine (Collaborator)

Hi @benwtrent, I've created a changelog YAML for you.

@benwtrent benwtrent merged commit 92f3ccd into main Mar 10, 2026
37 of 38 checks passed
@benwtrent benwtrent deleted the lucene_snapshot_10_4 branch March 10, 2026 11:03
szybia added a commit to szybia/elasticsearch that referenced this pull request Mar 10, 2026
…locations

* upstream/main: (126 commits)
  Update KnnIndexTester to use more settings from datasets (elastic#143869)
  fix: dynamic template vector array is overridden by automatic dense_vector mapping (elastic#143733)
  ES|QL: Don't reuse the same alias for _fork column (elastic#143909)
  Close and initialize clients after each node upgrade in logsdb rolling upgrade tests. (elastic#143823)
  ESQL: Added GroupedTopNOperator for LIMIT BY, compute only (elastic#143476)
  Handle views in ResolveIndexAction (elastic#143561)
  Improve reindex rethrottle API in stateless (elastic#143771)
  Use a copy of the SearchExecutionContext for each Percolator execution (elastic#142765)
  Log the stacktrace when we encounter a deprecation warning for `default_metric` (elastic#143929)
  ESQL: evaluate ReferenceAttributes to potentially FieldAttributes for full-text functions restriction (elastic#143893)
  Add ClusterStateSerializationStats Serializatation Tests (elastic#142703)
  Adds Coordination Diagnostics Tests (elastic#142709)
  Upgrade Elasticsearch to Apache Lucene 10.4 (elastic#141882)
  ESQL: Add configurable bracket-based multi-value support for CSV reader (elastic#143890)
  time series es819 binary dv use up to a 1mb block size (elastic#143049)
  Dynamically enable / disable plugins in correspondence to stateless mode. (elastic#142147)
  ES|QL: Implement first/last_over_time for tdigest (elastic#143832)
  Document CHANGE_POINT limitation (elastic#143877)
  Fix OperationsOnSeqNoDisabledIndicesIT (elastic#143892)
  [Test] Test that sequence numbers are not pruned with retention lease (elastic#143825)
  ...
Labels

cloud-deploy, :Search/Search, Team:Search, >upgrade, v9.4.0

10 participants