Add int4 vector scoring benchmarks #144105

Merged
ldematte merged 11 commits into elastic:main from ldematte:native/vec-i4
Mar 16, 2026
Contributor

@ldematte ldematte commented Mar 12, 2026

Summary

  • Add JMH benchmarks for int4 quantized vector scoring to establish performance baselines before adding native int4 support.
  • Three benchmark levels mirror the existing int7u suite: operation-level (raw dot product), single-score, and bulk scoring.
  • Each benchmark compares SCALAR (plain loop) vs LUCENE (Panama SIMD) implementations for DOT_PRODUCT and EUCLIDEAN similarity.

Test plan

  • ./gradlew :benchmarks:test --tests 'org.elasticsearch.benchmark.vector.scorer.VectorScorerInt4*' passes
  • Benchmarks run successfully with ./gradlew -p benchmarks run --args 'VectorScorerInt4OperationBenchmark' (and likewise for VectorScorerInt4Benchmark and VectorScorerInt4BulkBenchmark)

Co-created with Cursor

Add JMH benchmarks for int4 (PACKED_NIBBLE) quantized vector scoring
to establish performance baselines before adding native C++ support.

Three benchmark levels mirror the existing int7u suite:
- VectorScorerInt4OperationBenchmark: raw dot product
- VectorScorerInt4Benchmark: single-score with correction math
- VectorScorerInt4BulkBenchmark: multi-vector scoring patterns
  including bulkScore API

Each benchmark compares SCALAR (plain loop) vs LUCENE (Panama SIMD)
implementations for DOT_PRODUCT and EUCLIDEAN similarity.
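
For readers unfamiliar with the "plain loop" baseline, here is a minimal sketch of a scalar dot product between an unpacked int4 query and a packed-nibble document vector. The class name, the method names, and the nibble layout (high nibble holds element i, low nibble holds element i + half) are assumptions made for illustration; the thread does not show the layout the benchmarks actually use.

```java
// Illustrative sketch only: names and nibble layout are assumptions, not the PR's code.
final class Int4DotProductSketch {

    // Packs 4-bit values (0..15) two per byte.
    // Assumed layout: high nibble = element i, low nibble = element i + half.
    static byte[] pack(byte[] unpacked) {
        int half = unpacked.length / 2;
        byte[] packed = new byte[half];
        for (int i = 0; i < half; i++) {
            packed[i] = (byte) ((unpacked[i] << 4) | (unpacked[i + half] & 0x0F));
        }
        return packed;
    }

    // Plain scalar loop: unpack each byte into two nibbles and accumulate the products.
    static int dotProduct(byte[] query, byte[] packed) {
        int half = packed.length;
        int total = 0;
        for (int i = 0; i < half; i++) {
            int b = packed[i] & 0xFF;               // avoid sign extension
            total += (b >>> 4) * query[i];          // high nibble
            total += (b & 0x0F) * query[i + half];  // low nibble
        }
        return total;
    }
}
```

A SIMD implementation would compute the same result while processing many bytes per instruction, which is the gap the SCALAR vs LUCENE comparison is meant to measure.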

Made-with: Cursor
@elasticsearchmachine elasticsearchmachine added the needs:triage and v9.4.0 labels Mar 12, 2026
@ldematte ldematte added the >test and :Search Relevance/Vectors labels and removed the needs:triage label Mar 12, 2026
@ldematte ldematte requested a review from thecoop March 12, 2026 13:54
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@elasticsearchmachine elasticsearchmachine added the Team:Search Relevance label Mar 12, 2026
public int dims;

@Param({ "128", "1500", "130000" })
public int numVectors;
Member

It feels like there's some common code we can pull out between all the benchmarks, but that can be done later on

@ldematte ldematte merged commit daf1e28 into elastic:main Mar 16, 2026
28 checks passed
@ldematte ldematte deleted the native/vec-i4 branch March 16, 2026 08:04
ncordon pushed a commit to ncordon/elasticsearch that referenced this pull request Mar 16, 2026
Added JMH benchmarks for int4 quantized vector scoring to establish performance baselines before adding native int4 support:
- Three benchmark levels mirror the existing int7u suite: operation-level (raw dot product), single-score, and bulk scoring.
- Each benchmark compares SCALAR (plain loop) vs LUCENE (Panama SIMD) implementations for DOT_PRODUCT and EUCLIDEAN similarity.
szybia added a commit to szybia/elasticsearch that referenced this pull request Mar 16, 2026
…elocations

* upstream/main: (33 commits)
  Unmute InferenceRestIT and DefaultEndPointsIT (elastic#144217)
  feat: add keep_alive to async task status (elastic#144010)
  Add explicit isNoOpUpdate() method to MapperService (elastic#144113)
  Always attach APM Agent (elastic#144120)
  Fix random_score nightly tests (elastic#144176)
  Add nested query checks for disabled sequence numbers (elastic#144185)
  Return sentinel values from Fetch when sequence numbers are disabled (elastic#144212)
  [Test] Test peer-recovery with sequence numbers pruning (elastic#144116)
  Remove `scaled-*` field assertions from mixed cluster downsampling test (elastic#144295)
  Refactor: Use range syntax in ES|QL exponential histogram tests (elastic#144110)
  Move resolve aliases to IndexAbstractionOptions (elastic#143953)
  unmute test (elastic#144299)
  Fix approximation csvtests (elastic#144233)
  fix test (elastic#144171)
  Add int4 vector scoring benchmarks (elastic#144105)
  Mute org.elasticsearch.xpack.esql.qa.single_node.GenerativeIT test elastic#143023
  Mute org.elasticsearch.test.apmintegration.MetricsApmIT testApmIntegration {withOTel=false} elastic#144282
  Native cli launcher (elastic#143712)
  Mute org.elasticsearch.xpack.esql.qa.multi_node.GenerativeIT test elastic#143023
  Mute org.elasticsearch.xpack.esql.heap_attack.HeapAttackSubqueryIT testManyRandomKeywordFieldsInSubqueryIntermediateResults elastic#144274
  ...
ldematte added a commit that referenced this pull request Mar 17, 2026
#144215)

This PR introduces support and plumbing for native int4 vector scoring.

In particular:
- a "naive" native int4 vector scoring implementation: scalar (non-SIMD) native C implementations of the packed-nibble int4 dot product for both ARM and x64 (vec_doti4 for single vectors, vec_doti4_bulk, and vec_doti4_bulk_offsets).
- the usual Java-side plumbing (JDKVectorLibrary, Similarities, etc.) in libs/native
- Vector scorer implementations in libs/simdvec (Int4VectorScorer and Int4VectorScorerSupplier)
- Tests, both at scorer level (Int4VectorScorerFactoryTests, with MMap and NIO directory variants) and lower level (JDKVectorLibraryInt4Tests)
- Updated JMH benchmarks from Add int4 vector scoring benchmarks #144105 (VectorScorerInt4Benchmark and VectorScorerInt4BulkBenchmark) to include NATIVE implementations
   - switched to IndexInput-based data for fair comparison
   - refactored to avoid duplication with tests
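
To illustrate what the single, bulk, and bulk-with-offsets variants mean, here is a rough Java sketch of the three access patterns. The native signatures are not shown in this thread, so the method names and parameters below are hypothetical stand-ins. The nibble layout, and the reading of "offsets" as vector ordinals into a contiguous buffer, are also assumptions.

```java
// Hypothetical Java stand-ins for the three scoring patterns; not the native API.
final class Int4BulkSketch {

    // Single: one unpacked query against one packed vector starting at byte `start`.
    // Assumed layout: high nibble = element i, low nibble = element i + packedLen.
    static int dot(byte[] query, byte[] data, int start, int packedLen) {
        int total = 0;
        for (int i = 0; i < packedLen; i++) {
            int b = data[start + i] & 0xFF;
            total += (b >>> 4) * query[i];
            total += (b & 0x0F) * query[i + packedLen];
        }
        return total;
    }

    // Bulk: score `count` vectors stored back to back in `data`.
    static void bulk(byte[] query, byte[] data, int packedLen, int count, int[] out) {
        for (int v = 0; v < count; v++) {
            out[v] = dot(query, data, v * packedLen, packedLen);
        }
    }

    // Bulk with offsets: score only the vectors selected by `ordinals` (assumed meaning).
    static void bulkOffsets(byte[] query, byte[] data, int packedLen, int[] ordinals, int[] out) {
        for (int k = 0; k < ordinals.length; k++) {
            out[k] = dot(query, data, ordinals[k] * packedLen, packedLen);
        }
    }
}
```

Bulk entry points let a native implementation amortize call overhead and reuse the loaded query across vectors.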

What is NOT included / future work:
- The new scorer is not (yet) used in production. The integration in ES94ScalarQuantizedVectorsFormat.java was reverted (commit 6c94b3f), as the naive scalar native implementation is not competitive against Lucene's Panama SIMD. To re-enable: revert that commit.
- This means we still want to add SIMD-optimized native int4 implementations and optimized bulk operations.
- Note that no distance functions are missing: only DOT_PRODUCT is needed for native int4, since other functions are computed by applying correction terms on top of the raw dot product result.
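
As a small worked example of why a raw dot product is enough, the squared Euclidean distance follows from the identity |a - b|^2 = |a|^2 + |b|^2 - 2(a . b). The sketch below shows only that identity on plain integer vectors. The real correction terms for quantized vectors also involve per-vector quantization parameters that this thread does not show, and the class and method names here are hypothetical.

```java
// Illustrative only: the underlying identity, not the scorer's actual correction math.
final class DotToEuclideanSketch {

    static int dot(int[] a, int[] b) {
        int total = 0;
        for (int i = 0; i < a.length; i++) {
            total += a[i] * b[i];
        }
        return total;
    }

    // |a - b|^2 = |a|^2 + |b|^2 - 2 * (a . b); the squared norms can be precomputed per vector.
    static int squaredDistanceFromDot(int dot, int squaredNormA, int squaredNormB) {
        return squaredNormA + squaredNormB - 2 * dot;
    }
}
```

For a = {1, 2, 3, 4} and b = {5, 6, 7, 8}, the dot product is 70 and the squared norms are 30 and 174, giving 30 + 174 - 140 = 64. That matches the direct computation, since every component differs by 4.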
michalborek pushed a commit to michalborek/elasticsearch that referenced this pull request Mar 23, 2026
michalborek pushed a commit to michalborek/elasticsearch that referenced this pull request Mar 23, 2026
elastic#144215)
Labels

:Search Relevance/Vectors Vector search Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch >test Issues or PRs that are addressing/adding tests v9.4.0
