Add int4 vector scoring benchmarks #144105
Merged
ldematte merged 11 commits into elastic:main on Mar 16, 2026
Add JMH benchmarks for int4 (PACKED_NIBBLE) quantized vector scoring to establish performance baselines before adding native C++ support.

Three benchmark levels mirror the existing int7u suite:
- VectorScorerInt4OperationBenchmark: raw dot product
- VectorScorerInt4Benchmark: single-score with correction math
- VectorScorerInt4BulkBenchmark: multi-vector scoring patterns including bulkScore API

Each benchmark compares SCALAR (plain loop) vs LUCENE (Panama SIMD) implementations for DOT_PRODUCT and EUCLIDEAN similarity.

Made-with: Cursor
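To make the "SCALAR (plain loop)" baseline concrete, here is a minimal sketch of a packed-nibble int4 dot product. It is not the code from this PR. The class name `Int4DotSketch`, the method names, and the nibble layout (element `i` in the high nibble, element `i + half` in the low nibble, with an unpacked query) are assumptions made for illustration only.

```java
// Hypothetical sketch, not the benchmark's actual ScalarOperations code.
final class Int4DotSketch {
    private Int4DotSketch() {}

    /**
     * Packs values in [0, 15] two per byte.
     * Assumed layout: element i in the high nibble, element i + half in the low nibble.
     */
    public static byte[] pack(byte[] unpacked) {
        if ((unpacked.length & 1) != 0) {
            throw new IllegalArgumentException("even number of dimensions required");
        }
        int half = unpacked.length / 2;
        byte[] packed = new byte[half];
        for (int i = 0; i < half; i++) {
            packed[i] = (byte) ((unpacked[i] << 4) | (unpacked[half + i] & 0x0F));
        }
        return packed;
    }

    /** Plain-loop dot product of an unpacked query against a packed document vector. */
    public static int dot(byte[] query, byte[] packedDoc) {
        int half = packedDoc.length;
        if (query.length != 2 * half) {
            throw new IllegalArgumentException("query must have twice as many entries as packedDoc");
        }
        int sum = 0;
        for (int i = 0; i < half; i++) {
            int b = packedDoc[i] & 0xFF;          // treat the byte as unsigned
            sum += query[i] * (b >>> 4);          // high nibble
            sum += query[half + i] * (b & 0x0F);  // low nibble
        }
        return sum;
    }
}
```

A SIMD implementation would process many bytes per iteration with the same arithmetic, which is why a loop of this shape is a useful reference point for the comparison.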
Collaborator
Pinging @elastic/es-search-relevance (Team:Search Relevance)
thecoop
reviewed
Mar 12, 2026
...marks/src/main/java/org/elasticsearch/benchmark/vector/scorer/VectorScorerInt4Benchmark.java
Outdated
thecoop
reviewed
Mar 12, 2026
...marks/src/main/java/org/elasticsearch/benchmark/vector/scorer/VectorScorerInt4Benchmark.java
Outdated
thecoop
reviewed
Mar 13, 2026
benchmarks/src/main/java/org/elasticsearch/benchmark/vector/scorer/ScalarOperations.java
Outdated
thecoop
reviewed
Mar 13, 2026
...marks/src/main/java/org/elasticsearch/benchmark/vector/scorer/VectorScorerInt4Benchmark.java
Outdated
thecoop
reviewed
Mar 13, 2026
    public int dims;

    @Param({ "128", "1500", "130000" })
    public int numVectors;
Member
It feels like there's some common code we can pull out between all the benchmarks, but that can be done later on
thecoop
reviewed
Mar 13, 2026
...estFixtures/java/org/elasticsearch/simdvec/internal/vectorization/VectorScorerTestUtils.java
Outdated
thecoop
reviewed
Mar 13, 2026
...estFixtures/java/org/elasticsearch/simdvec/internal/vectorization/VectorScorerTestUtils.java
Outdated
thecoop
approved these changes
Mar 13, 2026
ncordon
pushed a commit
to ncordon/elasticsearch
that referenced
this pull request
Mar 16, 2026
Added JMH benchmarks for int4 quantized vector scoring to establish performance baselines before adding native int4 support:
- Three benchmark levels mirror the existing int7u suite: operation-level (raw dot product), single-score, and bulk scoring.
- Each benchmark compares SCALAR (plain loop) vs LUCENE (Panama SIMD) implementations for DOT_PRODUCT and EUCLIDEAN similarity.
szybia
added a commit
to szybia/elasticsearch
that referenced
this pull request
Mar 16, 2026
…elocations

* upstream/main: (33 commits)
  - Unmute InferenceRestIT and DefaultEndPointsIT (elastic#144217)
  - feat: add keep_alive to async task status (elastic#144010)
  - Add explicit isNoOpUpdate() method to MapperService (elastic#144113)
  - Always attach APM Agent (elastic#144120)
  - Fix random_score nightly tests (elastic#144176)
  - Add nested query checks for disabled sequence numbers (elastic#144185)
  - Return sentinel values from Fetch when sequence numbers are disabled (elastic#144212)
  - [Test] Test peer-recovery with sequence numbers pruning (elastic#144116)
  - Remove `scaled-*` field assertions from mixed cluster downsampling test (elastic#144295)
  - Refactor: Use range syntax in ES|QL exponential histogram tests (elastic#144110)
  - Move resolve aliases to IndexAbstractionOptions (elastic#143953)
  - unmute test (elastic#144299)
  - Fix approximation csvtests (elastic#144233)
  - fix test (elastic#144171)
  - Add int4 vector scoring benchmarks (elastic#144105)
  - Mute org.elasticsearch.xpack.esql.qa.single_node.GenerativeIT test elastic#143023
  - Mute org.elasticsearch.test.apmintegration.MetricsApmIT testApmIntegration {withOTel=false} elastic#144282
  - Native cli launcher (elastic#143712)
  - Mute org.elasticsearch.xpack.esql.qa.multi_node.GenerativeIT test elastic#143023
  - Mute org.elasticsearch.xpack.esql.heap_attack.HeapAttackSubqueryIT testManyRandomKeywordFieldsInSubqueryIntermediateResults elastic#144274
  - ...
ldematte
added a commit
that referenced
this pull request
Mar 17, 2026
#144215)

This PR introduces support and plumbing for native int4 vector scoring. In particular:
- A "naive" native int4 vector scoring implementation: scalar (non-SIMD) native C implementations of packed-nibble int4 dot product for both ARM and x64 (vec_doti4 (single), vec_doti4_bulk, vec_doti4_bulk_offsets).
- The usual Java-side plumbing (JDKVectorLibrary, Similarities, etc.) in libs/native
- Vector scorer implementations in libs/simdvec (Int4VectorScorer and Int4VectorScorerSupplier)
- Tests, both at scorer level (Int4VectorScorerFactoryTests, with MMap and NIO directory variants) and lower level (JDKVectorLibraryInt4Tests)
- Updated JMH benchmarks from Add int4 vector scoring benchmarks #144105 (VectorScorerInt4Benchmark and VectorScorerInt4BulkBenchmark):
  - extended to include NATIVE implementations
  - switched to IndexInput-based data for fair comparison
  - refactored to avoid duplication with tests

What is NOT included / future work:
- The new scorer is not (yet) used in production. The integration in ES94ScalarQuantizedVectorsFormat.java was reverted (commit 6c94b3f), as the naive scalar native implementation is not competitive against Lucene's Panama SIMD. To re-enable: revert that commit.
- This means we still want to add SIMD-optimized native int4 implementations and optimized bulk operations.
- Note that no distance functions are missing: only DOT_PRODUCT is needed for native int4, because other functions are computed by applying correction terms on top of the raw dot product result.
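The claim that only DOT_PRODUCT is needed natively rests on the fact that other similarities can be recovered from the raw dot product plus per-vector terms. As a simplified illustration (not the actual correction math in Elasticsearch, which also involves quantization parameters not described in this thread), squared Euclidean distance follows from the identity ‖a − b‖² = ‖a‖² + ‖b‖² − 2·(a·b). The class and method names below are hypothetical.

```java
// Hypothetical sketch of deriving a Euclidean score from a raw dot product.
// Real scorers apply additional quantization corrections that are not shown here.
final class DotCorrectionSketch {
    private DotCorrectionSketch() {}

    /** Squared L2 norm of an unpacked vector; can be precomputed once per vector. */
    public static int normSquared(byte[] v) {
        int sum = 0;
        for (byte x : v) {
            sum += x * x;
        }
        return sum;
    }

    /** Squared Euclidean distance from a raw dot product and the two squared norms. */
    public static int squaredDistance(int rawDot, int normSqA, int normSqB) {
        return normSqA + normSqB - 2 * rawDot;
    }
}
```

With the norms stored alongside each vector, the per-pair work stays a single dot product, so a fast native dot product benefits every similarity built on it.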
michalborek
pushed a commit
to michalborek/elasticsearch
that referenced
this pull request
Mar 23, 2026
michalborek
pushed a commit
to michalborek/elasticsearch
that referenced
this pull request
Mar 23, 2026
Summary
Test plan
- `./gradlew :benchmarks:test --tests 'org.elasticsearch.benchmark.vector.scorer.VectorScorerInt4*'` passes
- `./gradlew -p benchmarks run --args 'VectorScorerInt4OperationBenchmark'` (and similar for Benchmark/BulkBenchmark)

Co-created with Cursor