Add int7 support for DiskBBQ and tests by ah89 · Pull Request #141183 · elastic/elasticsearch

ah89 · 2026-01-23T14:53:11Z

Enable 7-bit quantization in DiskBBQ paths.
Extend SIMD OSQ tests and OSQ benchmark for 7-bit.

Closes #139591

elasticsearchmachine · 2026-01-23T14:53:36Z

Pinging @elastic/es-search-relevance (Team:Search Relevance)

benwtrent

What do recall & latency look like?

Good first pass here.

benwtrent · 2026-01-23T14:58:11Z

...main/java/org/elasticsearch/index/codec/vectors/diskbbq/next/ESNextDiskBBQVectorsReader.java

        private final ESNextOSQVectorsScorer osqVectorsScorer;
+        private final ES92Int7VectorsScorer int7VectorsScorer;


We shouldn't do this. Please augment the OSQ scorer instead. Flipping between two of them here will lead to confusion.

benwtrent · 2026-01-23T15:03:17Z

server/src/main/java/org/elasticsearch/index/codec/vectors/diskbbq/DiskBBQBulkWriter.java

@@ -203,6 +212,9 @@ public void writeVectors(QuantizedVectorValues qvv, CheckedIntConsumer<IOExcepti
                writeCorrections(corrections);
            }
            // write tail


And this doesn't break the scoring? none of the other bits encode the tail as blocks yet.

//cc @tteofili

agreed, this is likely to affect recall, @ah89 did you check with KnnIndexTester ?

LargeBitDiskBBQBulkWriter was seemingly never instantiated before case 7 was introduced (cases 1, 2, and 4 are handled by SmallBitDiskBBQBulkWriter). Removing the docsWriter calls from it corrupts the on-disk layout and prevents the reader from locating the doc IDs at the start of the block.

> agreed, this is likely to affect recall, @ah89 did you check with KnnIndexTester ?

Verified with KnnIndexTester on GIST-1M (960 dims, 100K docs, 100 queries, euclidean, IVF):

| index_type | quantize_bits | recall | latency (ms) | QPS | visited |
|---|---|---|---|---|---|
| ivf | 4 | 0.56 | 0.16–0.20 | 5000–6250 | 2374 |
| ivf | 7 | 0.61 | 0.66–0.68 | 1470–1515 | 2362 |

7-bit shows higher recall (0.61 vs 0.56), as expected from reduced quantization error. The docsWriter fix is validated: if doc IDs were missing or misaligned, recall would drop to near zero.
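To illustrate why more bits should reduce quantization error, here is a minimal sketch using a plain uniform scalar quantizer. This is an assumption for illustration only: it is not the OSQ scheme DiskBBQ uses, and the class and method names are hypothetical.

```java
import java.util.Random;

/** Toy uniform scalar quantizer (hypothetical; NOT the OSQ scheme used by DiskBBQ). */
final class QuantizationErrorSketch {

    /** Quantizes each value to {@code bits} bits over [min, max], dequantizes, and returns the mean squared error. */
    static double meanSquaredError(float[] values, int bits) {
        float min = Float.POSITIVE_INFINITY;
        float max = Float.NEGATIVE_INFINITY;
        for (float v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        int levels = (1 << bits) - 1; // 15 steps for 4 bits, 127 steps for 7 bits
        float step = (max - min) / levels;
        if (step == 0f) {
            return 0.0; // all values identical, nothing is lost
        }
        double sum = 0;
        for (float v : values) {
            int code = Math.round((v - min) / step); // integer code in [0, levels]
            float restored = min + code * step;      // value the scorer would effectively see
            double err = v - restored;
            sum += err * err;
        }
        return sum / values.length;
    }

    /** Deterministic Gaussian test vector so runs are repeatable. */
    static float[] randomVector(int dims, long seed) {
        Random random = new Random(seed);
        float[] v = new float[dims];
        for (int i = 0; i < dims; i++) {
            v[i] = (float) random.nextGaussian();
        }
        return v;
    }

    public static void main(String[] args) {
        float[] v = randomVector(960, 42L);
        System.out.println("4-bit MSE: " + meanSquaredError(v, 4));
        System.out.println("7-bit MSE: " + meanSquaredError(v, 7));
    }
}
```

The step size shrinks by roughly 15/127 when going from 4 to 7 bits, so the reconstruction error drops accordingly. How much of that turns into recall depends on the dataset and the real quantizer.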

As per my previous comment, LargeBitDiskBBQBulkWriter was never instantiated before this PR (only case 7 routes to it; previously only 1/2/4 existed). The docsWriter.accept(i) calls mirror what SmallBitDiskBBQBulkWriter already does: writing doc IDs at the start of each bulk block so the reader can associate scores back to documents.

ok, the thing is that now the ESNextDiskBBQVectorsWriter always calls DiskBBQBulkWriter#fromBitSize with the blockEncodeTailVectors parameter set to true (I was in fact thinking of dropping the parameter entirely), therefore you should see LargeBitEncodedDiskBBQBulkWriter being used with 7 bits.

I see, so I will remove the docsWriter calls from the dead LargeBitDiskBBQBulkWriter (never instantiated), but will keep them on LargeBitEncodedDiskBBQBulkWriter -- the one that's actually used.
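For readers following the thread, a minimal sketch of the general idea of leading each bulk block with its doc IDs. Everything here is hypothetical (class name, int-encoded doc IDs, fixed-size byte vectors); it is not the actual DiskBBQBulkWriter layout, which also writes corrections and handles the tail differently per variant.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical block layout: [doc IDs of block][vectors of block], repeated. */
final class BulkBlockLayoutSketch {

    /** Writes blocks of up to bulkSize entries: first the block's doc IDs, then the block's vectors. */
    static byte[] write(int[] docIds, byte[][] vectors, int bulkSize) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (int start = 0; start < docIds.length; start += bulkSize) {
                int end = Math.min(start + bulkSize, docIds.length);
                for (int i = start; i < end; i++) {
                    out.writeInt(docIds[i]); // doc IDs lead the block
                }
                for (int i = start; i < end; i++) {
                    out.write(vectors[i]);   // then the quantized vectors
                }
            }
        }
        return bytes.toByteArray();
    }

    /** Reads the same layout back, pairing each vector with the doc ID read at the start of its block. */
    static Map<Integer, byte[]> read(byte[] data, int count, int dims, int bulkSize) throws IOException {
        Map<Integer, byte[]> byDoc = new LinkedHashMap<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            for (int start = 0; start < count; start += bulkSize) {
                int n = Math.min(bulkSize, count - start);
                int[] ids = new int[n];
                for (int i = 0; i < n; i++) {
                    ids[i] = in.readInt();
                }
                for (int i = 0; i < n; i++) {
                    byte[] v = new byte[dims];
                    in.readFully(v);
                    byDoc.put(ids[i], v);
                }
            }
        }
        return byDoc;
    }
}
```

If the writer skipped the doc ID section while the reader still expected it, the reader would interpret vector bytes as doc IDs, which is the kind of misalignment the recall check above is meant to catch.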

benwtrent · 2026-02-05T18:10:47Z

@ah89 I think all the blockers are finished :) You should be able to progress on this now.

tteofili

my only concern here is with the DiskBBQBulkWriter comment.

libs/simdvec/src/main/java/org/elasticsearch/simdvec/ESNextOSQVectorsScorer.java

tteofili · 2026-02-06T14:16:23Z

server/src/main/java/org/elasticsearch/index/codec/vectors/diskbbq/DiskBBQBulkWriter.java

@@ -203,6 +212,9 @@ public void writeVectors(QuantizedVectorValues qvv, CheckedIntConsumer<IOExcepti
                writeCorrections(corrections);
            }
            // write tail


agreed, this is likely to affect recall, @ah89 did you check with KnnIndexTester ?

tteofili · 2026-02-06T14:17:39Z

server/src/test/java/org/elasticsearch/index/codec/vectors/diskbbq/DiskBBQBulkWriterTests.java

+        int dimensions = 4;
+        int bulkSize = 2;


how about a couple of similar tests with a random pick within (i) arrays of allowed values and a (ii) arrays of incompatible values ?

Agree — good idea.

- Added support for 7-bit symmetric quantization in VectorScorerOSQBenchmark and ESNextOSQVectorsScorer.
- Updated the handling of quantization in various methods to accommodate the new bit size.
- Modified tests to validate the new 7-bit encoding functionality in DiskBBQBulkWriter and related classes.
- Ensured backward compatibility by maintaining existing 1, 2, and 4-bit quantization methods.

Relates to elastic#139591

…k writing process.

benwtrent

You are going to need to update MemorySegmentESNextOSQVectorsScorer

ah89 · 2026-02-12T00:29:14Z

> You are going to need to update MemorySegmentESNextOSQVectorsScorer

The 7-bit QPS went from ~1,500 to ~9,500, a ~6x speedup with negligible recall drops on the same test configuration.

| index_type | quantize_bits | recall | latency (ms) | QPS | visited |
|---|---|---|---|---|---|
| ivf | 4 | 0.56 | 0.13–0.15 | 6666–7692 | 2305 |
| ivf | 7 | 0.60 | 0.10–0.11 | 9090–10000 | 2310 |

…extOSQVectorsScorer

- Introduced MSInt7SymmetricESNextOSQVectorsScorer to handle 7-bit symmetric quantization.
- Updated MemorySegmentESNextOSQVectorsScorer to support new query/index bits combination.
- Modified PanamaESVectorizationProvider to accommodate the new scoring logic for 7-bit vectors.

tteofili · 2026-02-12T16:06:56Z

once we're satisfied with the low level impl, we might open it up to be configurable in DenseVectorFieldMapper (although this also changes query bits, not just indexed vectors bits).

benwtrent

Glad to see the perf numbers :). One concern on the 'unoptimized path'.

benwtrent · 2026-02-12T21:29:39Z

libs/simdvec/src/main/java/org/elasticsearch/simdvec/ESNextOSQVectorsScorer.java

+            int total = 0;
+            for (int i = 0; i < dimensions; i++) {
+                total += in.readByte() * q[i];
+            }
+            return total;


let's put this in its own method and there is no reason for it to be so poorly optimized.

Please, read in the entire byte array and use VectorUtil.dotProduct.

The logic was replaced with a quantized7BitScore method that bulk-reads all index vector bytes at once into a pre-allocated, reusable scratch buffer and computes the result using VectorUtil.dotProduct.
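A rough sketch of the shape of that change, under stated assumptions: `java.io.DataInputStream` stands in for Lucene's `IndexInput`, the local `dotProduct` helper stands in for `VectorUtil.dotProduct(byte[], byte[])`, and the class name is hypothetical. Only the method name `quantized7BitScore` comes from the comment above.

```java
import java.io.DataInputStream;
import java.io.IOException;

/** Hypothetical sketch: one bulk read per vector instead of one readByte() per dimension. */
final class Int7ScoreSketch {

    // Allocated once and reused for every vector to avoid per-call allocation.
    private final byte[] scratch;

    Int7ScoreSketch(int dimensions) {
        this.scratch = new byte[dimensions];
    }

    /** Scalar stand-in for a vectorized byte dot product such as Lucene's VectorUtil.dotProduct. */
    static int dotProduct(byte[] a, byte[] b) {
        int total = 0;
        for (int i = 0; i < a.length; i++) {
            total += a[i] * b[i];
        }
        return total;
    }

    /** Reads one 7-bit index vector (one byte per dimension) in a single call, then scores it. */
    int quantized7BitScore(DataInputStream in, byte[] query) throws IOException {
        in.readFully(scratch, 0, scratch.length);
        return dotProduct(scratch, query);
    }
}
```

The point of the restructuring is that the hot loop no longer interleaves I/O calls with arithmetic, so the dot product can be handed to an optimized implementation.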

tteofili

once the VectorUtil change is done, LGTM

tteofili · 2026-02-13T13:44:49Z

libs/simdvec/src/main/java/org/elasticsearch/simdvec/ESNextOSQVectorsScorer.java

+            int total = 0;
+            for (int i = 0; i < dimensions; i++) {
+                total += in.readByte() * q[i];
+            }
+            return total;


Updated the conditional logic in the PanamaESVectorizationProvider to improve readability by grouping related conditions. This change ensures that the logic for checking query and index bits is clearer and more maintainable.

Supports 7-bit quantization by introducing proper packing routines and adjusting test logic to clamp values and pass correct bit types. Fixes issues with handling 7-bit symmetric quantization and ensures consistent query vector creation and scoring. Enhances robustness of tests and vector scorer logic for 7-bit cases.

benwtrent · 2026-02-25T13:12:14Z

...hmarks/src/main/java/org/elasticsearch/benchmark/vector/scorer/VectorScorerOSQBenchmark.java

-                int addition = Short.toUnsignedInt(input.readShort());
+                int addition = input.readInt();


This doesn't break everything else? I am surprised. Seems like a huge bug as I thought we were on int in the Next scorer for a while now...

@ldematte @thecoop

It doesn't break, as it's reading garbage, but it's the same garbage each time, so the scores still match up in the benchmark test.

See #143137 for a fix
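A small sketch of why a width mismatch can go unnoticed in a benchmark. It assumes the field is stored as a 4-byte little-endian int and shows what a 2-byte read returns instead; the class and helper names are hypothetical, and the real benchmark reads from a Lucene input rather than a `ByteBuffer`.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical demo: reading a 4-byte field as an unsigned short yields a wrong but repeatable value. */
final class FieldWidthMismatchSketch {

    /** Encodes the value as a 4-byte little-endian int (the width assumed for the stored field). */
    static byte[] encode(int value) {
        return ByteBuffer.allocate(Integer.BYTES).order(ByteOrder.LITTLE_ENDIAN).putInt(value).array();
    }

    /** The mismatched read: only the two low-order bytes are consumed. */
    static int readAsUnsignedShort(byte[] data) {
        return Short.toUnsignedInt(ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN).getShort());
    }

    /** The matching read: all four bytes are consumed. */
    static int readAsInt(byte[] data) {
        return ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN).getInt();
    }

    public static void main(String[] args) {
        byte[] data = encode(70_000);
        System.out.println("as int:   " + readAsInt(data));
        System.out.println("as short: " + readAsUnsignedShort(data));
    }
}
```

Because the mismatched read is deterministic, two scorers that share it agree with each other, so a test that only compares them against each other still passes. In a real stream the short read would also leave the position two bytes off for whatever is read next.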

benwtrent

My only concern now is the weird OSQ benchmark change.

...hmarks/src/main/java/org/elasticsearch/benchmark/vector/scorer/VectorScorerOSQBenchmark.java

Unifies parameter types for quantization logic, improving clarity and reducing casting. Removes obsolete clamp helper, streamlining vector preprocessing.

tteofili · 2026-02-26T14:05:30Z

would it be possible to add the corresponding option also in KnnIndexTester ?

ldematte

I came in and checked it is "compatible" with our recent changes. Overall looks good, I left some comments to improve readability, but nothing functional

...hmarks/src/main/java/org/elasticsearch/benchmark/vector/scorer/VectorScorerOSQBenchmark.java

ldematte · 2026-02-26T17:22:47Z

.../org/elasticsearch/simdvec/internal/vectorization/MSInt7SymmetricESNextOSQVectorsScorer.java

+import java.lang.foreign.MemorySegment;
+
+/** Panamized scorer for 7-bit symmetric quantized vectors stored as a {@link MemorySegment}. */
+final class MSInt7SymmetricESNextOSQVectorsScorer extends MemorySegmentESNextOSQVectorsScorer.MemorySegmentScorer {


Nit: we settled on a naming convention for types that makes the data and query sizes explicit. This would be MSD7Q7ESNextOSQVectorsScorer (D7Q7 meaning 7 bits for index data, 7 bits for queries). It is true that we should rename the other implementation classes, but it would be nice to start using the convention for new code. CC @thecoop

ldematte · 2026-02-26T17:25:22Z

.../org/elasticsearch/simdvec/internal/vectorization/MSInt7SymmetricESNextOSQVectorsScorer.java

+import java.io.IOException;
+import java.lang.foreign.MemorySegment;
+
+/** Panamized scorer for 7-bit symmetric quantized vectors stored as a {@link MemorySegment}. */


NIT: I would call this "Vectorized", as the underlying implementation can be either Panama or Native.

ldematte · 2026-02-26T17:27:24Z

.../test/java/org/elasticsearch/simdvec/internal/vectorization/ESNextOSQVectorsScorerTests.java

        final byte[] vector = new byte[length];
-
-        final int queryBytes = length * (queryBits / indexBits);
+        final int queryBytes = indexBits == 7 ? dimensions : length * (queryBits / indexBits);


They are probably identical, but shouldn't this be length instead of dimensions? For consistency and readability

Done. For 7-bit quantization, length and dimensions are identical (each dimension is stored as one byte), but you are right — using length is more consistent and makes the intent clearer.

ldematte · 2026-02-26T17:31:06Z

.../test/java/org/elasticsearch/simdvec/internal/vectorization/ESNextOSQVectorsScorerTests.java

                // padding bytes.
                final IndexInput slice = in.slice("test", 0, (long) length * numVectors);
-                final var defaultScorer = defaultProvider().newESNextOSQVectorsScorer(
+                byte effectiveQueryBits = indexBits == 7 ? (byte) 7 : queryBits;


Instead of doing that multiple times, let's make queryBits non static and initialize it in the ctor.

Even better would be to pass queryBits in the ctor too and generate the correct combinations in parametersFactory(), so that when/if we extend it (e.g. to support D4Q7 or other combinations) we need to make minimal changes.
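One possible shape for that suggestion, as a sketch only. It assumes the supported pairs are 1, 2, and 4 index bits with 4-bit queries plus 7/7 (per the discussion in this PR), and it returns plain `Object[]` rows in the style a parameterized test factory typically consumes. The class name and the `label` helper are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical generator of (indexBits, queryBits) test parameters. */
final class BitCombinationsSketch {

    /** Each row is {indexBits, queryBits}; adding a pair such as D4Q7 later is a one-line change. */
    static List<Object[]> parametersFactory() {
        List<Object[]> params = new ArrayList<>();
        for (byte indexBits : new byte[] {1, 2, 4}) {
            params.add(new Object[] {indexBits, (byte) 4}); // asymmetric: 4-bit queries
        }
        params.add(new Object[] {(byte) 7, (byte) 7});      // symmetric 7-bit
        return params;
    }

    /** Formats a row using the D<data bits>Q<query bits> naming convention, e.g. "D7Q7". */
    static String label(Object[] row) {
        return "D" + row[0] + "Q" + row[1];
    }

    public static void main(String[] args) {
        for (Object[] row : parametersFactory()) {
            System.out.println(label(row));
        }
    }
}
```

With both values supplied per row, the test constructor can store them as instance fields and the `indexBits == 7 ? ... : ...` special cases disappear from the test bodies.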

ldematte · 2026-02-26T17:35:04Z

...estFixtures/java/org/elasticsearch/simdvec/internal/vectorization/VectorScorerTestUtils.java

        );
        final byte[] quantizeQuery = new byte[queryVectorPackedLengthInBytes];
-        ESVectorUtil.transposeHalfByte(scratch, quantizeQuery);
+        if (queryBits == 7) {


We don't need this IF, packQuery for 4 bits is ESVectorUtil.transposeHalfByte()

benwtrent

Let's have lorenzo or simon approve to make sure benchmarking is good. But the rest looks good to me :).

Removes special-case logic for 7-bit quantization by treating queryBits uniformly across tests and scorer selection. Standardizes scorer naming and updates test parameter generation to ensure comprehensive coverage of query/index bit combinations. Streamlines query data creation for improved maintainability.

Extends quantization options to include 7 bits for IVF vectors, enabling improved flexibility and potential accuracy in benchmarking and testing. Updates validation to accept 7-bit quantization and refactors relevant call sites to use the new option. Relates to improved quantization capabilities.

ldematte

Changes to benchmarking now are minimal as the PR includes Simon's fix. LGTM

tteofili

LGTM

since this is the first work that affects the quantization of the query (it is always 4-bits in all the other scenarios), we might have to adjust the way these config options are exposed in dense_vector fields. Currently, we expose the bits config in index_options (1, 2, 4 are valid values), which relates to the bits used for doc embeddings; while we can just add 7 there too, I feel like, for the long run, we need to make sure those options can be consistent (e.g., once we implement 2-2, how do we expose that?).

benwtrent · 2026-03-02T15:58:10Z

@ah89 we still need to expose this as bits: 7 in the index settings. Please add that.

* Enhance vector scoring with 7-bit quantization support
  - Added support for 7-bit symmetric quantization in VectorScorerOSQBenchmark and ESNextOSQVectorsScorer.
  - Updated the handling of quantization in various methods to accommodate the new bit size.
  - Modified tests to validate the new 7-bit encoding functionality in DiskBBQBulkWriter and related classes.
  - Ensured backward compatibility by maintaining existing 1, 2, and 4-bit quantization methods.

  Relates to elastic#139591
* Remove unused docsWriter calls in DiskBBQBulkWriter to streamline bulk writing process.
* Add support for 7-bit symmetric quantized vectors in MemorySegmentESNextOSQVectorsScorer
  - Introduced MSInt7SymmetricESNextOSQVectorsScorer to handle 7-bit symmetric quantization.
  - Updated MemorySegmentESNextOSQVectorsScorer to support new query/index bits combination.
  - Modified PanamaESVectorizationProvider to accommodate the new scoring logic for 7-bit vectors.
* Add scratch byte array and refactor quantized 7-bit scoring method
* [CI] Auto commit changes from spotless
* Refactor condition in PanamaESVectorizationProvider for clarity

  Updated the conditional logic in the PanamaESVectorizationProvider to improve readability by grouping related conditions. This change ensures that the logic for checking query and index bits is clearer and more maintainable.
* Add clamping to 7-bit for binary vectors and queries in VectorScorerOSQBenchmark

  This update introduces a new method, clampTo7Bit, which ensures that binary vectors and queries are clamped to 7 bits when the bits parameter is set to 7. This change enhances the accuracy of the benchmarking process by preventing overflow in the generated byte arrays.
* Handles 7-bit quantization and packing for vectors

  Supports 7-bit quantization by introducing proper packing routines and adjusting test logic to clamp values and pass correct bit types. Fixes issues with handling 7-bit symmetric quantization and ensures consistent query vector creation and scoring. Enhances robustness of tests and vector scorer logic for 7-bit cases.
* Switches bits to byte and removes unused method

  Unifies parameter types for quantization logic, improving clarity and reducing casting. Removes obsolete clamp helper, streamlining vector preprocessing.
* Unifies query bits handling and simplifies scorer

  Removes special-case logic for 7-bit quantization by treating queryBits uniformly across tests and scorer selection. Standardizes scorer naming and updates test parameter generation to ensure comprehensive coverage of query/index bit combinations. Streamlines query data creation for improved maintainability.
* [CI] Auto commit changes from spotless
* Adds 7-bit quantization support for IVF index

  Extends quantization options to include 7 bits for IVF vectors, enabling improved flexibility and potential accuracy in benchmarking and testing. Updates validation to accept 7-bit quantization and refactors relevant call sites to use the new option. Relates to improved quantization capabilities.

---------

Co-authored-by: Benjamin Trent <ben.w.trent@gmail.com>
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>

Enables the use of 7-bit quantization for indexed vectors in BBQ, improving flexibility for disk-based vector fields. Updates validation, documentation, and tests to accommodate the new option and ensures correct parameter handling throughout the codebase. Relates to elastic#141183

* Adds support for 7-bit quantization in BBQ index

  Enables the use of 7-bit quantization for indexed vectors in BBQ, improving flexibility for disk-based vector fields. Updates validation, documentation, and tests to accommodate the new option and ensures correct parameter handling throughout the codebase. Relates to #141183
* Updates quantization encoding selection logic

  Switches from using a shifted ID to directly interpreting bits for quantization encoding selection, improving accuracy and consistency in vector format initialization.

ah89 added the >feature, :Search Relevance/Vectors, Team:Search Relevance, and v9.4.0 labels Jan 23, 2026

ah89 requested review from benwtrent and tteofili January 23, 2026 14:54

benwtrent reviewed Jan 23, 2026

tteofili reviewed Feb 6, 2026

ah89 force-pushed the feature/bbq-multibit-quantization-139591 branch from f3feeec to 6acfa0f Compare February 11, 2026 01:21

ah89 added 2 commits February 11, 2026 10:37

Merge branch 'main' into feature/bbq-multibit-quantization-139591

17a960e

Remove unused docsWriter calls in DiskBBQBulkWriter to streamline bul…

5589e32

…k writing process.

benwtrent requested changes Feb 11, 2026

ah89 added 2 commits February 11, 2026 16:39

Merge branch 'main' into feature/bbq-multibit-quantization-139591

58352e1

ah89 requested review from benwtrent and tteofili February 12, 2026 16:18

Merge branch 'main' into feature/bbq-multibit-quantization-139591

7eb4f44

benwtrent reviewed Feb 12, 2026

tteofili approved these changes Feb 13, 2026

Add scratch byte array and refactor quantized 7-bit scoring method

cd27e9c

ah89 requested a review from benwtrent February 13, 2026 23:47

ah89 and others added 4 commits February 17, 2026 17:38

Merge branch 'main' into feature/bbq-multibit-quantization-139591

be47645

[CI] Auto commit changes from spotless

3bb0df3

Merge branch 'main' into feature/bbq-multibit-quantization-139591

169c37a

benwtrent and others added 3 commits February 18, 2026 10:42

Merge branch 'main' into feature/bbq-multibit-quantization-139591

b41379a

Merge branch 'main' into feature/bbq-multibit-quantization-139591

f9329e3

benwtrent reviewed Feb 25, 2026

thecoop reviewed Feb 25, 2026

...hmarks/src/main/java/org/elasticsearch/benchmark/vector/scorer/VectorScorerOSQBenchmark.java

ah89 added 2 commits February 25, 2026 10:09

Switches bits to byte and removes unused method

3409834

Unifies parameter types for quantization logic, improving clarity and reducing casting. Removes obsolete clamp helper, streamlining vector preprocessing.

Merge branch 'main' into feature/bbq-multibit-quantization-139591

c9cc38c

ah89 enabled auto-merge (squash) February 26, 2026 04:37

ah89 requested review from benwtrent, thecoop and tteofili February 26, 2026 04:37

ldematte reviewed Feb 26, 2026

benwtrent approved these changes Feb 26, 2026

ah89 disabled auto-merge February 27, 2026 00:29

ah89 and others added 4 commits February 27, 2026 02:14

[CI] Auto commit changes from spotless

1c4c310

Merge branch 'main' into feature/bbq-multibit-quantization-139591

70e594d

ldematte approved these changes Feb 27, 2026

tteofili approved these changes Feb 27, 2026

ah89 merged commit 2058a7e into elastic:main Feb 27, 2026
35 checks passed

ah89 mentioned this pull request Mar 4, 2026

Adds support for 7-bit quantization in BBQ index #143611

Merged
