Add int7 support for DiskBBQ and tests#141183
Conversation
|
Pinging @elastic/es-search-relevance (Team:Search Relevance) |
benwtrent
left a comment
There was a problem hiding this comment.
What does recall & latency look like?
Good first pass here.
| private final ESNextOSQVectorsScorer osqVectorsScorer; | ||
| private final ES92Int7VectorsScorer int7VectorsScorer; |
There was a problem hiding this comment.
we should do this. please augment the OSQ scorer. Flipping between two of them here will lead to confusion.
...main/java/org/elasticsearch/index/codec/vectors/diskbbq/next/ESNextDiskBBQVectorsReader.java
Outdated
Show resolved
Hide resolved
| @@ -203,6 +212,9 @@ public void writeVectors(QuantizedVectorValues qvv, CheckedIntConsumer<IOExcepti | |||
| writeCorrections(corrections); | |||
| } | |||
| // write tail | |||
There was a problem hiding this comment.
And this doesn't break the scoring? none of the other bits encode the tail as blocks yet.
//cc @tteofili
There was a problem hiding this comment.
agreed, this is likely to affect recall, @ah89 did you check with KnnIndexTester ?
There was a problem hiding this comment.
Since LargeBitDiskBBQBulkWriter was seemingly never instantiated before case 7 was introduced—with cases 1, 2, and 4 handled by SmallBitDiskBBQBulkWriter—removing it corrupts the on-disk layout, preventing the reader from locating the doc IDs at the start of the block.
There was a problem hiding this comment.
agreed, this is likely to affect recall, @ah89 did you check with
KnnIndexTester?
Verified with KnnIndexTester on GIST-1M (960 dims, 100K docs, 100 queries, euclidean, IVF):
| index_type | quantize_bits | recall | latency (ms) | QPS | visited |
|---|---|---|---|---|---|
| ivf | 4 | 0.56 | 0.16–0.20 | 5000–6250 | 2374 |
| ivf | 7 | 0.61 | 0.66–0.68 | 1470–1515 | 2362 |
7-bit shows higher recall (0.61 vs 0.56) as expected from reduced quantization error. The docsWriter fix is validated — if doc IDs were missing or misaligned, recall would drop to near zero.
as per my previous comment LargeBitDiskBBQBulkWriter was never instantiated before this PR (only case 7 routes to it, previously only 1/2/4 existed). The docsWriter.accept(i) calls mirror what SmallBitDiskBBQBulkWriter already does — writing doc IDs at the start of each bulk blokc so the reader can associate scores back to documents.
There was a problem hiding this comment.
ok, the thing is that now the ESNextDiskBBQVectorsWriter always calls DiskBBQBulkWriter#fromBitSize with the blockEncodeTailVectors parameter set to true (I was in fact thinking of dropping the parameter entirely), therefore you should see LargeBitEncodedDiskBBQBulkWriter being used with 7 bits.
There was a problem hiding this comment.
I see, so I will remove the docsWriter calls from the dead LargeBitDiskBBQBulkWriter (never instantiated), but will keep them on LargeBitEncodedDiskBBQBulkWriter -- the one that's actually used.
|
@ah89 I think all the blockers are finished :) You should be able to progress on this now. |
tteofili
left a comment
There was a problem hiding this comment.
my only concern here is with the DiskBBQBulkWriter comment.
libs/simdvec/src/main/java/org/elasticsearch/simdvec/ESNextOSQVectorsScorer.java
Outdated
Show resolved
Hide resolved
| @@ -203,6 +212,9 @@ public void writeVectors(QuantizedVectorValues qvv, CheckedIntConsumer<IOExcepti | |||
| writeCorrections(corrections); | |||
| } | |||
| // write tail | |||
There was a problem hiding this comment.
agreed, this is likely to affect recall, @ah89 did you check with KnnIndexTester ?
| int dimensions = 4; | ||
| int bulkSize = 2; |
There was a problem hiding this comment.
how about a couple of similar tests with a random pick within (i) arrays of allowed values and a (ii) arrays of incompatible values ?
- Added support for 7-bit symmetric quantization in VectorScorerOSQBenchmark and ESNextOSQVectorsScorer. - Updated the handling of quantization in various methods to accommodate the new bit size. - Modified tests to validate the new 7-bit encoding functionality in DiskBBQBulkWriter and related classes. - Ensured backward compatibility by maintaining existing 1, 2, and 4-bit quantization methods. Relates to elastic#139591
f3feeec to
6acfa0f
Compare
benwtrent
left a comment
There was a problem hiding this comment.
You are going to need to update MemorySegmentESNextOSQVectorsScorer
The 7-bit QPS went from ~1,500 to ~9,500, a ~6x speedup with negligible recall drops on the same test configuration.
|
…extOSQVectorsScorer - Introduced MSInt7SymmetricESNextOSQVectorsScorer to handle 7-bit symmetric quantization. - Updated MemorySegmentESNextOSQVectorsScorer to support new query/index bits combination. - Modified PanamaESVectorizationProvider to accommodate the new scoring logic for 7-bit vectors.
|
once we're satisfied with the low level impl, we might open it up to be configurable in DenseVectorFieldMapper (although this also changes query bits, not just indexed vectors bits). |
benwtrent
left a comment
There was a problem hiding this comment.
Glad to see the perf numbers :). One concern on the 'unoptimized path'.
| int total = 0; | ||
| for (int i = 0; i < dimensions; i++) { | ||
| total += in.readByte() * q[i]; | ||
| } | ||
| return total; |
There was a problem hiding this comment.
let's put this in its own method and there is no reason for it to be so poorly optimized.
Please, read in the entire byte array and use VectorUtil.dotProduct.
There was a problem hiding this comment.
The logic replaced with quantized7BitScore method that bulk-reads all index vector bytes at once into a pre-allocated reusable scratch buffer and computes the result using VectorUtil.dotProduct
| int total = 0; | ||
| for (int i = 0; i < dimensions; i++) { | ||
| total += in.readByte() * q[i]; | ||
| } | ||
| return total; |
Updated the conditional logic in the PanamaESVectorizationProvider to improve readability by grouping related conditions. This change ensures that the logic for checking query and index bits is clearer and more maintainable.
Supports 7-bit quantization by introducing proper packing routines and adjusting test logic to clamp values and pass correct bit types. Fixes issues with handling 7-bit symmetric quantization and ensures consistent query vector creation and scoring. Enhances robustness of tests and vector scorer logic for 7-bit cases.
| int addition = Short.toUnsignedInt(input.readShort()); | ||
| int addition = input.readInt(); |
There was a problem hiding this comment.
It doesn't break, as it's reading garbage, but it's the same garbage each time, so the scores still match up in the benchmark test.
benwtrent
left a comment
There was a problem hiding this comment.
My only concern now is the weird OSQ benchmark change.
...hmarks/src/main/java/org/elasticsearch/benchmark/vector/scorer/VectorScorerOSQBenchmark.java
Outdated
Show resolved
Hide resolved
Unifies parameter types for quantization logic, improving clarity and reducing casting. Removes obsolete clamp helper, streamlining vector preprocessing.
|
would it be possible to add the corresponding option also in |
ldematte
left a comment
There was a problem hiding this comment.
I came in and checked it is "compatible" with our recent changes. Overall looks good, I left some comments to improve readability, but nothing functional
...hmarks/src/main/java/org/elasticsearch/benchmark/vector/scorer/VectorScorerOSQBenchmark.java
Outdated
Show resolved
Hide resolved
| import java.lang.foreign.MemorySegment; | ||
|
|
||
| /** Panamized scorer for 7-bit symmetric quantized vectors stored as a {@link MemorySegment}. */ | ||
| final class MSInt7SymmetricESNextOSQVectorsScorer extends MemorySegmentESNextOSQVectorsScorer.MemorySegmentScorer { |
There was a problem hiding this comment.
Nit: we settled on a convention for type names, to explicit the data and query sizes. This would be MSD7Q7ESNextOSQVectorsScorer (D7Q7 meaning: 7 bits for index data, 7 bits for queries); it is true we should rename the other implementation classes, but it would be nice to start using it for new code. CC @thecoop
| import java.io.IOException; | ||
| import java.lang.foreign.MemorySegment; | ||
|
|
||
| /** Panamized scorer for 7-bit symmetric quantized vectors stored as a {@link MemorySegment}. */ |
There was a problem hiding this comment.
NIT: I would call this "Vectorized", as the underlying implementation can be either Panama or Native.
| final byte[] vector = new byte[length]; | ||
|
|
||
| final int queryBytes = length * (queryBits / indexBits); | ||
| final int queryBytes = indexBits == 7 ? dimensions : length * (queryBits / indexBits); |
There was a problem hiding this comment.
They are probably identical, but shouldn't this be length instead of dimensions? For consistency and readability
There was a problem hiding this comment.
Done. For 7-bit quantization, length and dimensions are identical (each dimension is stored as one byte), but you are right — using length is more consistent and makes the intent clearer.
| // padding bytes. | ||
| final IndexInput slice = in.slice("test", 0, (long) length * numVectors); | ||
| final var defaultScorer = defaultProvider().newESNextOSQVectorsScorer( | ||
| byte effectiveQueryBits = indexBits == 7 ? (byte) 7 : queryBits; |
There was a problem hiding this comment.
Instead of doing that multiple times, let's make queryBits non static and initialize it in the ctor.
There was a problem hiding this comment.
Even better it would be to pass queryBits in the ctor too and generate the correct combinations in parametersFactory(), so when/if we extend it (e.g. to support D4Q7 or other combinations) we need to make minimal changes.
| ); | ||
| final byte[] quantizeQuery = new byte[queryVectorPackedLengthInBytes]; | ||
| ESVectorUtil.transposeHalfByte(scratch, quantizeQuery); | ||
| if (queryBits == 7) { |
There was a problem hiding this comment.
We don't need this IF, packQuery for 4 bits is ESVectorUtil.transposeHalfByte()
benwtrent
left a comment
There was a problem hiding this comment.
Let's have lorenzo or simon approve to make sure benchmarking is good. But the rest looks good to me :).
Removes special-case logic for 7-bit quantization by treating queryBits uniformly across tests and scorer selection. Standardizes scorer naming and updates test parameter generation to ensure comprehensive coverage of query/index bit combinations. Streamlines query data creation for improved maintainability.
Extends quantization options to include 7 bits for IVF vectors, enabling improved flexibility and potential accuracy in benchmarking and testing. Updates validation to accept 7-bit quantization and refactors relevant call sites to use the new option. Relates to improved quantization capabilities.
ldematte
left a comment
There was a problem hiding this comment.
Changes to benchmarking now are minimal as the PR includes Simon's fix. LGTM
tteofili
left a comment
There was a problem hiding this comment.
LGTM
since this is the first work that affects the quantization of the query (it is always 4-bits in all the other scenarios), we might have to adjust the way these config options are exposed in dense_vector fields. Currently, we expose the bits config in index_options (1, 2, 4 are valid values), which relates to the bits used for doc embeddings; while we can just add 7 there too, I feel like, for the long run, we need to make sure those options can be consistent (e.g., once we implement 2-2, how do we expose that?).
|
@ah89 we still need to expose this as |
* Enhance vector scoring with 7-bit quantization support - Added support for 7-bit symmetric quantization in VectorScorerOSQBenchmark and ESNextOSQVectorsScorer. - Updated the handling of quantization in various methods to accommodate the new bit size. - Modified tests to validate the new 7-bit encoding functionality in DiskBBQBulkWriter and related classes. - Ensured backward compatibility by maintaining existing 1, 2, and 4-bit quantization methods. Relates to elastic#139591 * Remove unused docsWriter calls in DiskBBQBulkWriter to streamline bulk writing process. * Add support for 7-bit symmetric quantized vectors in MemorySegmentESNextOSQVectorsScorer - Introduced MSInt7SymmetricESNextOSQVectorsScorer to handle 7-bit symmetric quantization. - Updated MemorySegmentESNextOSQVectorsScorer to support new query/index bits combination. - Modified PanamaESVectorizationProvider to accommodate the new scoring logic for 7-bit vectors. * Add scratch byte array and refactor quantized 7-bit scoring method * [CI] Auto commit changes from spotless * Refactor condition in PanamaESVectorizationProvider for clarity Updated the conditional logic in the PanamaESVectorizationProvider to improve readability by grouping related conditions. This change ensures that the logic for checking query and index bits is clearer and more maintainable. * Add clamping to 7-bit for binary vectors and queries in VectorScorerOSQBenchmark This update introduces a new method, clampTo7Bit, which ensures that binary vectors and queries are clamped to 7 bits when the bits parameter is set to 7. This change enhances the accuracy of the benchmarking process by preventing overflow in the generated byte arrays. * Handles 7-bit quantization and packing for vectors Supports 7-bit quantization by introducing proper packing routines and adjusting test logic to clamp values and pass correct bit types. Fixes issues with handling 7-bit symmetric quantization and ensures consistent query vector creation and scoring. Enhances robustness of tests and vector scorer logic for 7-bit cases. * Switches bits to byte and removes unused method Unifies parameter types for quantization logic, improving clarity and reducing casting. Removes obsolete clamp helper, streamlining vector preprocessing. * Unifies query bits handling and simplifies scorer Removes special-case logic for 7-bit quantization by treating queryBits uniformly across tests and scorer selection. Standardizes scorer naming and updates test parameter generation to ensure comprehensive coverage of query/index bit combinations. Streamlines query data creation for improved maintainability. * [CI] Auto commit changes from spotless * Adds 7-bit quantization support for IVF index Extends quantization options to include 7 bits for IVF vectors, enabling improved flexibility and potential accuracy in benchmarking and testing. Updates validation to accept 7-bit quantization and refactors relevant call sites to use the new option. Relates to improved quantization capabilities. --------- Co-authored-by: Benjamin Trent <ben.w.trent@gmail.com> Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
Enables the use of 7-bit quantization for indexed vectors in BBQ, improving flexibility for disk-based vector fields. Updates validation, documentation, and tests to accommodate the new option and ensures correct parameter handling throughout the codebase. Relates to elastic#141183
* Adds support for 7-bit quantization in BBQ index Enables the use of 7-bit quantization for indexed vectors in BBQ, improving flexibility for disk-based vector fields. Updates validation, documentation, and tests to accommodate the new option and ensures correct parameter handling throughout the codebase. Relates to #141183 * Updates quantization encoding selection logic Switches from using a shifted ID to directly interpreting bits for quantization encoding selection, improving accuracy and consistency in vector format initialization.
* Adds support for 7-bit quantization in BBQ index Enables the use of 7-bit quantization for indexed vectors in BBQ, improving flexibility for disk-based vector fields. Updates validation, documentation, and tests to accommodate the new option and ensures correct parameter handling throughout the codebase. Relates to elastic#141183 * Updates quantization encoding selection logic Switches from using a shifted ID to directly interpreting bits for quantization encoding selection, improving accuracy and consistency in vector format initialization.
* Adds support for 7-bit quantization in BBQ index Enables the use of 7-bit quantization for indexed vectors in BBQ, improving flexibility for disk-based vector fields. Updates validation, documentation, and tests to accommodate the new option and ensures correct parameter handling throughout the codebase. Relates to elastic#141183 * Updates quantization encoding selection logic Switches from using a shifted ID to directly interpreting bits for quantization encoding selection, improving accuracy and consistency in vector format initialization.
Enable 7-bit quantization in DiskBBQ paths.
Extend SIMD OSQ tests and OSQ benchmark for 7-bit.
Closes #139591