Fix GPU merge ClassCastException with wrapped directories#143531
Conversation
MemorySegmentUtils directly cast the Directory to FSDirectory,
but Elasticsearch wraps directories in Store$StoreDirectory
(which extends FilterDirectory, not FSDirectory). When vector
data exceeds MMapDirectory's max chunk size, the fallback path
hit this cast and threw a ClassCastException, failing all
shard merges.
ClassCastException:
Store$StoreDirectory cannot be cast to FSDirectory
(Store$StoreDirectory is in module
org.elasticsearch.server; FSDirectory is in module
org.apache.lucene.core)
Use FilterDirectory.unwrap() to peel through wrapper layers
before casting.
Also fix log message formatting for segment size values.
|
Hi @mayya-sharipova, I've created a changelog YAML for you. |
|
Pinging @elastic/es-search-relevance (Team:Search Relevance) |
libs/gpu-codec/src/main/java/org/elasticsearch/gpu/codec/MemorySegmentUtils.java
Show resolved
Hide resolved
ldematte
left a comment
There was a problem hiding this comment.
Change looks good, but can you please add an IT test (or possibly a javaRestTest?) that tests this is working as intended?
|
The changes look good to me. I wonder if we should start to conditionally add some of these "known" wrappers to our unit tests. It's quite annoying to have to catch these things with an IT - even though an IT is the only definitive way to know that things actually work as expected. |
That's an interesting idea; we could add some utilities, perhaps a base class, and methods to get dir/in/out implementations with random filters from a set of known ones (and maybe even some unknown ones, to be robust in case we introduce new ones?) |
I opened this small PR with an initial step towards this. It also has some additions that reproduce the bug fixed by this PR. It's not a replacement for an IT, but a simple step towards closing the gap between unit tests and IT! ( @mayya-sharipova, feel free to fold anything from my PR into this one, or we can merge it separately as a follow up ) |
…ck test In the else branch of mergeByteVectorField, each int8_hnsw vector occupies dims+4 bytes (dims quantized bytes + 4-byte scalar quantization correction constant). The previous code read only dims bytes per vector, causing all vectors after the first to be read from the wrong offset and corrupting the GPU graph. Also splits the merge fallback integration test into two separate test methods (hnsw and int8_hnsw), adds before/after quality assertion requiring at least 4/10 KNN results to overlap across the merge boundary, and fixes flush to use flushAndRefresh so documents are visible to the NRT reader.
|
@ldematte @ChrisHegarty
Should we replace |
1. Use FilterIndexInput.unwrap instead of unwrapOnlyTest in ES92GpuHnswVectorsWriter. unwrapOnlyTest stops at ES's RandomAccessIndexInput wrapper, so the instanceof MemorySegmentAccessInput check always failed. unwrap strips all FilterIndexInput layers, reaching MMapIndexInput. 2. Use FileChannel.map(MapMode, long, long) in MemorySegmentUtils.createFileBackedMemorySegment instead of the Java 21 Arena-based overload. The ES entitlement proxy wraps FileChannel without overriding the newer method, causing UnsupportedOperationException. MappedByteBuffer is wrapped via MemorySegment.ofBuffer() and kept in the holder to prevent GC from unmapping it prematurely. TODO: revert to Arena-based map once entitlement supports it. 3. Add file read_write entitlement on the indices data path for org.elasticsearch.gpu, required by FileChannel.open on the temp file created during the MemorySegment fallback.
| MemorySegment mapped = fc.map(FileChannel.MapMode.READ_ONLY, 0L, dataSize, arena); | ||
| return new FileBackedMemorySegmentHolder(mapped, arena, dataFile); | ||
| } | ||
| try (FileChannel fc = FileChannel.open(dataFile, Set.of(READ))) { |
There was a problem hiding this comment.
@ldematte This was written by AI. Can you please check this is correct.
With this change and changes to entitlement I was able to successful execute GPuMergeFallBackIT, and could see in logs that we fall back to memory mapped file
libs/gpu-codec/src/main/java/org/elasticsearch/gpu/codec/ES92GpuHnswVectorsWriter.java
Outdated
Show resolved
Hide resolved
| try (FileChannel fc = FileChannel.open(dataFile, Set.of(READ))) { | ||
| // TODO: switch to FileChannel.map(MapMode, long, long, Arena) once the ES entitlement system supports it. | ||
| // The Arena-based overload currently throws UnsupportedOperationException because the entitlement proxy | ||
| // wraps FileChannel without overriding the Java 21 method. The MappedByteBuffer is kept in the holder |
There was a problem hiding this comment.
This is very strange. Entitlements do not work like that, they do not wrap things. The change to the policy is absolutely correct, but this should be not needed. Have you tried without this change (but with the policy)?
There was a problem hiding this comment.
I've double checked; we actually do not cover map at all, so this change is not relevant entitlement wise. The policy change also should not be needed, but that's a bug and it should be, so I would leave that change in place. If FileChannel.map throws UnsupportedOperationException is for another reason. Are you sure you are executing these tests with runtime >= 22?
There was a problem hiding this comment.
- the API is JDK 22+
- the default implementation throws UnsupportedOperationException.
I looked at the JDK code, and this is kind of misleading: the only implementation ofFileChannelissun.nio.ch.FileChannelImpl. But with a caveat: a customFileSystemProviderimplementation can return null fromnewFileChannel(). I suspect we might have some test filesystem here at play (maybe just in tests, maybe in production code too?)
That said, I'm happy to keep the ByteBuffer variant of map, but the comment is bogus so I'd remove it.
There was a problem hiding this comment.
I think I found it; it's probably lucene and wrapping/unwrapping again. FileChannel provides a default implementation of map with the Arena that simply throws UnsupportedOperationException. FileChannel is abstract, and its only implementation (sun.nio.ch.FileChannelImpl) overrides it. BUT any FileChannel subclass that doesn't explicitly override the Arena variant will throw.
Lucene's FilterFileChannel overrides only the 3-arg map variant. So if any wrapped FileChannel based on FilterFileChannel gets called, you get UnsupportedOperationException.
Lucene "knows" this (https://github.com/apache/lucene/blob/ba7f659b1d81dabbcb62bc3fff763f03d601c5fb/lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java#L326) and unwraps before; the proper fix would be to do the same (Unwrappable.unwrapAll).
There was a problem hiding this comment.
Well, the proper fix would be for FilterFileChannel to override the Arena overload too, but...
There was a problem hiding this comment.
(I think I've beaten AI this time :D )
There was a problem hiding this comment.
Thanks for the pointers, you indeed have bitter AI :)
I've used your suggestion: Path unwrappedPath = Unwrappable.unwrapAll(dataFile); and everything works now!!!
Unwrap the data file path with Unwrappable.unwrapAll before opening FileChannel so test-only filesystem layers are stripped and the real FileChannelImpl is used. This replaces the MappedByteBuffer workaround with deterministic Arena-based memory mapping. Also use skipBytes instead of reading into a throwaway buffer when skipping scalar quantization correction bytes during int8_hnsw merge.
0bfc154 to
4c1a806
Compare
…p-filter-directory
Use CuVSGPUSupport.instance().isSupported() instead of GPUSupport.isSupported() since isSupported() is an instance method.
|
@elasticmachine run Elasticsearch Serverless Checks |
💚 Backport successful
|
…3531) MemorySegmentUtils directly cast the Directory to FSDirectory, but Elasticsearch wraps directories in Store$StoreDirectory (which extends FilterDirectory, not FSDirectory). When vector data exceeds MMapDirectory's max chunk size, the fallback path hit this cast and threw a ClassCastException, failing all shard merges. ClassCastException: Store$StoreDirectory cannot be cast to FSDirectory (Store$StoreDirectory is in module org.elasticsearch.server; FSDirectory is in module org.apache.lucene.core) Use FilterDirectory.unwrap() to peel through wrapper layers before casting. Also fix log message formatting for segment size values. Relates to elastic#141872
) (#143924) * Fix GPU merge ClassCastException with wrapped directories (#143531) MemorySegmentUtils directly cast the Directory to FSDirectory, but Elasticsearch wraps directories in Store$StoreDirectory (which extends FilterDirectory, not FSDirectory). When vector data exceeds MMapDirectory's max chunk size, the fallback path hit this cast and threw a ClassCastException, failing all shard merges. ClassCastException: Store$StoreDirectory cannot be cast to FSDirectory (Store$StoreDirectory is in module org.elasticsearch.server; FSDirectory is in module org.apache.lucene.core) Use FilterDirectory.unwrap() to peel through wrapper layers before casting. Also fix log message formatting for segment size values. Relates to #141872 * Fix compilation error: update CuVSGPUSupport to GPUSupport in GPUMergeFallbackIT
…143563) Elasticsearch wraps directories in multiple FilterDirectory layers in production (e.g. StoreDirectory -> ByteSizeCachingDirectory -> MMapDirectory), but unit tests typically pass bare directory instances. This mismatch allowed the ClassCastException fixed in #143531 to go undetected by unit tests. This PR adds ESTestCase.maybeWrapDirectoryInFilterDirectory, a randomized test utility that wraps a Directory in 0-3 FilterDirectory layers. It can be used by any test where production code receives a Directory that may be wrapped. And we can expand up it later with specific ES wrapper types, if necessary, but minimally FilterDirectory is enough to catch most of these cases. MemorySegmentUtilsTests is updated to use this utility in all tests that pass a Directory to getContiguousMemorySegment / getContiguousPackedMemorySegment. Additionally, new tests are also added for getContiguousPackedMemorySegment, which previously had no direct test coverage. relates #143531
…lastic#143563) Elasticsearch wraps directories in multiple FilterDirectory layers in production (e.g. StoreDirectory -> ByteSizeCachingDirectory -> MMapDirectory), but unit tests typically pass bare directory instances. This mismatch allowed the ClassCastException fixed in elastic#143531 to go undetected by unit tests. This PR adds ESTestCase.maybeWrapDirectoryInFilterDirectory, a randomized test utility that wraps a Directory in 0-3 FilterDirectory layers. It can be used by any test where production code receives a Directory that may be wrapped. And we can expand up it later with specific ES wrapper types, if necessary, but minimally FilterDirectory is enough to catch most of these cases. MemorySegmentUtilsTests is updated to use this utility in all tests that pass a Directory to getContiguousMemorySegment / getContiguousPackedMemorySegment. Additionally, new tests are also added for getContiguousPackedMemorySegment, which previously had no direct test coverage. relates elastic#143531
…lastic#143563) Elasticsearch wraps directories in multiple FilterDirectory layers in production (e.g. StoreDirectory -> ByteSizeCachingDirectory -> MMapDirectory), but unit tests typically pass bare directory instances. This mismatch allowed the ClassCastException fixed in elastic#143531 to go undetected by unit tests. This PR adds ESTestCase.maybeWrapDirectoryInFilterDirectory, a randomized test utility that wraps a Directory in 0-3 FilterDirectory layers. It can be used by any test where production code receives a Directory that may be wrapped. And we can expand up it later with specific ES wrapper types, if necessary, but minimally FilterDirectory is enough to catch most of these cases. MemorySegmentUtilsTests is updated to use this utility in all tests that pass a Directory to getContiguousMemorySegment / getContiguousPackedMemorySegment. Additionally, new tests are also added for getContiguousPackedMemorySegment, which previously had no direct test coverage. relates elastic#143531
…lastic#143563) Elasticsearch wraps directories in multiple FilterDirectory layers in production (e.g. StoreDirectory -> ByteSizeCachingDirectory -> MMapDirectory), but unit tests typically pass bare directory instances. This mismatch allowed the ClassCastException fixed in elastic#143531 to go undetected by unit tests. This PR adds ESTestCase.maybeWrapDirectoryInFilterDirectory, a randomized test utility that wraps a Directory in 0-3 FilterDirectory layers. It can be used by any test where production code receives a Directory that may be wrapped. And we can expand up it later with specific ES wrapper types, if necessary, but minimally FilterDirectory is enough to catch most of these cases. MemorySegmentUtilsTests is updated to use this utility in all tests that pass a Directory to getContiguousMemorySegment / getContiguousPackedMemorySegment. Additionally, new tests are also added for getContiguousPackedMemorySegment, which previously had no direct test coverage. relates elastic#143531
MemorySegmentUtils directly cast the Directory to FSDirectory,
but Elasticsearch wraps directories in Store$StoreDirectory
(which extends FilterDirectory, not FSDirectory). When vector
data exceeds MMapDirectory's max chunk size, the fallback path
hit this cast and threw a ClassCastException, failing all shard merges.
ClassCastException:
Store$StoreDirectory cannot be cast to FSDirectory
(Store$StoreDirectory is in module
org.elasticsearch.server; FSDirectory is in module
org.apache.lucene.core)
Use FilterDirectory.unwrap() to peel through wrapper layers before casting.
Also fix log message formatting for segment size values.
Relates to #141872