
Fix GPU merge ClassCastException with wrapped directories #143531

Merged
elasticsearchmachine merged 17 commits into elastic:main from mayya-sharipova:fix/gpu-merge-unwrap-filter-directory
Mar 10, 2026

Conversation

@mayya-sharipova
Contributor

MemorySegmentUtils directly cast the Directory to FSDirectory,
but Elasticsearch wraps directories in Store$StoreDirectory
(which extends FilterDirectory, not FSDirectory). When vector
data exceeds MMapDirectory's max chunk size, the fallback path
hit this cast and threw a ClassCastException, failing all shard merges.

ClassCastException:
Store$StoreDirectory cannot be cast to FSDirectory
(Store$StoreDirectory is in module
org.elasticsearch.server; FSDirectory is in module
org.apache.lucene.core)

Use FilterDirectory.unwrap() to peel through wrapper layers before casting.
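
The unwrap-then-cast pattern can be sketched as follows. This is a minimal illustration, not the code from the PR: `Directory`, `FilterDirectory` and `FSDirectory` here are simplified stand-ins that only mirror the shape of the Lucene classes, and `UnwrapBeforeCast` with its two methods is a hypothetical name.

```java
// Stand-in types that mirror the shape of Lucene's Directory hierarchy.
// They are simplified for illustration and are NOT the real Lucene classes.
abstract class Directory {}

class FSDirectory extends Directory {
    private final String path;

    FSDirectory(String path) {
        this.path = path;
    }

    String getDirectory() {
        return path;
    }
}

class FilterDirectory extends Directory {
    protected final Directory in;

    FilterDirectory(Directory in) {
        this.in = in;
    }

    // Peel off every FilterDirectory layer until a non-wrapper is reached.
    static Directory unwrap(Directory dir) {
        while (dir instanceof FilterDirectory) {
            dir = ((FilterDirectory) dir).in;
        }
        return dir;
    }
}

class UnwrapBeforeCast {
    // Buggy variant: assumes the directory is always a bare FSDirectory.
    static String pathByDirectCast(Directory dir) {
        return ((FSDirectory) dir).getDirectory();
    }

    // Fixed variant: strip wrappers first, then check the type before casting.
    static String pathAfterUnwrap(Directory dir) {
        Directory unwrapped = FilterDirectory.unwrap(dir);
        if (unwrapped instanceof FSDirectory) {
            return ((FSDirectory) unwrapped).getDirectory();
        }
        throw new IllegalArgumentException("not file-system backed: " + unwrapped.getClass().getName());
    }

    public static void main(String[] args) {
        Directory wrapped = new FilterDirectory(new FilterDirectory(new FSDirectory("/data/index")));
        System.out.println(pathAfterUnwrap(wrapped));
        try {
            pathByDirectCast(wrapped);
        } catch (ClassCastException e) {
            System.out.println("direct cast failed with ClassCastException");
        }
    }
}
```

The explicit `instanceof` check after unwrapping is a design choice of this sketch: it turns an unexpected directory type into a descriptive error instead of another `ClassCastException`.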

Also fix log message formatting for segment size values.

Relates to #141872

@elasticsearchmachine added the Team:Search Relevance label Mar 3, 2026
@elasticsearchmachine
Collaborator

Hi @mayya-sharipova, I've created a changelog YAML for you.

@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

Contributor

@ldematte ldematte left a comment


Change looks good, but can you please add an IT test (or possibly a javaRestTest?) that tests this is working as intended?

@ChrisHegarty
Contributor

The changes look good to me. I wonder if we should start to conditionally add some of these "known" wrappers to our unit tests. It's quite annoying to have to catch these things with an IT - even though an IT is the only definitive way to know that things actually work as expected.

@ldematte
Contributor

ldematte commented Mar 4, 2026

I wonder if we should start to conditionally add some of these "known" wrappers to our unit tests.

That's an interesting idea; we could add some utilities, perhaps a base class, and methods to get dir/in/out implementations with random filters from a set of known ones (and maybe even some unknown ones, to be robust in case we introduce new ones?)

@ChrisHegarty
Contributor

That's an interesting idea; we could add some utilities, perhaps a base class, and methods to get dir/in/out implementations with random filters from a set of known ones (and maybe even some unknown ones, to be robust in case we introduce new ones?)

I opened this small PR with an initial step towards this. It also has some additions that reproduce the bug fixed by this PR. It's not a replacement for an IT, but a simple step towards closing the gap between unit tests and IT! ( @mayya-sharipova, feel free to fold anything from my PR into this one, or we can merge it separately as a follow up )

@mayya-sharipova added the test-gpu label Mar 4, 2026
…ck test

In the else branch of mergeByteVectorField, each int8_hnsw vector occupies
dims+4 bytes (dims quantized bytes + 4-byte scalar quantization correction
constant). The previous code read only dims bytes per vector, causing all
vectors after the first to be read from the wrong offset and corrupting the
GPU graph.

Also splits the merge fallback integration test into two separate test methods
(hnsw and int8_hnsw), adds before/after quality assertion requiring at least
4/10 KNN results to overlap across the merge boundary, and fixes flush to
use flushAndRefresh so documents are visible to the NRT reader.
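
The stride problem described in this commit message can be sketched with plain `java.nio`. This is an assumption-laden illustration: the layout (`dims` quantized bytes followed by one 4-byte correction constant per vector) is taken from the description above, while `QuantizedVectorReader`, its methods, and the little-endian byte order are choices made for this example only.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class QuantizedVectorReader {
    // Assumed layout per vector: `dims` quantized bytes, then a 4-byte
    // correction constant, so consecutive vectors are dims + 4 bytes apart.
    static byte[][] readVectors(ByteBuffer data, int dims, int count) {
        byte[][] vectors = new byte[count][dims];
        for (int i = 0; i < count; i++) {
            data.get(vectors[i]);                          // the dims quantized bytes
            data.position(data.position() + Float.BYTES);  // skip the correction constant
        }
        return vectors;
    }

    // Buggy variant for comparison: reads dims bytes per vector and never
    // skips the correction constant, so every vector after the first is misaligned.
    static byte[][] readVectorsWithoutSkip(ByteBuffer data, int dims, int count) {
        byte[][] vectors = new byte[count][dims];
        for (int i = 0; i < count; i++) {
            data.get(vectors[i]);
        }
        return vectors;
    }

    // Builds a buffer in the assumed layout; the byte order is an assumption of this sketch.
    static ByteBuffer encode(byte[][] vectors, float[] corrections) {
        int dims = vectors[0].length;
        ByteBuffer buf = ByteBuffer.allocate(vectors.length * (dims + Float.BYTES))
            .order(ByteOrder.LITTLE_ENDIAN);
        for (int i = 0; i < vectors.length; i++) {
            buf.put(vectors[i]);
            buf.putFloat(corrections[i]);
        }
        buf.flip();
        return buf;
    }
}
```

Reading two 3-dimensional vectors with `readVectors` returns the original bytes, whereas `readVectorsWithoutSkip` starts the second vector inside the first vector's correction constant.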
@mayya-sharipova
Contributor Author

mayya-sharipova commented Mar 4, 2026

@ldematte @ChrisHegarty
I've added an Integration test.
Two findings from GPUMergeFallbackIT:

  • The test was intended to exercise the MemorySegmentUtils contiguous-mapping path, but the log shows
    "Cannot mmap merged raw vectors temporary file. IndexInput type [RandomAccessIndexInput]".
    This failure to reach the underlying MemorySegmentAccessInput appears to occur only in tests.
  • On the bright side, hitting the fallback path (where we can't reach MemorySegment) exposed a bug in the int8_hnsw else-branch: each vector is dims + 4 bytes on disk, but the code read only dims bytes per vector, corrupting alignment for all subsequent vectors. Fixed by explicitly consuming the trailing 4 correction bytes.

Should we replace FilterIndexInput.unwrapOnlyTest(slice) with FilterIndexInput.unwrap(slice) in ES92GpuHnswVectorsWriter (both mergeFloatVectorField and mergeByteVectorField) for this test to exercise MemorySegment contiguous path?

1. Use FilterIndexInput.unwrap instead of unwrapOnlyTest in
   ES92GpuHnswVectorsWriter. unwrapOnlyTest stops at ES's
   RandomAccessIndexInput wrapper, so the instanceof
   MemorySegmentAccessInput check always failed. unwrap strips
   all FilterIndexInput layers, reaching MMapIndexInput.

2. Use FileChannel.map(MapMode, long, long) in
   MemorySegmentUtils.createFileBackedMemorySegment instead of
   the Java 21 Arena-based overload. The ES entitlement proxy
   wraps FileChannel without overriding the newer method, causing
   UnsupportedOperationException. MappedByteBuffer is wrapped via
   MemorySegment.ofBuffer() and kept in the holder to prevent GC
   from unmapping it prematurely.
   TODO: revert to Arena-based map once entitlement supports it.

3. Add file read_write entitlement on the indices data path for
   org.elasticsearch.gpu, required by FileChannel.open on the
   temp file created during the MemorySegment fallback.
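
For reference, the three-argument `FileChannel.map(MapMode, long, long)` call mentioned in item 2 can be sketched as below. This is only an illustration of the `MappedByteBuffer` variant using standard JDK APIs; `MappedReadSketch` and its methods are hypothetical names, and the sketch copies the mapped bytes out rather than wrapping them in a `MemorySegment` as the commit describes.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MappedReadSketch {
    // Maps the whole file read-only and copies its contents into a byte array.
    static byte[] readViaMapping(Path file) throws IOException {
        try (FileChannel fc = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = fc.size();
            MappedByteBuffer mapped = fc.map(FileChannel.MapMode.READ_ONLY, 0L, size);
            byte[] out = new byte[(int) size];
            mapped.get(out);
            return out;
        }
    }

    // Writes the payload to a temporary file and reads it back through the mapping.
    static byte[] roundTrip(byte[] payload) {
        try {
            Path tmp = Files.createTempFile("mapped-read-sketch", ".bin");
            tmp.toFile().deleteOnExit();
            Files.write(tmp, payload);
            return readViaMapping(tmp);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A mapping created this way stays valid until the buffer is garbage-collected, which is why the commit message notes that the buffer is kept in the holder.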
MemorySegment mapped = fc.map(FileChannel.MapMode.READ_ONLY, 0L, dataSize, arena);
return new FileBackedMemorySegmentHolder(mapped, arena, dataFile);
}
try (FileChannel fc = FileChannel.open(dataFile, Set.of(READ))) {
Contributor Author


@ldematte This was written by AI. Can you please check this is correct.

With this change and the changes to the entitlement policy, I was able to successfully execute GPUMergeFallbackIT, and could see in the logs that we fall back to a memory-mapped file.

try (FileChannel fc = FileChannel.open(dataFile, Set.of(READ))) {
// TODO: switch to FileChannel.map(MapMode, long, long, Arena) once the ES entitlement system supports it.
// The Arena-based overload currently throws UnsupportedOperationException because the entitlement proxy
// wraps FileChannel without overriding the Java 21 method. The MappedByteBuffer is kept in the holder
Contributor


This is very strange. Entitlements do not work like that; they do not wrap things. The change to the policy is absolutely correct, but this change should not be needed. Have you tried without this change (but with the policy)?

Contributor


I've double checked; we actually do not cover map at all, so this change is not relevant entitlement-wise. The policy change also should not be needed, but that's a bug and it should be, so I would leave that change in place. If FileChannel.map throws UnsupportedOperationException, it is for another reason. Are you sure you are executing these tests with runtime >= 22?

Contributor


Reading https://docs.oracle.com/en/java/javase//24/docs/api/java.base/java/nio/channels/FileChannel.html#map(java.nio.channels.FileChannel.MapMode,long,long,java.lang.foreign.Arena)

  • the API is JDK 22+
  • the default implementation throws UnsupportedOperationException.
    I looked at the JDK code, and this is kind of misleading: the only implementation of FileChannel is sun.nio.ch.FileChannelImpl. But with a caveat: a custom FileSystemProvider implementation can return null from newFileChannel(). I suspect we might have some test filesystem here at play (maybe just in tests, maybe in production code too?)

That said, I'm happy to keep the ByteBuffer variant of map, but the comment is bogus so I'd remove it.

Contributor


I think I found it; it's probably lucene and wrapping/unwrapping again. FileChannel provides a default implementation of map with the Arena that simply throws UnsupportedOperationException. FileChannel is abstract, and its only implementation (sun.nio.ch.FileChannelImpl) overrides it. BUT any FileChannel subclass that doesn't explicitly override the Arena variant will throw.
Lucene's FilterFileChannel overrides only the 3-arg map variant. So if any wrapped FileChannel based on FilterFileChannel gets called, you get UnsupportedOperationException.
Lucene "knows" this (https://github.com/apache/lucene/blob/ba7f659b1d81dabbcb62bc3fff763f03d601c5fb/lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java#L326) and unwraps before; the proper fix would be to do the same (Unwrappable.unwrapAll).
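
The override gap described in this comment can be modelled without the JDK classes. The sketch below uses hypothetical stand-ins (`SketchChannel`, `ConcreteChannel`, `ForwardingChannel`) rather than `FileChannel`, `FileChannelImpl` and `FilterFileChannel`, and it unwraps the channel object directly for brevity, whereas the suggestion above is to unwrap before the channel is obtained.

```java
// Stand-in for an abstract channel type with an older and a newer map overload.
// These classes are hypothetical and only model the override pattern.
abstract class SketchChannel {
    abstract String map(long position, long size);

    // Newer overload: the base implementation throws unless a subclass overrides it.
    String map(long position, long size, String arena) {
        throw new UnsupportedOperationException("arena-based map not supported");
    }
}

// Plays the role of the concrete implementation, which overrides both overloads.
class ConcreteChannel extends SketchChannel {
    @Override
    String map(long position, long size) {
        return "buffer[" + position + "," + size + "]";
    }

    @Override
    String map(long position, long size, String arena) {
        return "segment[" + position + "," + size + "]@" + arena;
    }
}

// Plays the role of a delegating wrapper that forwards only the older overload.
class ForwardingChannel extends SketchChannel {
    final SketchChannel delegate;

    ForwardingChannel(SketchChannel delegate) {
        this.delegate = delegate;
    }

    @Override
    String map(long position, long size) {
        return delegate.map(position, size);
    }
    // No override of the arena overload, so the throwing base version is inherited.
}

class ArenaOverloadSketch {
    // Strip every forwarding layer to reach the concrete channel.
    static SketchChannel unwrapAll(SketchChannel ch) {
        while (ch instanceof ForwardingChannel) {
            ch = ((ForwardingChannel) ch).delegate;
        }
        return ch;
    }

    public static void main(String[] args) {
        SketchChannel wrapped = new ForwardingChannel(new ConcreteChannel());
        System.out.println(wrapped.map(0, 16));
        try {
            wrapped.map(0, 16, "arena");
        } catch (UnsupportedOperationException e) {
            System.out.println("wrapped channel threw UnsupportedOperationException");
        }
        System.out.println(unwrapAll(wrapped).map(0, 16, "arena"));
    }
}
```

The wrapper works for the older overload and fails only for the newer one, which matches the symptom of a workaround that succeeds with the 3-arg `map`.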

Contributor


Well, the proper fix would be for FilterFileChannel to override the Arena overload too, but...

Contributor


(I think I've beaten AI this time :D )

Contributor Author


Thanks for the pointers, you have indeed beaten AI :)

I've used your suggestion: Path unwrappedPath = Unwrappable.unwrapAll(dataFile); and everything works now!!!

Unwrap the data file path with Unwrappable.unwrapAll
before opening FileChannel so test-only filesystem
layers are stripped and the real FileChannelImpl is
used. This replaces the MappedByteBuffer workaround
with deterministic Arena-based memory mapping. Also
use skipBytes instead of reading into a throwaway
buffer when skipping scalar quantization correction
bytes during int8_hnsw merge.
@mayya-sharipova force-pushed the fix/gpu-merge-unwrap-filter-directory branch from 0bfc154 to 4c1a806 March 6, 2026 20:55
@mayya-sharipova
Contributor Author

mayya-sharipova commented Mar 6, 2026

@ldematte I've addressed your last comments. With all these unwrappings, finally everything works as expected. Please continue to review.

I've also created a Lucene PR to add 4-arg map override to FilterFileChannel

elasticsearchmachine and others added 3 commits March 6, 2026 21:03
Use CuVSGPUSupport.instance().isSupported() instead of
GPUSupport.isSupported() since isSupported() is an instance method.
Contributor

@ldematte ldematte left a comment


Great work!! LGTM

Contributor

@ChrisHegarty ChrisHegarty left a comment


LGTM

@mayya-sharipova added the auto-merge-without-approval and auto-backport labels Mar 9, 2026
@mayya-sharipova
Contributor Author

@elasticmachine run Elasticsearch Serverless Checks

@elasticsearchmachine elasticsearchmachine merged commit cd1cb4b into elastic:main Mar 10, 2026
37 checks passed
@mayya-sharipova mayya-sharipova deleted the fix/gpu-merge-unwrap-filter-directory branch March 10, 2026 02:49
@elasticsearchmachine
Collaborator

💚 Backport successful

Status Branch Result
9.3

mayya-sharipova added a commit to mayya-sharipova/elasticsearch that referenced this pull request Mar 10, 2026
…3531)

Relates to elastic#141872
elasticsearchmachine pushed a commit that referenced this pull request Mar 10, 2026
) (#143924)

* Fix GPU merge ClassCastException with wrapped directories (#143531)

Relates to #141872

* Fix compilation error: update CuVSGPUSupport to GPUSupport in GPUMergeFallbackIT
ChrisHegarty added a commit that referenced this pull request Mar 26, 2026
…143563)

Elasticsearch wraps directories in multiple FilterDirectory layers in production (e.g. StoreDirectory -> ByteSizeCachingDirectory -> MMapDirectory), but unit tests typically pass bare directory instances. This mismatch allowed the ClassCastException fixed in #143531 to go undetected by unit tests.

This PR adds ESTestCase.maybeWrapDirectoryInFilterDirectory, a randomized test utility that wraps a Directory in 0-3 FilterDirectory layers. It can be used by any test where production code receives a Directory that may be wrapped. We can expand on it later with specific ES wrapper types if necessary, but a plain FilterDirectory is enough to catch most of these cases.

MemorySegmentUtilsTests is updated to use this utility in all tests that pass a Directory to getContiguousMemorySegment / getContiguousPackedMemorySegment. New tests are also added for getContiguousPackedMemorySegment, which previously had no direct test coverage.
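
A randomized wrapping helper of the kind described here might look like the following. This is a sketch under stated assumptions, not the utility added to ESTestCase: `SketchDir`, `LeafDir`, `WrappingDir` and `RandomWrapSketch` are hypothetical stand-ins for the Lucene directory types, and a seeded `java.util.Random` replaces the test framework's randomness.

```java
import java.util.Random;

// Hypothetical stand-ins for a directory, a concrete leaf, and a filter wrapper.
abstract class SketchDir {}

class LeafDir extends SketchDir {}

class WrappingDir extends SketchDir {
    final SketchDir in;

    WrappingDir(SketchDir in) {
        this.in = in;
    }
}

class RandomWrapSketch {
    // Wrap the given directory in 0 to 3 wrapper layers, chosen at random.
    static SketchDir maybeWrap(SketchDir dir, Random random) {
        int layers = random.nextInt(4); // 0, 1, 2 or 3
        for (int i = 0; i < layers; i++) {
            dir = new WrappingDir(dir);
        }
        return dir;
    }

    // Count how many wrapper layers surround the innermost directory.
    static int depth(SketchDir dir) {
        int depth = 0;
        while (dir instanceof WrappingDir) {
            dir = ((WrappingDir) dir).in;
            depth++;
        }
        return depth;
    }

    // Strip all wrapper layers, as production code under test is expected to do.
    static SketchDir unwrap(SketchDir dir) {
        while (dir instanceof WrappingDir) {
            dir = ((WrappingDir) dir).in;
        }
        return dir;
    }

    public static void main(String[] args) {
        SketchDir leaf = new LeafDir();
        SketchDir wrapped = maybeWrap(leaf, new Random(42L));
        System.out.println("layers=" + depth(wrapped) + ", unwraps to leaf=" + (unwrap(wrapped) == leaf));
    }
}
```

Because zero layers is a possible outcome, a test using such a helper exercises both the bare and the wrapped case across runs.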

relates #143531
seanzatzdev pushed a commit to seanzatzdev/elasticsearch that referenced this pull request Mar 26, 2026
…lastic#143563)

relates elastic#143531
seanzatzdev pushed a commit to seanzatzdev/elasticsearch that referenced this pull request Mar 27, 2026
…lastic#143563)

relates elastic#143531
mamazzol pushed a commit to mamazzol/elasticsearch that referenced this pull request Mar 30, 2026
…lastic#143563)

relates elastic#143531

Labels

auto-backport - Automatically create backport pull requests when merged
auto-merge-without-approval - Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!)
>bug
:Search Relevance/Vectors - Vector search
Team:Search Relevance - Meta label for the Search Relevance team in Elasticsearch
test-gpu - Run tests using a GPU
v9.3.2
v9.4.0
