Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bytes offset bug and duplicate readers and add uTs for derived source #2494

Merged
merged 3 commits into from
Feb 6, 2025

Conversation

jmazanec15
Copy link
Member

@jmazanec15 jmazanec15 commented Feb 5, 2025

Description

Fixes a bug in the derived source writer where we are reading the entire bytes array from the bytes ref instead of just the offset+length.

This was the error:

[2025-02-04T23:43:54,004][WARN ][o.o.i.e.Engine           ] [test] [target_index][0] failed engine [already closed by tragic event on the index writer]
org.opensearch.core.compress.NotXContentException: Compressor detection can only be called on some xcontent bytes or compressed xcontent bytes
	at org.opensearch.core.compress.CompressorRegistry.compressor(CompressorRegistry.java:75) ~[opensearch-core-2.19.0.jar:2.19.0]
	at org.opensearch.common.xcontent.XContentHelper.convertToMap(XContentHelper.java:166) ~[opensearch-2.19.0.jar:2.19.0]
	at org.opensearch.knn.index.codec.KNN9120Codec.DerivedSourceStoredFieldsWriter.writeField(DerivedSourceStoredFieldsWriter.java:81) ~[?:?]
	at org.apache.lucene.index.StoredFieldsConsumer.writeField(StoredFieldsConsumer.java:80) ~[lucene-core-9.12.1.jar:9.12.1 7a97a05a239d6fb9f1f347aa09bfa52e875be092 - 2024-12-09 16:47:48]
	at org.apache.lucene.index.IndexingChain.processField(IndexingChain.java:754) ~[lucene-core-9.12.1.jar:9.12.1 7a97a05a239d6fb9f1f347aa09bfa52e875be092 - 2024-12-09 16:47:48]
	at org.apache.lucene.index.IndexingChain.processDocument(IndexingChain.java:609) ~[lucene-core-9.12.1.jar:9.12.1 7a97a05a239d6fb9f1f347aa09bfa52e875be092 - 2024-12-09 16:47:48]
	at org.apache.lucene.index.DocumentsWriterPerThread.updateDocuments(DocumentsWriterPerThread.java:263) ~[lucene-core-9.12.1.jar:9.12.1 7a97a05a239d6fb9f1f347aa09bfa52e875be092 - 2024-12-09 16:47:48]
	at org.apache.lucene.index.DocumentsWriter.updateDocuments(DocumentsWriter.java:425) ~[lucene-core-9.12.1.jar:9.12.1 7a97a05a239d6fb9f1f347aa09bfa52e875be092 - 2024-12-09 16:47:48]
	at org.apache.lucene.index.IndexWriter.updateDocuments(IndexWriter.java:1562) ~[lucene-core-9.12.1.jar:9.12.1 7a97a05a239d6fb9f1f347aa09bfa52e875be092 - 2024-12-09 16:47:48]
	at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1847) ~[lucene-core-9.12.1.jar:9.12.1 7a97a05a239d6fb9f1f347aa09bfa52e875be092 - 2024-12-09 16:47:48]
	at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1487) ~[lucene-core-9.12.1.jar:9.12.1 7a97a05a239d6fb9f1f347aa09bfa52e875be092 - 2024-12-09 16:47:48]
	at org.opensearch.index.engine.InternalEngine.addDocs(InternalEngine.java:1295) ~[opensearch-2.19.0.jar:2.19.0]
	at org.opensearch.index.engine.InternalEngine.indexIntoLucene(InternalEngine.java:1231) ~[opensearch-2.19.0.jar:2.19.0]
	at org.opensearch.index.engine.InternalEngine.index(InternalEngine.java:1012) ~[opensearch-2.19.0.jar:2.19.0]
	at org.opensearch.index.shard.IndexShard.index(IndexShard.java:1216) ~[opensearch-2.19.0.jar:2.19.0]
	at org.opensearch.index.shard.IndexShard.applyIndexOperation(IndexShard.java:1161) ~[opensearch-2.19.0.jar:2.19.0]
	at org.opensearch.index.shard.IndexShard.applyIndexOperationOnPrimary(IndexShard.java:1052) ~[opensearch-2.19.0.jar:2.19.0]
	at org.opensearch.action.bulk.TransportShardBulkAction.executeBulkItemRequest(TransportShardBulkAction.java:625) ~[opensearch-2.19.0.jar:2.19.0]
	at org.opensearch.action.bulk.TransportShardBulkAction$2.doRun(TransportShardBulkAction.java:471) ~[opensearch-2.19.0.jar:2.19.0]
	at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) ~[opensearch-2.19.0.jar:2.19.0]
	at org.opensearch.action.bulk.TransportShardBulkAction.performOnPrimary(TransportShardBulkAction.java:535) ~[opensearch-2.19.0.jar:2.19.0]
	at org.opensearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnPrimary(TransportShardBulkAction.java:416) ~[opensearch-2.19.0.jar:2.19.0]
	at org.opensearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnPrimary(TransportShardBulkAction.java:125) ~[opensearch-2.19.0.jar:2.19.0]
	at org.opensearch.action.support.replication.TransportWriteAction$1.doRun(TransportWriteAction.java:275) ~[opensearch-2.19.0.jar:2.19.0]
	at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:1014) ~[opensearch-2.19.0.jar:2.19.0]
	at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) ~[opensearch-2.19.0.jar:2.19.0]
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) [?:?]
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) [?:?]
	at java.base/java.lang.Thread.run(Thread.java:1583) [?:?]

In order to hit this error, I was using OpenSearch Benchmarks. However, when I dont use OpenSearch benchmarks (i.e. curl or plugin iTs), it seems to work fine. But, with OpenSearch benchmarks locally, I verified the fix.

Also, fixes readers to only open one (extra) per segment

Along with that, touches up the ParentChildHelper (no prod impact) and also adds some unit tests.

Check List

  • Commits are signed per the DCO using --signoff.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Copy link
Collaborator

@shatejas shatejas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall, let me know if the UT needs to be added

verify(derivedSourceVectorInjector, times(0)).injectVectors(anyInt(), any());
verify(delegate, times(1)).binaryField(any(), any());

// When field is not _source, then do call the injector
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: When the field is _source, then do call the injector

FieldInfo fieldInfo = KNNCodecTestUtil.FieldInfoBuilder.builder(FIELD_NAME).build();
try (MockedStatic<KNNVectorValuesFactory> mockedKnnVectorValues = Mockito.mockStatic(KNNVectorValuesFactory.class)) {
mockedKnnVectorValues.when(() -> KNNVectorValuesFactory.getVectorValues(fieldInfo, null, null))
.thenReturn(new KNNVectorValues<float[]>(new KNNVectorValuesIterator() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: You can probably use TestVectorValues

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to use it - but seems that it might be easier to just mock the iterator. Ill update.

@@ -79,7 +79,7 @@ public void writeField(FieldInfo fieldInfo, BytesRef bytesRef) throws IOExceptio
// Reference:
// https://github.com/opensearch-project/OpenSearch/blob/2.18.0/server/src/main/java/org/opensearch/index/mapper/SourceFieldMapper.java#L322
Tuple<? extends MediaType, Map<String, Object>> mapTuple = XContentHelper.convertToMap(
BytesReference.fromByteBuffer(ByteBuffer.wrap(bytesRef.bytes)),
BytesReference.fromByteBuffer(ByteBuffer.wrap(bytesRef.bytes, bytesRef.offset, bytesRef.length)),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add a UT for this case?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense - added and validated fix before and after

Vikasht34
Vikasht34 previously approved these changes Feb 5, 2025
Copy link
Collaborator

@Vikasht34 Vikasht34 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Fixes a bug in the derived source writer where we are reading the entire
bytes array from the bytes ref instead of just the offset+length.

Along with that, touches up the ParentChildHelper (no prod impact) and
also adds some unit tests.

Signed-off-by: John Mazanec <[email protected]>
@jmazanec15 jmazanec15 changed the title Fix bytes bug and add uTs for derived source Fix bytes offset bug and duplicate readers and add uTs for derived source Feb 6, 2025
Signed-off-by: John Mazanec <[email protected]>
@@ -23,7 +24,8 @@
*/
@RequiredArgsConstructor
@Getter
public class DerivedSourceReaders implements Closeable {
@Log4j2
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit-pick] you are not using Log4j2 in the code

@jmazanec15 jmazanec15 merged commit ab33538 into opensearch-project:2.19 Feb 6, 2025
99 of 103 checks passed
@opensearch-trigger-bot
Copy link
Contributor

The backport to main failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-main main
# Navigate to the new working tree
cd .worktrees/backport-main
# Create a new branch
git switch --create backport/backport-2494-to-main
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 ab33538a23db95be4c1f41f41c3a4a9d8f848a3e
# Push it to GitHub
git push --set-upstream origin backport/backport-2494-to-main
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-main

Then, create a pull request where the base branch is main and the compare/head branch is backport/backport-2494-to-main.

opensearch-trigger-bot bot pushed a commit that referenced this pull request Feb 6, 2025
…urce (#2494)

Fixes a bug in the derived source writer where we are reading the entire
bytes array from the bytes ref instead of just the offset+length.

Also reuses readers to prevent memory leak

Along with that, touches up the ParentChildHelper (no prod impact) and
also adds some unit tests.

Signed-off-by: John Mazanec <[email protected]>
(cherry picked from commit ab33538)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport main backport 2.x Bug Fixes Changes to a system or product designed to handle a programming bug/glitch skip-changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants