Fix bytes offset bug and duplicate readers and add uTs for derived source #2494
Looks good overall, let me know if the UT needs to be added
verify(derivedSourceVectorInjector, times(0)).injectVectors(anyInt(), any());
verify(delegate, times(1)).binaryField(any(), any());

// When field is not _source, then do call the injector
nit: When the field is _source, then do call the injector
FieldInfo fieldInfo = KNNCodecTestUtil.FieldInfoBuilder.builder(FIELD_NAME).build();
try (MockedStatic<KNNVectorValuesFactory> mockedKnnVectorValues = Mockito.mockStatic(KNNVectorValuesFactory.class)) {
mockedKnnVectorValues.when(() -> KNNVectorValuesFactory.getVectorValues(fieldInfo, null, null))
.thenReturn(new KNNVectorValues<float[]>(new KNNVectorValuesIterator() {
nit: You can probably use TestVectorValues
I tried to use it, but it seems easier to just mock the iterator. I'll update.
@@ -79,7 +79,7 @@ public void writeField(FieldInfo fieldInfo, BytesRef bytesRef) throws IOExceptio
 // Reference:
 // https://github.com/opensearch-project/OpenSearch/blob/2.18.0/server/src/main/java/org/opensearch/index/mapper/SourceFieldMapper.java#L322
 Tuple<? extends MediaType, Map<String, Object>> mapTuple = XContentHelper.convertToMap(
-BytesReference.fromByteBuffer(ByteBuffer.wrap(bytesRef.bytes)),
+BytesReference.fromByteBuffer(ByteBuffer.wrap(bytesRef.bytes, bytesRef.offset, bytesRef.length)),
Should we add a UT for this case?
Makes sense. Added a UT and validated it before and after the fix.
LGTM
Fixes a bug in the derived source writer where we are reading the entire bytes array from the bytes ref instead of just the offset+length. Along with that, touches up the ParentChildHelper (no prod impact) and also adds some unit tests. Signed-off-by: John Mazanec <[email protected]>
c13bc82 to 4bce385
Signed-off-by: John Mazanec <[email protected]>
23ecfc0 to 91365c9
Signed-off-by: John Mazanec <[email protected]>
@@ -23,7 +24,8 @@
*/
@RequiredArgsConstructor
@Getter
public class DerivedSourceReaders implements Closeable {
@Log4j2
[nit-pick] you are not using Log4j2 in the code
The backport to
To backport manually, run these commands in your terminal:
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-main main
# Navigate to the new working tree
cd .worktrees/backport-main
# Create a new branch
git switch --create backport/backport-2494-to-main
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 ab33538a23db95be4c1f41f41c3a4a9d8f848a3e
# Push it to GitHub
git push --set-upstream origin backport/backport-2494-to-main
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-main
Then, create a pull request where the
…urce (#2494) Fixes a bug in the derived source writer where we are reading the entire bytes array from the bytes ref instead of just the offset+length. Also reuses readers to prevent memory leak Along with that, touches up the ParentChildHelper (no prod impact) and also adds some unit tests. Signed-off-by: John Mazanec <[email protected]> (cherry picked from commit ab33538)
Description
Fixes a bug in the derived source writer where we are reading the entire bytes array from the bytes ref instead of just the offset+length.
This was the error:
To hit this error, I was using OpenSearch Benchmarks. When I don't use OpenSearch Benchmarks (i.e. curl or plugin ITs), it seems to work fine. With OpenSearch Benchmarks running locally, I verified the fix.
Also fixes the readers so that only one (extra) reader is opened per segment.
Along with that, touches up the ParentChildHelper (no prod impact) and also adds some unit tests.
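For reviewers less familiar with the offset issue, here is a minimal sketch of why wrapping the whole backing array is wrong when a bytes ref is only a view into it. It uses plain `java.nio` only. The `Slice` class is a hypothetical stand-in for a bytes ref with `bytes`, `offset` and `length` fields, and the sample JSON payload is made up for illustration. It is not the plugin's actual code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class BytesRefOffsetDemo {

    /** Hypothetical stand-in for a bytes ref: a (offset, length) view into a possibly shared array. */
    static final class Slice {
        final byte[] bytes;
        final int offset;
        final int length;

        Slice(byte[] bytes, int offset, int length) {
            this.bytes = bytes;
            this.offset = offset;
            this.length = length;
        }
    }

    /** Decodes the bytes between the buffer's position and limit as UTF-8. */
    static String decode(ByteBuffer buffer) {
        return StandardCharsets.UTF_8.decode(buffer).toString();
    }

    /** Buggy variant: ignores offset and length, so it reads the entire backing array. */
    static String readWholeArray(Slice ref) {
        return decode(ByteBuffer.wrap(ref.bytes));
    }

    /** Fixed variant: reads only the bytes that belong to this view. */
    static String readSlice(Slice ref) {
        return decode(ByteBuffer.wrap(ref.bytes, ref.offset, ref.length));
    }

    /** Builds a view over the second of two documents packed into one shared array. */
    static Slice sampleSlice() {
        byte[] shared = "{\"a\":1}{\"b\":2}".getBytes(StandardCharsets.UTF_8);
        // Each document is 7 bytes long; the second one starts at offset 7.
        return new Slice(shared, 7, 7);
    }

    public static void main(String[] args) {
        Slice ref = sampleSlice();
        System.out.println(readWholeArray(ref)); // includes the neighbouring document
        System.out.println(readSlice(ref));      // only the document this view points at
    }
}
```

A bug like this only shows up when the backing array holds more than the referenced bytes. Whenever the view covers the whole array, both variants return the same result.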
Check List
--signoff
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.