HADOOP-18391. Improvements in VectoredReadUtils#readVectored() for direct buffers #4787
…rect buffers part of HADOOP-18103.
steveloughran
left a comment
this will scale better. regarding the issue about handling 0 byte ranges, the key thing is "no need to allocate a big buffer". we get that wrong with our block output stream, where we could postpone memory allocation until the first byte is written.
      ByteBuffer buffer) throws IOException {
    int readBytes = 0;
    int offset = 0;
    byte[] tmp = new byte[TMP_BUFFER_MAX_SIZE];
size this as min(length, TMP_BUFFER_MAX_SIZE) for more efficiency on small reads
that is why I haven't added comments in the code yet
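The sizing suggestion above can be sketched roughly as follows. This is only an illustration, not the code in the patch: the class name `TmpBufferSizing`, the helper `tmpBufferSize`, and the 64 KB value given to `TMP_BUFFER_MAX_SIZE` are assumptions made for the example.

```java
/** Illustrative sketch only; names and the cap value are assumptions. */
final class TmpBufferSizing {

  /** Assumed cap for the example; the PR does not state the real value here. */
  static final int TMP_BUFFER_MAX_SIZE = 64 * 1024;

  private TmpBufferSizing() {
  }

  /**
   * Size of the temporary heap buffer for a range of the given length:
   * never larger than the range itself, never larger than the cap.
   */
  public static int tmpBufferSize(int length) {
    if (length < 0) {
      throw new IllegalArgumentException("negative length: " + length);
    }
    return Math.min(length, TMP_BUFFER_MAX_SIZE);
  }
}
```

With this sizing a zero-byte range needs no buffer at all, and a 100-byte range gets a 100-byte array instead of one sized to the cap.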
steveloughran
left a comment
looks good; one minor change on the new function interface, plus the yetus comments...
...oject/hadoop-common/src/main/java/org/apache/hadoop/util/functional/Function4RaisingIOE.java
🎊 +1 overall
This message was automatically generated.
steveloughran
left a comment
+1
…rect buffers (#4787) part of HADOOP-18103. Contributed By: Mukund Thakur
part of HADOOP-18103.
Description of PR
VectoredReadUtils.readInDirectBuffer should allocate a temporary buffer capped at a maximum size, e.g. 4 MB, then do repeated reads and copies; this ensures that you don't OOM when many threads are doing ranged requests.
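A minimal sketch of the "bounded temporary buffer, repeated reads and copies" idea is shown below. It is not the implementation from this PR: the `PositionedReader` interface is a hypothetical stand-in for whatever positioned read call the stream provides, and the class name and the 64 KB cap are assumptions chosen for the example.

```java
import java.io.IOException;
import java.nio.ByteBuffer;

/** Illustrative sketch only; not the code from the patch. */
final class ChunkedDirectRead {

  /** Hypothetical stand-in for a positioned "read fully" call. */
  @FunctionalInterface
  public interface PositionedReader {
    void readFully(long position, byte[] dest, int offset, int length)
        throws IOException;
  }

  /** Assumed cap on the temporary heap buffer. */
  static final int TMP_BUFFER_MAX_SIZE = 64 * 1024;

  private ChunkedDirectRead() {
  }

  /**
   * Fill {@code target} with {@code length} bytes starting at
   * {@code position}, going through a small reusable heap array.
   */
  public static void readInDirectBuffer(PositionedReader reader,
      long position, int length, ByteBuffer target) throws IOException {
    if (length == 0) {
      return; // zero-byte range: nothing to allocate, nothing to read
    }
    // One temporary array, never larger than the range or the cap.
    byte[] tmp = new byte[Math.min(length, TMP_BUFFER_MAX_SIZE)];
    int readBytes = 0;
    while (readBytes < length) {
      int chunk = Math.min(tmp.length, length - readBytes);
      reader.readFully(position + readBytes, tmp, 0, chunk);
      target.put(tmp, 0, chunk); // copy this chunk into the direct buffer
      readBytes += chunk;
    }
  }
}
```

Heap usage per call is bounded by the cap regardless of the range length, which is what keeps many concurrent ranged reads from exhausting memory.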
How was this patch tested?
Ran the existing tests. Also added a UT for zero byte file ranges.