# Reduce wasted memory in ChunkedSliceOutput #23707
```diff
@@ -53,10 +53,9 @@ public class OrcOutputBuffer
     private static final int INSTANCE_SIZE = ClassLayout.parseClass(OrcOutputBuffer.class).instanceSize();
     private static final int PAGE_HEADER_SIZE = 3; // ORC spec 3 byte header
     private static final int INITIAL_BUFFER_SIZE = 256;
-    private static final int MINIMUM_OUTPUT_BUFFER_CHUNK_SIZE = 4 * 1024;
```
**Collaborator:** I'm confused. You are claiming that the old minimum chunk size was 8 KB, but here in the ORC compression output slice it is 4 KB. Can you please clarify this?

**Author (Contributor):** I added more context in the summary and in the comment on ChunkedSliceOutput (line 374).
```diff
-    private static final int MAXIMUM_OUTPUT_BUFFER_CHUNK_SIZE = 1024 * 1024;

     private final int maxBufferSize;
+    private final int minOutputBufferChunkSize;
+    private final int maxOutputBufferChunkSize;
     private final int minCompressibleSize;

     private final CompressionBufferPool compressionBufferPool;

@@ -86,6 +85,8 @@ public OrcOutputBuffer(ColumnWriterOptions columnWriterOptions, Optional<DwrfDat
         CompressionKind compressionKind = columnWriterOptions.getCompressionKind();
         this.maxBufferSize = compressionKind == CompressionKind.NONE ? maxBufferSize : maxBufferSize - PAGE_HEADER_SIZE;
+        this.minOutputBufferChunkSize = columnWriterOptions.getMinOutputBufferChunkSize();
+        this.maxOutputBufferChunkSize = columnWriterOptions.getMaxOutputBufferChunkSize();
         this.minCompressibleSize = compressionKind.getMinCompressibleSize();

         this.buffer = new byte[INITIAL_BUFFER_SIZE];

@@ -470,7 +471,7 @@ private void flushBufferToOutputStream()
     private void initCompressedOutputStream()
     {
         checkState(compressedOutputStream == null, "compressedOutputStream is already initialized");
-        compressedOutputStream = new ChunkedSliceOutput(MINIMUM_OUTPUT_BUFFER_CHUNK_SIZE, MAXIMUM_OUTPUT_BUFFER_CHUNK_SIZE);
+        compressedOutputStream = new ChunkedSliceOutput(minOutputBufferChunkSize, maxOutputBufferChunkSize);
     }

     private void writeChunkToOutputStream(byte[] chunk, int offset, int length)
```
**Reviewer:** Revert this unnecessary change.

**Author:** The previous behaviour was to double `currentSize` before allocation, which meant that even though the initial size was configured at 4 KB, the actual initial allocation was 8 KB. I think this is a bug; this PR doubles `currentSize` after allocation, so the initial allocation is based on `minOutputBufferChunkSize`.
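To illustrate the before/after-allocation doubling discussed above, here is a minimal, hypothetical sketch (the class and method names are invented for illustration and are not the actual `ChunkedSliceOutput` code, which may differ in detail). It shows why doubling before the first allocation wastes memory: the first chunk comes out at twice the configured minimum.

```java
import java.util.ArrayList;
import java.util.List;

public class ChunkGrowthSketch
{
    // Old behaviour (as described in the comment): double currentSize BEFORE
    // allocating, so the very first chunk is already minChunkSize * 2.
    static List<Integer> growDoublingBefore(int minChunkSize, int maxChunkSize, int chunks)
    {
        List<Integer> sizes = new ArrayList<>();
        int currentSize = minChunkSize;
        for (int i = 0; i < chunks; i++) {
            currentSize = Math.min(currentSize * 2, maxChunkSize);
            sizes.add(currentSize);
        }
        return sizes;
    }

    // New behaviour: allocate at the current size, then double for the next
    // chunk, so the first allocation honours minChunkSize.
    static List<Integer> growDoublingAfter(int minChunkSize, int maxChunkSize, int chunks)
    {
        List<Integer> sizes = new ArrayList<>();
        int currentSize = minChunkSize;
        for (int i = 0; i < chunks; i++) {
            sizes.add(currentSize);
            currentSize = Math.min(currentSize * 2, maxChunkSize);
        }
        return sizes;
    }

    public static void main(String[] args)
    {
        int min = 4 * 1024;     // 4 KB, as in the old MINIMUM_OUTPUT_BUFFER_CHUNK_SIZE
        int max = 1024 * 1024;  // 1 MB, as in the old MAXIMUM_OUTPUT_BUFFER_CHUNK_SIZE
        System.out.println(growDoublingBefore(min, max, 3)); // first chunk is 8192
        System.out.println(growDoublingAfter(min, max, 3));  // first chunk is 4096
    }
}
```

With a 4 KB minimum, the old order allocates 8 KB, 16 KB, 32 KB, ...; the fixed order allocates 4 KB, 8 KB, 16 KB, ..., matching the configured minimum on the first chunk.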