Skip to content

Reduce memory usage in writer with more memory efficient output buffer implementation #24913

Merged
sdruzkin merged 1 commit intoprestodb:masterfrom
chenyangfb:orc_output_buffer
Jul 5, 2025
Merged

Reduce memory usage in writer with more memory efficient output buffer implementation #24913
sdruzkin merged 1 commit intoprestodb:masterfrom
chenyangfb:orc_output_buffer

Conversation

@chenyangfb
Copy link
Contributor

@chenyangfb chenyangfb commented Apr 14, 2025

Description

Currently ChunkedSliceOutput is used for storing compressed output in writer. It managed list of buffers with size of power of 2 (e.g. 8k, 16k, 32k), and reuse buffers after flushing. It could leads to extra memory usage and OOM due to 1) mismatch in compressed output size and buffer size, 2) reusing buffers and not freeing buffers leads to extra memory usage by design.

Common scenario which leads to OOM includes

  1. large number of streams with small amount of data (100k stream with 1k compressed bytes), each using minimal buffer size (e.g. 8k)
  2. Each stream is wasting half of largest buffer (e.g. 8M out of 16M buffer)
  3. Writer memory usage is high even after flushing (Reduce memory usage in writer by freeing unused buffers #23724 support freeing unused buffer in chunk supplier during reset)

This PR introduce OrcLazyChunkedOutputBuffer which focus on avoiding used memory.

  1. Create buffer size based on the size of compressed output, this avoid the issue 1) and 2) mentioned above
  2. lazy initialization in OrcLazyChunkedOutputBuffer and OrcOutputBuffer
  3. Reset all the closed buffers, only keep the active buffer.

This behavior is controlled by lazyOutputBuffer in OrcWriterOptions, and it's disabled by default.

Impact

Reduce memory usage in writer.

Test Plan

Tested with Spark workload with high memory usage.
~10% improvement in run time and resource usage (memory reservation time), reduced GC time.
Tested with general Spark workload
No change in cpu time, slight reduction in run time and GC time.

== RELEASE NOTES ==
General Changes
* Improve efficiency of output buffer implementation to reduce memory usage in writer.

@chenyangfb chenyangfb force-pushed the orc_output_buffer branch 4 times, most recently from b88e026 to 04a0df7 Compare April 14, 2025 17:13
@chenyangfb chenyangfb changed the title Add OrcLazyChunkedOutputBuffer which is more memory efficient Reduce memory usage in writer with more memory efficient output buffer implementation Apr 14, 2025
@chenyangfb chenyangfb marked this pull request as ready for review April 14, 2025 18:18
@chenyangfb chenyangfb requested review from a team and sdruzkin as code owners April 14, 2025 18:18
@chenyangfb chenyangfb requested a review from presto-oss April 14, 2025 18:18
@steveburnett
Copy link
Contributor

Thanks for the release note! Formatting and rephrasing suggestions to help follow the Release Notes Guidelines:

Release Notes

General Changes
* Improve efficiency of output buffer implementation to reduce memory usage in writer. 

@chenyangfb
Copy link
Contributor Author

Thanks for the release note! Formatting and rephrasing suggestions to help follow the Release Notes Guidelines:

Release Notes

General Changes
* Improve efficiency of output buffer implementation to reduce memory usage in writer. 

Thanks. updated

@steveburnett
Copy link
Contributor

Please format the release note with a row of three ` above and below, like this:

== RELEASE NOTES ==

General Changes
* Improve efficiency of output buffer implementation to reduce memory usage in writer.

@chenyangfb chenyangfb force-pushed the orc_output_buffer branch 2 times, most recently from 5eeb72e to 48e410b Compare May 5, 2025 21:56
bufferPool.size(),
bufferPool.stream().mapToInt(b -> b.length).sum());
bufferPool.clear();
System.setProperty("RESET_OUTPUT_BUFFER", "RESET_OUTPUT_BUFFER");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this do?

@sdruzkin
Copy link
Collaborator

sdruzkin commented Jun 5, 2025

Can you please rebase it? Also would be nice to run this change in Vader to get 1-2k samples. I'll do a second round after that.

@chenyangfb chenyangfb force-pushed the orc_output_buffer branch from 48e410b to 8323da4 Compare June 6, 2025 16:08
@chenyangfb
Copy link
Contributor Author

Rebased. What is Vader? How do I run this change in Vader?

Can you please rebase it? Also would be nice to run this change in Vader to get 1-2k samples. I'll do a second round after that.

@chenyangfb
Copy link
Contributor Author

Discussed with Sergii offline, ran Validation Service (Vader), results looks good, ~2k successful samples without failures https://fburl.com/scuba/dwrf_reader_checksum/dzgyy5xi

@sdruzkin
Copy link
Collaborator

sdruzkin commented Jun 26, 2025

LGTM, please do the minor change I requested and we will land it.

@chenyangfb chenyangfb force-pushed the orc_output_buffer branch from 7bc6321 to 5252922 Compare July 3, 2025 21:37
@chenyangfb
Copy link
Contributor Author

rebased

@Override
public void writeHeader(int value)
{
buffer[bufferPosition] = (byte) (value & 0x00_00FF);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please change to a more concise:

buffer[bufferPosition++] = (byte) (value & 0x00_00FF);
buffer[bufferPosition++] = (byte) ((value & 0x00_FF00) >> 8);
buffer[bufferPosition++] = (byte) ((value & 0xFF_0000) >> 16);

@sdruzkin sdruzkin merged commit 75243fd into prestodb:master Jul 5, 2025
108 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants