Reduce memory usage in writer by freeing unused buffers #23724
sdruzkin merged 1 commit into prestodb:master
Conversation
Memory allocation is expensive and we added the pool for this reason. Please add performance benchmark results for a case comparing writing a 500MB file with 5-7 columns with the new option enabled and disabled.
Thanks for the release note! Just a nit or two suggested.
```java
bufferPool.clear();
System.setProperty("RESET_OUTPUT_BUFFER", "RESET_OUTPUT_BUFFER");
}
bufferPool.addAll(0, usedBuffers);
```
If you want to save on memory, do you really want to keep the last batch of chunks? I suppose you don't.
```java
usedBuffers.stream().mapToInt(b -> b.length).sum(),
bufferPool.size(),
bufferPool.stream().mapToInt(b -> b.length).sum());
bufferPool.clear();
```
This is a tricky place. If you look at the ChunkSupplier.get method you will see that ChunkSupplier scales up chunks from 256 bytes all the way to 16MB when it needs to produce a new chunk. If you reset the pool and start producing new chunks, you will eventually end up with only big 16MB chunks, which is probably the opposite of your goal.
I don't know what the best strategy is here, because it really depends on the data shapes, but I see a few options (a sketch of the second one follows this list):
- Have a chunk supplier that does not have a buffer and always produces smaller fixed-size chunks. Will perform best in terms of overhead, but will regress for small streams.
- Have a chunk supplier that does not have a buffer and always produces scaled-up chunks, with reset() resetting the current size to the min size. A good middle-ground option.
- Have a chunk supplier that does not have a buffer and always produces scaled-up chunks, with reset() resetting the current size to the min size AND using a smaller max chunk size. Will perform even better in terms of overhead, and won't regress as much as option 1 for small streams.
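A minimal sketch of the second option, assuming a hypothetical pool-free supplier; the class name, fields, and constants below are illustrative, not the actual ChunkSupplier code:

```java
// Pool-free chunk supplier sketch: chunk sizes still scale up geometrically,
// but reset() drops the current size back to the minimum instead of keeping
// previously allocated chunks alive in a pool.
public class ScalingChunkSupplier
{
    private static final int MIN_CHUNK_SIZE = 256;       // scales up from 256 bytes
    private static final int MAX_CHUNK_SIZE = 16 << 20;  // all the way to 16MB

    private int currentSize = MIN_CHUNK_SIZE;

    public byte[] get()
    {
        // Allocate a fresh chunk every time; nothing is pooled, so chunks
        // become garbage-collectable as soon as the caller releases them.
        byte[] chunk = new byte[currentSize];
        currentSize = Math.min(currentSize * 2, MAX_CHUNK_SIZE);
        return chunk;
    }

    public void reset()
    {
        // Restart the scale-up from the minimum so a small batch following a
        // large one does not immediately get 16MB chunks.
        currentSize = MIN_CHUNK_SIZE;
    }
}
```

For the third option, MAX_CHUNK_SIZE would simply be lowered, trading a few more allocations on large streams for a smaller worst-case chunk.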
Saved that user @chenyangfb is from Meta
Discussed with Sergii offline, ran the Validation Service (Vader); results look good, ~3.5k successful samples without failures: https://fburl.com/scuba/dwrf_reader_checksum/crm55ef8
Description
Currently, ChunkedSliceOutput keeps buffers in bufferPool and usedBuffers and never frees them, even on reset(), which leads to extra memory usage and OOMs.
This PR avoids that extra memory usage by freeing unused buffers in the chunk supplier during reset. The behavior is controlled by resetOutputBuffer in OrcWriterOptions and is disabled by default.
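For illustration, enabling the option might look like this; a hedged sketch, since only the resetOutputBuffer option itself comes from this PR and the builder setter name is my assumption:

```java
// Hypothetical usage sketch: the withResetOutputBuffer setter name is an
// assumption, not verified API; resetOutputBuffer is the option this PR adds.
OrcWriterOptions options = OrcWriterOptions.builder()
        .withResetOutputBuffer(true) // free unused pooled buffers during reset()
        .build();
```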
Example ChunkedSliceOutput behaviour before and after the change

Assume the first batch of output is large and uses 3.75M. After writing that output, and before we call reset(), usedBuffers contains a list of buffers with increasing sizes (assuming a 256k min buffer and a 1M max buffer for illustration purposes):
256k, 512k, 1M, 1M, 1M

The second batch of output is smaller and only uses 1.75M. After writing that output, and before we call reset():
usedBuffers has 256k, 512k, 1M
bufferPool has 1M, 1M

Before the change, we keep all 5 buffers and never free them. After the change, we keep the 3 buffers that were used (256k, 512k, 1M) and free the 2 buffers that were unused (1M, 1M).

In other words, the previous behaviour could only scale up the number of buffers; the new behaviour can also scale down the number of buffers based on memory usage.
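A simplified sketch of that scale-down behavior; the class and the reset(boolean) signature below are assumptions for illustration, not the actual ChunkedSliceOutput source:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of reset() with buffer freeing enabled. Field names mirror the
// description above (bufferPool, usedBuffers); everything else is assumed.
class BufferRecyclingOutput
{
    private final List<byte[]> bufferPool = new ArrayList<>();
    private final List<byte[]> usedBuffers = new ArrayList<>();

    void reset(boolean resetOutputBuffer)
    {
        if (resetOutputBuffer) {
            // Drop the pooled-but-unused buffers (the two trailing 1M buffers
            // in the example) so the GC can reclaim them.
            bufferPool.clear();
        }
        // Keep the buffers the last batch actually used; a following batch of
        // similar size can then be written without any new allocations.
        bufferPool.addAll(0, usedBuffers);
        usedBuffers.clear();
    }
}
```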
Impact
Reduces memory usage in the writer.
Test Plan
Tested with a Spark workload.
Example output showing unused buffers getting freed after the change