feat(5851): ArrowWriter memory usage #5967
@@ -176,7 +176,16 @@ impl<W: AsyncFileWriter> AsyncArrowWriter<W> {
         self.sync_writer.flushed_row_groups()
     }

-    /// Returns the estimated length in bytes of the current in progress row group
+    /// Estimated memory usage, in bytes, of this `ArrowWriter`
+    ///
+    /// See [ArrowWriter::memory_size] for more information.
+    pub fn memory_size(&self) -> usize {
+        self.sync_writer.memory_size()
+    }
+
+    /// Anticipated encoded size of the in progress row group.
+    ///
+    /// See [ArrowWriter::memory_size] for more information.
     pub fn in_progress_size(&self) -> usize {
         self.sync_writer.in_progress_size()
     }

Author: I also added these wrappers for symmetry with the other AsyncWriter methods.
@@ -419,16 +428,30 @@ mod tests {
         let initial_size = writer.in_progress_size();
         assert!(initial_size > 0);
         assert_eq!(writer.in_progress_rows(), batch.num_rows());
+        let initial_memory = writer.memory_size();
+        // memory estimate is larger than estimated encoded size
+        assert!(
+            initial_size <= initial_memory,
+            "{initial_size} <= {initial_memory}"
+        );

         // updated on second write
         writer.write(&batch).await.unwrap();
         assert!(writer.in_progress_size() > initial_size);
         assert_eq!(writer.in_progress_rows(), batch.num_rows() * 2);
+        assert!(writer.memory_size() > initial_memory);
+        assert!(
+            writer.in_progress_size() <= writer.memory_size(),
+            "in_progress_size {} <= memory_size {}",
+            writer.in_progress_size(),
+            writer.memory_size()
+        );

         // in progress tracking is cleared, but the overall data written is updated
         let pre_flush_bytes_written = writer.bytes_written();
         writer.flush().await.unwrap();
         assert_eq!(writer.in_progress_size(), 0);
+        assert_eq!(writer.memory_size(), 0);
         assert_eq!(writer.in_progress_rows(), 0);
         assert!(writer.bytes_written() > pre_flush_bytes_written);
@@ -93,10 +93,17 @@ pub trait ColumnValueEncoder {
     /// Returns true if this encoder has a dictionary page
     fn has_dictionary(&self) -> bool;

-    /// Returns an estimate of the dictionary page size in bytes, or `None` if no dictionary
+    /// Returns the estimated total memory usage of the encoder
+    ///
+    fn estimated_memory_size(&self) -> usize;
+
+    /// Returns an estimate of the encoded size of dictionary page size in bytes, or `None` if no dictionary
     fn estimated_dict_page_size(&self) -> Option<usize>;

-    /// Returns an estimate of the data page size in bytes
+    /// Returns an estimate of the encoded data page size in bytes
+    ///
+    /// This should include:
+    /// <already_written_encoded_byte_size> + <estimated_encoded_size_of_unflushed_bytes>
     fn estimated_data_page_size(&self) -> usize;

     /// Flush the dictionary page for this column chunk if any. Any subsequent calls to

Reviewer: likewise, while this trait is marked
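The formula in the `estimated_data_page_size` doc comment above can be sketched as plain arithmetic. This is a toy model only; `ToyEncoder` and its fields are hypothetical names, not types from the parquet crate:

```rust
/// Toy model of the accounting in the `estimated_data_page_size` doc comment:
/// <already_written_encoded_byte_size> + <estimated_encoded_size_of_unflushed_bytes>
/// All names here are illustrative, not parquet crate types.
struct ToyEncoder {
    /// bytes of pages already encoded and buffered for this column chunk
    already_written_encoded: usize,
    /// values accepted but not yet encoded into a page
    unflushed_values: usize,
    /// rough estimate of encoded bytes per value once flushed
    est_bytes_per_value: usize,
}

impl ToyEncoder {
    fn estimated_data_page_size(&self) -> usize {
        // already-encoded bytes plus an estimate for the unflushed tail
        self.already_written_encoded + self.unflushed_values * self.est_bytes_per_value
    }
}

fn main() {
    let enc = ToyEncoder {
        already_written_encoded: 4096,
        unflushed_values: 100,
        est_bytes_per_value: 2,
    };
    // 4096 + 100 * 2 = 4296
    assert_eq!(enc.estimated_data_page_size(), 4296);
    println!("{}", enc.estimated_data_page_size());
}
```

The point of the two-term formula is that the estimate stays meaningful both right after a page flush (second term near zero) and mid-batch (second term dominating).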
@@ -227,6 +234,24 @@ impl<T: DataType> ColumnValueEncoder for ColumnValueEncoderImpl<T> {
         self.dict_encoder.is_some()
     }

+    fn estimated_memory_size(&self) -> usize {
+        let encoder_size = self.encoder.estimated_memory_size();
+
+        let dict_encoder_size = self
+            .dict_encoder
+            .as_ref()
+            .map(|encoder| encoder.estimated_memory_size())
+            .unwrap_or_default();
+
+        let bloom_filter_size = self
+            .bloom_filter
+            .as_ref()
+            .map(|bf| bf.estimated_memory_size())
+            .unwrap_or_default();
+
+        encoder_size + dict_encoder_size + bloom_filter_size
+    }
+
     fn estimated_dict_page_size(&self) -> Option<usize> {
         Some(self.dict_encoder.as_ref()?.dict_encoded_size())
     }
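The summing pattern used by `estimated_memory_size` above, shown standalone (a sketch with made-up names, not the real encoder types): the dictionary encoder and bloom filter are optional, so their sizes simply contribute zero when absent.

```rust
// Standalone sketch of the summing pattern in `estimated_memory_size`:
// optional components default to 0 via `unwrap_or_default` when `None`.
// `total_memory` is a hypothetical helper, not a parquet crate function.
fn total_memory(encoder: usize, dict: Option<usize>, bloom: Option<usize>) -> usize {
    encoder + dict.unwrap_or_default() + bloom.unwrap_or_default()
}

fn main() {
    // with a dictionary but no bloom filter: 100 + 40 + 0
    assert_eq!(total_memory(100, Some(40), None), 140);
    // plain encoding only
    assert_eq!(total_memory(100, None, None), 100);
    println!("ok");
}
```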
Reviewer: It took me a while to understand that this calculation actually also includes the in_progress_size too. What do you think about making this more explicit, and then changing the calling code to only include the estimated memory size?
Author: The in_progress_size includes flushed_bytes + unflushed_bytes. The memory_size includes flushed_bytes + unflushed_bytes + processing_size. At first glance it does look like we could do memory_size = in_progress_size + processing_size, but what the calculation actually is:

    in_progress_size = flushed_bytes + unflushed_bytes_anticipated_size_after_flush
    memory_size = flushed_bytes + unflushed_bytes_current_mem_size + processing_size

Per our approach to memory limiting, unflushed bytes (in the buffer) should already be encoded. However, I believe that's not the case for the indices on the dict encoder. As such, the accounting in this PR specifically considers unflushed bytes separately, and pushes the new memory_size() interface down to where we can delineate the DictEncoder vs the other encoders. I added a commit to help explain. Please let me know if I misunderstood. 🙏🏼
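The distinction drawn in this comment can be made concrete with a toy model. This is only a sketch; the struct and field names follow the comment's terminology, and the numbers are illustrative, not real measurements:

```rust
/// Toy model of the two estimates discussed above. Names mirror the
/// comment's terminology; nothing here is an actual parquet crate type.
struct ColumnState {
    flushed_bytes: usize,
    /// anticipated encoded size of the unflushed values *after* flush
    unflushed_anticipated_size_after_flush: usize,
    /// current in-memory size of the unflushed values (e.g. dict indices)
    unflushed_current_mem_size: usize,
    /// scratch/processing overhead (encoders, bloom filter, etc.)
    processing_size: usize,
}

impl ColumnState {
    fn in_progress_size(&self) -> usize {
        self.flushed_bytes + self.unflushed_anticipated_size_after_flush
    }
    fn memory_size(&self) -> usize {
        self.flushed_bytes + self.unflushed_current_mem_size + self.processing_size
    }
}

fn main() {
    let s = ColumnState {
        flushed_bytes: 1000,
        unflushed_anticipated_size_after_flush: 200, // encoded estimate
        unflushed_current_mem_size: 300,             // raw indices held in memory
        processing_size: 500,
    };
    // memory_size != in_progress_size + processing_size, because the two
    // estimates account for the unflushed bytes differently
    assert_ne!(s.memory_size(), s.in_progress_size() + s.processing_size);
    println!("in_progress={} memory={}", s.in_progress_size(), s.memory_size());
}
```

This is exactly why the simple identity memory_size = in_progress_size + processing_size does not hold: the middle term differs between the two estimates.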
Reviewer: I am thinking from a user's point of view (e.g. our use case in InfluxDB). If I want to know how much memory the ArrowWriter is using (so I can track it against a budget, for example), what API should I use? I was worried that I couldn't use ArrowWriter::memory_size because that does not include the estimated data page sizes, AND I couldn't use ArrowWriter::memory_size() + ArrowWriter::in_progress_size() because that would double count previously buffered data. However, after reviewing the code more closely I see the difference is that ArrowWriter::in_progress_size includes an estimate of how large the parquet encoded data would be after flush (which is not actually memory currently used), which presumably in most cases will be smaller than the actual bytes used. I will try to update the comments as well to clarify this.
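A minimal sketch of the budget-tracking use case described here, assuming only that the writer exposes a memory_size() accessor. MockWriter and over_budget are hypothetical stand-ins, not parquet crate APIs:

```rust
/// Hypothetical stand-in for a writer exposing `memory_size()`,
/// as the ArrowWriter does after this PR.
struct MockWriter {
    memory: usize,
}

impl MockWriter {
    /// Estimated bytes of memory currently held by the writer.
    fn memory_size(&self) -> usize {
        self.memory
    }
}

/// Returns true if the writer exceeds `budget` bytes and should be
/// flushed (or its input throttled) to stay within the budget.
fn over_budget(writer: &MockWriter, budget: usize) -> bool {
    writer.memory_size() > budget
}

fn main() {
    let writer = MockWriter { memory: 8 * 1024 * 1024 }; // 8 MiB in use
    assert!(over_budget(&writer, 4 * 1024 * 1024));      // over a 4 MiB budget
    assert!(!over_budget(&writer, 16 * 1024 * 1024));    // within a 16 MiB budget
    println!("ok");
}
```

The key design point from the discussion: a budget check should use memory_size() alone, since in_progress_size() is an estimate of future encoded output, not memory currently held.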