Compact deserialized page in SingleStreamSpiller #16338
rschlussel merged 1 commit into prestodb:master
nit:
Iterator<Page> compactedPages = transform(deserializedPages, page -> {
    page.compact();
    return page;
});
or simply Iterator<Page> compactedPages = transform(deserializedPages, Page::compact), if you change the Page::compact implementation to return this, as other methods in the Page interface do.
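To illustrate the suggestion above, here is a minimal, self-contained sketch of the method-reference form. The `Page` class below is a hypothetical stand-in (not the real Presto class), and `transform` is a small hand-written helper assumed to behave like a lazy iterator transform; the only point is that once `compact()` returns `this`, `Page::compact` can be passed directly as the mapping function.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

final class CompactTransformDemo {
    /** Hypothetical stand-in for a page; it only records whether compact() was called. */
    static final class Page {
        private boolean compacted;

        /** Returns this so the method can be used as a Function<Page, Page>. */
        Page compact() {
            compacted = true;
            return this;
        }

        boolean isCompacted() {
            return compacted;
        }
    }

    /** Illustrative lazy transform: fn is applied only when next() is called. */
    static <F, T> Iterator<T> transform(Iterator<F> source, Function<? super F, ? extends T> fn) {
        return new Iterator<T>() {
            @Override
            public boolean hasNext() {
                return source.hasNext();
            }

            @Override
            public T next() {
                return fn.apply(source.next());
            }
        };
    }

    public static void main(String[] args) {
        List<Page> deserializedPages = Arrays.asList(new Page(), new Page());
        Iterator<Page> compactedPages = transform(deserializedPages.iterator(), Page::compact);
        while (compactedPages.hasNext()) {
            // Each page is compacted at the moment it is pulled from the iterator.
            System.out.println(compactedPages.next().isCompacted());
        }
    }
}
```

Because the transform is lazy, a page is only compacted when the consumer actually reads it, so no extra pass over the spilled data is needed.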
Deserialized pages may contain blocks that are not compact (e.g., a VariableWidthBlock might hold a slice that is a view of all columns), which results in an incorrect retained size and memory accounting issues. When a block holds a slice of the entire page, then depending on the number of affected blocks in the page, the accounted memory could be off by N times (N >= count of VariableWidthBlock).
fdb3867 to 8c4db97
Can you explain why the retainedSizeInBytes would be incorrect for non-compacted pages?
@rschlussel See the commit message. VariableWidthBlock holds a slice. When the block is compact, this slice covers only the block itself; when it is not compact, the slice is a reference to the whole deserialized page. The retained size of a page is calculated by adding up the retained size of each block, so the same slice can be counted multiple times, once per VariableWidthBlock that references it.
Got it, thanks. So without this, we would allocate more memory than needed, and the query would fail with OOM when it didn't have to.
Yes, that is why I started investigating why it reserved so much memory. 10 references means 10x the expected size.
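The over-counting discussed above can be sketched with a toy model. Everything here is an assumption made for illustration: `ViewBlock` is a hypothetical stand-in for a variable-width block backed by a shared byte array, and its `retainedSizeInBytes()` simply reports the size of whatever array it references. It is not the Presto implementation, but it shows why 10 blocks viewing one deserialized buffer report roughly 10x the real size until each block is compacted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RetainedSizeDemo {
    /** Hypothetical block: a (offset, length) view into a backing byte array. */
    static final class ViewBlock {
        final byte[] backing;
        final int offset;
        final int length;

        ViewBlock(byte[] backing, int offset, int length) {
            this.backing = backing;
            this.offset = offset;
            this.length = length;
        }

        /** Reports the whole backing array, even if the block uses only part of it. */
        long retainedSizeInBytes() {
            return backing.length;
        }

        /** Copies just this block's bytes so it no longer references the shared buffer. */
        ViewBlock compact() {
            return new ViewBlock(Arrays.copyOfRange(backing, offset, offset + length), 0, length);
        }
    }

    /** Page-level accounting in this model: sum of the per-block retained sizes. */
    static long pageRetainedSize(List<ViewBlock> blocks) {
        long total = 0;
        for (ViewBlock block : blocks) {
            total += block.retainedSizeInBytes();
        }
        return total;
    }

    /** Builds blockCount equal-sized views over one shared buffer of pageBytes bytes. */
    static List<ViewBlock> sharedBlocks(int pageBytes, int blockCount) {
        byte[] pageBuffer = new byte[pageBytes];
        int blockBytes = pageBytes / blockCount;
        List<ViewBlock> blocks = new ArrayList<>();
        for (int i = 0; i < blockCount; i++) {
            blocks.add(new ViewBlock(pageBuffer, i * blockBytes, blockBytes));
        }
        return blocks;
    }

    static List<ViewBlock> compactAll(List<ViewBlock> blocks) {
        List<ViewBlock> compacted = new ArrayList<>();
        for (ViewBlock block : blocks) {
            compacted.add(block.compact());
        }
        return compacted;
    }

    public static void main(String[] args) {
        List<ViewBlock> blocks = sharedBlocks(1000, 10);
        System.out.println(pageRetainedSize(blocks));             // 10000: buffer counted 10 times
        System.out.println(pageRetainedSize(compactAll(blocks))); // 1000: each block counts its own bytes
    }
}
```

In this model the shared 1000-byte buffer is counted once per block before compaction, which matches the "N times" discrepancy described in the commit message.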