storage: fixes for book keeping of dirty and closed bytes #25076
base: dev
Conversation
No functional changes in this PR, just a new accessor function.
There were a number of bugs in the `disk_log_impl` book-keeping of `dirty` and `closed` bytes that are fixed by this commit:

1. Some calls to `subtract_dirty_segment_bytes` did not check whether the segment in question was already cleanly compacted. This would lead to over-deduction of dirty bytes.
2. In `do_compact_adjacent_segments()`, it is possible to merge a cleanly compacted segment with a non-cleanly compacted segment. Later on, when this replacement segment is finally marked cleanly compacted, the sum of its bytes (including those that were already deducted when the cleanly compacted segment was marked) will once again be deducted. We must add these back and consider them "dirty" to avoid this over-deduction.
3. Also in `do_compact_adjacent_segments()`, a call to `remove_segment_permanently()` for one of the segments being merged would remove its dirty and closed bytes. These bytes are still present in the replacement segment and should not be removed.
4. Finally, in `rewrite_segment_with_offset_map()`, we would again over-deduct dirty bytes for a segment that was already marked as cleanly compacted.
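To make the first and fourth fixes concrete, here is a minimal, self-contained model of the two counters. The struct, function names, and values are hypothetical stand-ins for illustration, not the actual `disk_log_impl` code:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the dirty/closed byte counters discussed in this PR.
struct bookkeeping {
    uint64_t dirty_segment_bytes = 0;
    uint64_t closed_segment_bytes = 0;

    // Deduct a removed segment's bytes. A segment that was already marked
    // cleanly compacted no longer contributes to dirty_segment_bytes, so
    // deducting it again would over-deduct (bugs 1 and 4 above).
    void on_segment_removed(uint64_t size, bool cleanly_compacted) {
        if (!cleanly_compacted) {
            dirty_segment_bytes -= size;
        }
        closed_segment_bytes -= size;
    }
};

int main() {
    bookkeeping b{100, 250}; // dirty = 100, closed = 250
    // Removing a 150-byte segment that was already cleanly compacted should
    // only shrink the closed counter.
    b.on_segment_removed(150, /*cleanly_compacted=*/true);
    assert(b.dirty_segment_bytes == 100);
    assert(b.closed_segment_bytes == 100);
    return 0;
}
```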
CI test results on build #61806

/ci-repeat 1
```cpp
add_dirty_segment_bytes(dirty_bytes_to_readd);
const auto removed_bytes = ret.size_before - ret.size_after;
if (!target->has_clean_compact_timestamp()) {
```
Disclaimer: I'm still confused about dirty/clean segments but trying to make sense of this as a learning exercise too.
I'm confused by this if statement. To me it looks like a copy-paste mistake. `target` here is the newly created segment, and `has_clean_compact_timestamp` is the property on that new segment?
Why are we subtracting `removed_bytes` based on the "clean state" of the new segment? What does this mean?
Let's say we started with 2 segments:

- a) clean, 0 dirty bytes; b) non-clean, 100 dirty bytes
- the dirty bytes counter is at 100
- `dirty_bytes_to_readd` is at 100 too
- the newly compacted segment is 75 bytes, for example; therefore `removed_bytes` = 25
- `target` in this case is considered non-clean? if so:
- we add 100 to dirty bytes (counter is at 200)
- we subtract 25 * 2
- counter for dirty is at 150?
Questions:

* is the math right?
* are you sure you want to check `target` here rather than `segments.back()`?

On second look, the math looks all right.
Last question: non-trivial to add a test for these "metrics"? The logic doesn't seem very simple so maybe we can add a test to make sure we don't break it in the future?
I'm still confused about dirty/clean segments but trying to make sense of this as a learning exercise too.
A "clean" segment is a closed segment that has had all of its keys fully deduplicated in its prefix of the log. Clean segments can only be produced as a result of being fully indexed (and using the key-offset map to eliminate all duplicates in earlier segments in the log) during the sliding window process.
A "dirty' segment is a closed segment that has not yet fulfilled the above requirement.
Why we're subtracting removed_bytes based on the "clean state" of the new segment? What does this mean?
The new segment, `target`, is only designated as "clean" if both of the segments merged together were designated as "clean". As an aside, if segment one had `n_1` bytes and segment two had `n_2` bytes, `target` will have exactly `n_t = n_1 + n_2` bytes.
In the case that `target` is not clean (i.e. dirty), after the self-compaction process of `target`, we must subtract the removed bytes from the dirty segment bytes. Otherwise, we would be miscounting bytes.
Take the following example, in which we have only two closed segments in the log: Segment 1 (150 bytes, clean) and Segment 2 (100 bytes, dirty). Therefore, `_dirty_segment_bytes = 100` and `_closed_segment_bytes = 250`.

- `S1` and `S2` are selected for adjacent compaction; `dirty_bytes_to_readd` is therefore 150.
- `target` is created by merging `S1` and `S2`. Its size is 250 bytes, and it is designated as "dirty" due to `S2` being dirty.
- Self-compaction of `target` removes 50 bytes; the size of `target` is now 200 bytes.
- We add back `dirty_bytes_to_readd`; `_dirty_segment_bytes` is now 250.
- We subtract the removed bytes; `_dirty_segment_bytes` is now 200, `_closed_segment_bytes` is now 200.
This makes sense, as we only have one closed segment in the log, and it is dirty. When it is eventually marked as clean by sliding window compaction, the entire 200 bytes will be subtracted from `_dirty_segment_bytes`, leaving it at 0, while `_closed_segment_bytes` remains 200.
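The same arithmetic, written out as a tiny self-contained check using the numbers from this example (illustrative only, not code from this PR):

```cpp
#include <cassert>
#include <cstdint>

int main() {
    // Starting counters: S1 = 150 bytes (clean), S2 = 100 bytes (dirty).
    uint64_t dirty = 100, closed = 250;
    const uint64_t dirty_bytes_to_readd = 150; // S1's bytes become dirty again in target
    const uint64_t removed_bytes = 50;         // self-compaction shrinks target 250 -> 200

    dirty += dirty_bytes_to_readd; // 250
    dirty -= removed_bytes;        // 200 (target is dirty, so deduct)
    closed -= removed_bytes;       // 200

    assert(dirty == 200 && closed == 200);
    return 0;
}
```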
I hope this explains it a bit better, please let me know if there's anything else unclear.
Last question: non-trivial to add a test for these "metrics"? The logic doesn't seem very simple so maybe we can add a test to make sure we don't break it in the future?
There are currently checks in `log_compaction_test.py`, though they are not very in-depth. There are also fixture tests in this PR.
The ideal test would be a fixture test in which producing, compaction, and other processes are happening, and in which we could loop over the segments in a log, manually compute `dirty_segment_bytes` and `closed_segment_bytes`, and compare against the book-kept values.
I'd be happy to look at adding it to make sure the book-keeping is now bug free.
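A rough sketch of the shape such a check could take, with a hypothetical `segment` stand-in and hard-coded tracked values rather than the real storage types:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a closed segment's relevant state.
struct segment {
    uint64_t size_bytes;
    bool cleanly_compacted;
};

int main() {
    // Stand-ins for the log's closed segments and its book-kept counters.
    std::vector<segment> segments = {{150, true}, {100, false}, {80, false}};
    const uint64_t tracked_dirty = 180;
    const uint64_t tracked_closed = 330;

    // Recompute the expected counters from the segments themselves and
    // compare against the book-kept values.
    uint64_t expected_dirty = 0, expected_closed = 0;
    for (const auto& s : segments) {
        expected_closed += s.size_bytes;
        if (!s.cleanly_compacted) {
            expected_dirty += s.size_bytes;
        }
    }
    assert(expected_dirty == tracked_dirty);
    assert(expected_closed == tracked_closed);
    return 0;
}
```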
I was thinking of something very simple, more like a unit test: create a few segments of precise size, compact, and check that the metrics equal the expected sizes. I haven't worked on this part of the codebase, so I don't know what's available in terms of testing. I'll defer to you on how and when.
And thanks for the explanation! 🙏
Re the if statement: it is basically the same as saying `!all_segments_cleanly_compacted` or `bytes_to_readd > 0`, but in an indirect way? If so, can we remove the if statement completely? If `target` is cleanly compacted, then the bytes to re-add will be 0 and the calls to subtract will have no effect.
Less cognitive load for the reader.
```cpp
auto removed_segment_bytes = s->size_bytes();
if (!s->index().has_clean_compact_timestamp()) {
    subtract_dirty_segment_bytes(removed_segment_bytes);
    if (deduct_bytes) {
```
I believe this would be cleaner if we didn't have the bool flag. Instead, in the caller, when we create the new segment, we can increment dirty bytes to account for what is going to be subtracted. You already added a special case there, so it can be extended to cover this case as well.
We're replacing [a, b] with [c] and removing [b]. [c] is all dirty. If b was dirty, then increment dirty bytes by [c]'s size, else by [c] - [a]. Wdyt?
Sure, I can change this to over-add to dirty bytes, expecting the subtraction later.
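A toy illustration of that trade-off (made-up names and numbers, not the PR's actual code): pre-adding the amount that a later unconditional deduction will remove gives the same net result as suppressing the deduction with a flag.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the counter and the unconditional deduction
// performed when a merged-away segment is removed permanently.
struct counters {
    uint64_t dirty = 0;
};
void remove_segment(counters& c, uint64_t seg_bytes) { c.dirty -= seg_bytes; }

int main() {
    const uint64_t b_bytes = 100; // size of the segment being removed

    // Variant 1: flag-style, the deduction is skipped entirely.
    counters with_flag{300};

    // Variant 2: over-add up front in the caller, then let the deduction
    // happen unconditionally.
    counters over_add{300};
    over_add.dirty += b_bytes;         // compensate in advance
    remove_segment(over_add, b_bytes); // unconditional deduction

    assert(with_flag.dirty == over_add.dirty); // both end at 300
    return 0;
}
```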
```cpp
}

add_dirty_segment_bytes(dirty_bytes_to_readd);
const auto removed_bytes = ret.size_before - ret.size_after;
```
I think `removed_bytes` can be negative 🤔 `subtract_dirty_segment_bytes` accepts a uint.
We do recompress batches during compaction. If they were produced to us with the highest compression ratio but we recompress with a lower compression ratio then, I imagine, the size could become larger.
🤔 I was aware we recompress, but I didn't think this would ever lead to a situation where the recompressed batch could be larger than the original compressed batch...
Similar behaviour can be caused by compression library upgrade. There is no guarantee that compression output is the same across different versions of compressor. Or just settings changes of course like block sizes, etc.
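A hedged sketch of the underflow concern raised in this thread (assumed names and values, not the PR's code): computing the delta as a signed value avoids wrapping the unsigned counter when the recompressed segment ends up larger than before.

```cpp
#include <cassert>
#include <cstdint>

int main() {
    uint64_t dirty_segment_bytes = 1000;
    const uint64_t size_before = 100;
    const uint64_t size_after = 120; // recompressed batches grew

    // size_before - size_after in unsigned arithmetic would wrap to a huge
    // value; compute it as signed instead.
    const int64_t delta = static_cast<int64_t>(size_before)
                          - static_cast<int64_t>(size_after); // -20
    if (delta > 0) {
        dirty_segment_bytes -= static_cast<uint64_t>(delta);
    }
    // A negative delta would need explicit handling rather than a silent
    // wrap-around in the subtraction.
    assert(dirty_segment_bytes == 1000);
    return 0;
}
```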