optimize file ingestion checks for range deletion overlap #3179
@ajkr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@ajkr has updated the pull request. View: changes, changes since last import
looks reasonable to me. Just a couple of questions before I give my stamp of approval :)
    &f, memtable_range_del_iter.get(), overlap);
if (!status.ok() || *overlap == true) {
  break;
{
As no new locks are created here, why do you need a new scope?
It was to make memtable_range_del_iter go out of scope asap since it's been std::moved and shouldn't be used. IDK if this is a good practice or not.
db/range_del_aggregator.cc
Outdated
if (icmp_.user_comparator()->Compare(start, tombstone.end_key_) < 0 &&
    icmp_.user_comparator()->Compare(tombstone.start_key_, end) <= 0 &&
    icmp_.user_comparator()->Compare(tombstone.start_key_,
                                     tombstone.end_key_) != 0) {
I am assuming start_key_ can never be greater than end_key_, in which case, shouldn't this condition be <= 0?
Yeah, except maybe < 0, since we consider the range with start_key_ == end_key_ as empty, so it shouldn't overlap anything.
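To make the < 0 suggestion concrete, here is a minimal standalone sketch of the overlap condition. It is not the code in db/range_del_aggregator.cc: the Tombstone struct and OverlapsRange function are hypothetical names, plain std::string ordering stands in for the user comparator, and the tombstone is assumed to cover the half-open range [start_key, end_key) while the queried range [start, end] is inclusive.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a range tombstone covering [start_key, end_key).
struct Tombstone {
  std::string start_key;
  std::string end_key;  // exclusive upper bound (assumption of this sketch)
};

// Returns true if the tombstone covers at least one key in the inclusive
// range [start, end]. std::string::compare replaces the user comparator.
bool OverlapsRange(const Tombstone& t, const std::string& start,
                   const std::string& end) {
  assert(start <= end);  // the queried range is assumed to be well formed
  return start.compare(t.end_key) < 0 &&      // query begins before tombstone ends
         t.start_key.compare(end) <= 0 &&     // tombstone begins at or before query end
         t.start_key.compare(t.end_key) < 0;  // tombstone is non-empty
}
```

With `< 0` in the last clause, an empty tombstone (start_key == end_key) and an inverted one (start_key > end_key) both report no overlap, whereas `!= 0` would only rule out the empty case.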
db/external_sst_file_basic_test.cc
Outdated
ASSERT_OK(GenerateAndAddExternalFile(
    options, {10, 40}, {ValueType::kTypeValue, ValueType::kTypeValue},
    file_id++, &true_data));
ASSERT_EQ(dbfull()->GetLatestSequenceNumber(), ++last_seqno);
ASSERT_EQ(2, NumTableFilesAtLevel(0));
ASSERT_EQ(1, NumTableFilesAtLevel(kNumLevels - 2));
ASSERT_EQ(1, NumTableFilesAtLevel(options.num_levels - 1));

// overlaps with memtable, so flush is triggered (thus file count increases by
// two at this step).
Why does the L0 file count increase by 2 here? I was assuming only one.
When ingesting an external file we first flush the memtable if the memtable's key-range overlaps with the ingested file. That's done so the ingested file's data will be newer than any existing data. So the first one is the memtable being flushed, and the second one is the file being ingested.
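The flush-then-ingest ordering can be illustrated with a toy model. This is only a sketch, not RocksDB code: ToyDb, KeyRange, and their members are hypothetical, keys are plain ints, and the ingested file is always placed in L0 for simplicity (the real level-picking logic is not modeled).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical inclusive key range [smallest, largest].
struct KeyRange {
  int smallest;
  int largest;
};

// Toy database: an unsorted memtable plus a list of L0 files.
struct ToyDb {
  std::vector<int> memtable_keys;
  std::vector<KeyRange> l0_files;

  // True if any memtable key falls inside the given range.
  bool MemtableOverlaps(const KeyRange& r) const {
    return std::any_of(memtable_keys.begin(), memtable_keys.end(),
                       [&r](int k) { return r.smallest <= k && k <= r.largest; });
  }

  // Writes the memtable out as one new L0 file and clears it.
  void Flush() {
    if (memtable_keys.empty()) return;
    auto mm = std::minmax_element(memtable_keys.begin(), memtable_keys.end());
    l0_files.push_back({*mm.first, *mm.second});
    memtable_keys.clear();
  }

  // Flushes first when the memtable overlaps, so the ingested file ends up
  // newer than all existing data; then adds the file itself.
  void Ingest(const KeyRange& file) {
    if (MemtableOverlaps(file)) Flush();
    l0_files.push_back(file);
  }
};
```

In this model, ingesting {10, 40} while the memtable holds key 30 adds two L0 files (the flushed memtable, then the ingested file); with a memtable holding only key 50 it adds one.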
@ajkr is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Before, we were checking every file in the level, which was unnecessary. We can piggyback onto the code for checking point-key overlap, which already opens all the files that could possibly contain overlapping range deletions. This PR makes us check just the range deletions from those files, so no extra ones will be opened.
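A rough sketch of the idea, under stated assumptions: ToyFile and RangeDelOverlap are hypothetical names, keys are ints, range deletions are half-open [first, second), and each file's [smallest, largest] bounds are assumed to cover its range deletions. The files_opened counter exists only to show that non-overlapping files are skipped; it is not part of any real API.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical file: inclusive key bounds plus its range deletions.
struct ToyFile {
  int smallest;
  int largest;
  std::vector<std::pair<int, int>> range_dels;  // each covers [first, second)
};

// Checks whether any range deletion in the level overlaps the inclusive
// range [start, end], reading tombstones only from files whose key bounds
// intersect that range. *files_opened reports how many files were read.
bool RangeDelOverlap(const std::vector<ToyFile>& level, int start, int end,
                     int* files_opened) {
  *files_opened = 0;
  bool overlap = false;
  for (const ToyFile& f : level) {
    // Same filter a point-key overlap check would apply: skip files that
    // cannot contain anything in [start, end].
    if (f.largest < start || end < f.smallest) continue;
    ++*files_opened;
    for (const auto& del : f.range_dels) {
      if (start < del.second && del.first <= end && del.first < del.second) {
        overlap = true;
      }
    }
  }
  return overlap;
}
```

For a level with three disjoint files, a query that falls inside one file's bounds reads tombstones from that file alone, rather than from all three.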