Skip to content

Fix a bug that can retain old WAL longer than needed #13127

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

cbi42
Copy link
Member

@cbi42 cbi42 commented Nov 8, 2024

Summary: The bug only happens for transaction db with 2pc. The main change is in MemTableList::TryInstallMemtableFlushResults. Before this fix, memtables_to_flush may not include all flushed memtables, and it causes the min_log_number for the flush to be incorrect. The code path for calculating min_log_number is MemTableList::TryInstallMemtableFlushResults() -> GetDBRecoveryEditForObsoletingMemTables() -> PrecomputeMinLogNumberToKeep2PC() -> FindMinPrepLogReferencedByMemTable(). Inside FindMinPrepLogReferencedByMemTable(), we need to exclude all memtables being flushed.

The PR also includes some documentation changes.

Test plan: added a new unit that fails before this change.

// TODO(myabandeh): Not sure how batch_count could be 0 here.
if (batch_count > 0) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed for better understanding.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This actually seems more than a rename. I think the number of "memtable batches" and flushed memtables can actually be different if min_write_buffer_number_to_merge != 1. I'm wondering if that option has been broken all along?

Copy link
Member Author

@cbi42 cbi42 Nov 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems the variable is not named correctly since it was introduced: 5aaef91#diff-398125fa2b06616bca3e9aaeeacb38d0f251321cc15722e4b636d20d1dfb1d6a. The way it's used seem to be number of memtables flushed.

edit: TODO: the logging in

rocksdb/db/memtable_list.cc

Lines 767 to 770 in 1f0ccd9

while (batch_count-- > 0) {
ReadOnlyMemTable* m = current_->memlist_.back();
if (m->edit_.GetBlobFileAdditions().empty()) {
ROCKS_LOG_BUFFER(log_buffer,
is redundant when multiple memtables are flushed into a same file.

@facebook-github-bot
Copy link
Contributor

@cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@cbi42 cbi42 requested a review from ltamasi November 8, 2024 20:36
// TODO(myabandeh): Not sure how batch_count could be 0 here.
if (batch_count > 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This actually seems more than a rename. I think the number of "memtable batches" and flushed memtables can actually be different if min_write_buffer_number_to_merge != 1. I'm wondering if that option has been broken all along?

uint64_t PrecomputeMinLogContainingPrepSection(
const std::unordered_set<ReadOnlyMemTable*>* memtables_to_flush =
nullptr);
nullptr) const;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍👍

@facebook-github-bot
Copy link
Contributor

@cbi42 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@cbi42 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@cbi42 merged this pull request in 925435b.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants