-
Notifications
You must be signed in to change notification settings - Fork 6.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mitigate the overhead of building the hash of file locations #9504
Conversation
…ateAccumulatedStats if update_stats flag is set
…ateAccumulatedStats if update_stats flag is set
@ltamasi has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
db/version_set.cc
Outdated
for (size_t i = 0; i < new_last_level.size(); ++i) { | ||
const FileMetaData* const meta = new_last_level[i]; | ||
assert(meta); | ||
|
||
const uint64_t file_number = meta->fd.GetNumber(); | ||
|
||
vstorage->file_locations_[file_number] = | ||
VersionStorageInfo::FileLocation(new_levels - 1, i); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do not update the index after ReduceNumberOfLevels
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will be done by PrepareForVersionAppend
, which is done as part of the LogAndApply
call below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see. Thanks for the clarification.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe some comment about the index not consistent at this point will be informative.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looked into this a bit more and realized that this method is only called by ldb reduce_levels
, and the DB is not even open during the call (which I suppose is why manipulating the current version directly is "safe"). Still, I tend to think it would be nice to clear the index here and let LogAndApply
and friends rebuild it from scratch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, on second (third? fourth?) thought, LogAndApply
creates a new Version
by "applying" an empty VersionEdit
, so the contents of the index in the original Version
do not matter at all. So we might as well leave this as-is for consistency
@ltamasi has updated the pull request. You must reimport the pull request before landing. |
@ltamasi has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
Summary:
The patch builds on the refactoring done in #9494
and improves the performance of building the hash of file
locations in
VersionStorageInfo
in two ways. First, the hashbuilding is moved from
AddFile
(which is called under the DB mutex)to a separate post-processing step done as part of
PrepareForVersionAppend
(during which the mutex is not held). Second, the space necessary
for the hash is preallocated to prevent costly reallocation/rehashing
operations. These changes mitigate the overhead of the file location hash,
which can be significant with certain workloads where the baseline CPU usage
is low (see #9351,
which is a workload where keys are sorted, WAL is turned
off, the vector memtable implementation is used, and there are lots of small
SST files).
Fixes #9351
Test Plan:
make check
Final statistics before this patch:
With the patch: