Skip to content

v2.1: Refactors get_snapshot_storages() (backport of #3760)#3785

Merged
brooksprumo merged 1 commit intov2.1from
mergify/bp/v2.1/pr-3760
Dec 6, 2024
Merged

v2.1: Refactors get_snapshot_storages() (backport of #3760)#3785
brooksprumo merged 1 commit intov2.1from
mergify/bp/v2.1/pr-3760

Conversation

@mergify
Copy link
Copy Markdown

@mergify mergify Bot commented Nov 25, 2024

Problem

AccountsDb::get_snapshot_storages() is due for a refactor. We can speed it up for the common case, and also simplify.

Since #3737, we now call get_snapshot_storages() every time we clean. The observation here is we'll only need about 100 storages (on average), yet the current impl of get_snapshot_storages() Arc::clone's all the storages, and then filters out the unneeded ones. We can change this to only Arc::clone the useful ones instead.

Additionally, the filter step is done in parallel. When we only have 100 storages, the parallel execution does not help. In fact, with a chunk size of 5000, we end up getting zero benefit, but do have to pay the cost of running in the thread pool.

Summary of Changes

  • When getting storages, only Arc::clone the ones we need
  • When filtering, do not use a thread pool

Results

I ran this on against mnb and saw good results.

Since get_snapshot_storages() is called in two places, I wanted to look at the perf results in both.

  1. clean

Here, we only need ~100 storages each time. So not Arc::cloning and not using the thread pool really helps. The PR ends up running consistently, and meaningfully, faster:

branch get storages filter total
master 30-50 ms 100-400 us 30-50 ms
pr v1 11-13 ms 10-30 us 11-13 ms
pr v2 n/a n/a 7-9 ms
pr v3 n/a n/a 6-8 ms
  1. taking full snapshots

Full snapshots do need all the storages, so we will end up Arc::cloning almost all the storages. And it's possible the thread pool does help here. For the most part, runtimes are pretty similar. The PR does have a worse worst-case filter time.

branch get storages filter total
master 30-50 ms 30-40 ms 60-90 ms
pr v1 30-50 ms 30-50 ms 60-100 ms
pr v2 n/a n/a 40-60 ms
pr v3 n/a n/a 40-60 ms

Overall, I think the common case of clean makes this change clearly a win. There is maybe a slight slow down for full snapshots, but that code is both infrequent, and in the background, so I don't think it matters much. Additionally, by not using a thread pool, we may reduce resource usage for the system as a whole.

Justification to Backport

As per the Problem section, we now call get_snapshot_storages() during clean. Until the skipping rewrites feature is enabled, that means we'll be doing this extra work. The feature is in v2.1, and so mnb on v2.0 and v2.1 will be paying the extra cost. Thus, improving the performance for get_snapshot_storages() benefits the validator as a whole.

In theory we could also backport to v2.0 as well, but I'm not currently advocating for that yet.


This is an automatic backport of pull request #3760 done by [Mergify](https://mergify.com).

@mergify mergify Bot requested a review from a team as a code owner November 25, 2024 21:18
@brooksprumo
Copy link
Copy Markdown

Blocked until the version bump pr (#3782) is merged.

@brooksprumo
Copy link
Copy Markdown

@Mergifyio rebase

@mergify
Copy link
Copy Markdown
Author

mergify Bot commented Nov 26, 2024

rebase

❌ Base branch update has failed

Details

brooksprumo token is invalid, make sure brooksprumo can still log in on the Mergify dashboard.

@brooksprumo
Copy link
Copy Markdown

@Mergifyio rebase

@mergify
Copy link
Copy Markdown
Author

mergify Bot commented Nov 26, 2024

rebase

✅ Branch has been successfully rebased

@brooksprumo brooksprumo force-pushed the mergify/bp/v2.1/pr-3760 branch from 3976744 to 7d7f7af Compare November 26, 2024 17:08
@brooksprumo
Copy link
Copy Markdown

brooksprumo commented Nov 26, 2024

Blocked until #3791 #3800 is merged, to addess the advisory.

@brooksprumo
Copy link
Copy Markdown

@Mergifyio rebase

@mergify
Copy link
Copy Markdown
Author

mergify Bot commented Nov 27, 2024

rebase

✅ Branch has been successfully rebased

@brooksprumo brooksprumo force-pushed the mergify/bp/v2.1/pr-3760 branch from 7d7f7af to 9d62df6 Compare November 27, 2024 16:16
@brooksprumo
Copy link
Copy Markdown

Blocked until version bump PR is merged: #3819

@brooksprumo
Copy link
Copy Markdown

@Mergifyio rebase

@mergify
Copy link
Copy Markdown
Author

mergify Bot commented Dec 2, 2024

rebase

✅ Branch has been successfully rebased

Copy link
Copy Markdown

@jeffwashington jeffwashington left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. I'm not a strong advocate for backport, but i'm ok with it. I don't imagine any problems with backporting.

Copy link
Copy Markdown

@HaoranYi HaoranYi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems little risk but potential good performance improvement.
lgtm.

&self,
predicate: impl Fn(&Slot, &AccountStorageEntry) -> bool,
) -> Box<[(Slot, Arc<AccountStorageEntry>)]> {
assert!(self.no_shrink_in_progress());
Copy link
Copy Markdown

@brooksprumo brooksprumo Dec 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assert does exist in the old version as well.

In the old version we do:

/// Get storages to use for snapshots, for the requested slots
pub fn get_snapshot_storages(
&self,
requested_slots: impl RangeBounds<Slot> + Sync,
) -> (Vec<Arc<AccountStorageEntry>>, Vec<Slot>) {
let mut m = Measure::start("get slots");
let mut slots_and_storages = self
.storage
.iter()

and that .iter() also has the assert:

/// iterate through all (slot, append-vec)
pub(crate) fn iter(&self) -> AccountStorageIter<'_> {
assert!(self.no_shrink_in_progress());
AccountStorageIter::new(self)
}

Comment on lines +8296 to +8298
.into_vec()
.into_iter()
.unzip();
Copy link
Copy Markdown

@brooksprumo brooksprumo Dec 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This .into_vec() won't allocate, it'll steal the allocation from the Box, so we're good there.

For the unzip, we can refactor that as well. I'd love to do that. I didn't want to do that in a backport though. So we keep the same return types and keep the PR small. (Note in the original we also unzip, so we're not adding an unzip here in the backport. We also know that the backport is faster too.)

@brooksprumo
Copy link
Copy Markdown

@t-nelson I tried to address the code that I remember you calling out. Were there other areas of concern that I missed?

@brooksprumo
Copy link
Copy Markdown

I'll plan to merge tomorrow unless there are new comments/concerns.

@brooksprumo brooksprumo merged commit 24bd7c3 into v2.1 Dec 6, 2024
@brooksprumo brooksprumo deleted the mergify/bp/v2.1/pr-3760 branch December 6, 2024 14:21
KirillLykov pushed a commit that referenced this pull request Dec 9, 2024
Refactors get_snapshot_storages() (#3760)

(cherry picked from commit 8c7ae80)

Co-authored-by: Brooks <brooks@anza.xyz>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants