v2.1: Marks old storages as dirty and uncleaned in clean_accounts() (backport of #3737)#3748
v2.1: Marks old storages as dirty and uncleaned in clean_accounts() (backport of #3737)#3748brooksprumo merged 2 commits intov2.1from
Conversation
(cherry picked from commit 31742ca) # Conflicts: # accounts-db/src/accounts_db/tests.rs
|
Cherry-pick of 31742ca has failed: To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally |
bw-solana
left a comment
There was a problem hiding this comment.
Looks reasonable to me.
Although it's hard for me to keep all these cases in my head at once 🫠 :
- Clean old accounts based on ancient append vec offset for tests
- some tests clean old accounts based on passing OldStoragesPolicy directly
- Don't clean old accounts during shrink at startup
- Clean old accounts if ancient storages are disabled (determined by feature gate + special accountsdb flag)
| epoch_schedule, | ||
| // Leave any old storages alone for now. Once the validator is running | ||
| // normal, calls to clean_accounts() will have the correct policy based | ||
| // on if ancient storages are enabled or not. |
There was a problem hiding this comment.
is it because we don't know what the right policy is at this point?
There was a problem hiding this comment.
Correct. we don't have a Bank instance to query. It's not wrong to use Leave here, but it could be potentially wrong to use Clean.
Yeah... it's a lot... Hence the difficulty in stabilizing skipping rewrites 😬
Ancient append vecs will always use
Most tests don't have more than 432,000 slots+storages, so this value is not really an issue. Using
Yep... startup is special. Unfortunately we've added yet another special case...
This is the majority/common case. Getting this one right is what matters most. |
Problem
Copied from #3702
We do not clean up old storages.
More context: when calculating a full accounts hash, we call
mark_old_slots_as_dirty()as a way to ensure we do not forget or miss cleaning up really old storages (i.e. ones that are older than an epoch old). But, when we enable skipping rewrites, we don't want to clean up those old storages, as they'll intentionally be treated as ancient append vecs. So insidemark_old_slots_as_dirty()we conditionally mark old slots as dirty. This is based on the value ofancient_append_vec_offset, which should be None unless ancient append vecs are enabled.Unfortunately, normal running validators, we end up never marking old slots as dirty, because the ancient append vec offset is always Some. And thus we don't clean up old storages.
Summary of Changes
Mark old storages as dirty, and add to the uncleaned roots list in clean_accounts().
We still check if ancient append vecs are enabled, but not with the
ancient_append_vec_offset. Instead we look at the skipping rewrites feature gate and the cli arg.By moving this marking into clean_accounts(), we also decouple it from accounts hash calculation, which is not necessary anymore. This also removes behavioral differences based on if snapshots are enabled or not.
Justification to Backport
Without this fix, nodes may never clean up old account storage files, leading to eventual crashes due to running out of file descriptors/mmaps. There's also the general performance regressions that occur as these old account storage files are unexpectedly kept around forever.
Additional Testing
I started up a node running this PR, and used a snapshot containing over 800k account storage files. The node was quickly able to remove all the old storage files and resume normal behavior.
Here's a graph of the node's count of storages. It starts around 850k and quickly drops to the correct ~432k:

This is an automatic backport of pull request #3737 done by [Mergify](https://mergify.com).