Support out of band dumping of unrooted slots in AccountsDb#17269
Support out of band dumping of unrooted slots in AccountsDb#17269mergify[bot] merged 8 commits intosolana-labs:masterfrom
Conversation
9126345 to
cf8ae73
Compare
c1b01f8 to
8ccb077
Compare
Codecov Report
@@ Coverage Diff @@
## master #17269 +/- ##
=========================================
- Coverage 82.7% 82.7% -0.1%
=========================================
Files 428 430 +2
Lines 119963 120568 +605
=========================================
+ Hits 99324 99796 +472
- Misses 20639 20772 +133 |
ef09ea7 to
8e29975
Compare
| // Reads will then always read the latest version of a slot. Scans will also know | ||
| // which version their parents because banks will also be augmented with this version, | ||
| // which handles cases where a deletion of one version happens in the middle of the scan. | ||
| // TODO: This is currently unsafe with scan because it can remove a slot in the middle |
There was a problem hiding this comment.
What is your plan for this now? Does this unsafeness exist today and you're just calling it out here or does this PR add this unsafeness? Is the unsafeness worth whatever this is fixing? I'm missing some context.
I'm confused with 'currently'. Prior to your change, in master? Or, 'currently' because of your change?
I should have read the comment above better before commenting ;-) It was unsafe. It is still unsafe. Nothing to see here!
There was a problem hiding this comment.
Yeah, this unsafeness does exist today, but luckily this function isn't called anywhere by the validator 😃
Yeah it's definitely worth fixing, and is addressed in this next PR here: #17471, a peek into the future if you will 👀
brooksprumo
left a comment
There was a problem hiding this comment.
OK! I made it through!
I'm still not at a point where I understand all the uses and places when flushes/purges/cleans/removals happen, so I'm not sure if there are other designs that would also work here. But this design looks good to me.
I had one question about a variable copy, but that's it. I can give an Approve after that's resolved.
Also, nice test; pretty gnarly!
| // For each slot the cache flush has finished, mark that we're about to start | ||
| // purging these slots by reserving it in `contended_slots`. | ||
| contended_cache_flush_slots.retain(|flushing_slot| { | ||
| let is_being_flushed = contended_slots.contains(flushing_slot); |
There was a problem hiding this comment.
In the beginning of the loop, contended_cache_flush_slots contains the slots only found in contended_slots.
When it wakes up, how come the slot in contended_cache_flush_slots is not found in contended_slots? Who removes it? If the flush thread is removing it, then would not we risk on spinning on the same slots here if adding it back? Or am I missing something?
There was a problem hiding this comment.
When it wakes up, how come the slot in
contended_cache_flush_slotsis not found in contended_slots
The signal.wait(contended_slots).unwrap() above releases the contended_slots and then regrabs and returns the locked list when the condition variable gets a signal, see documentation here: https://doc.rust-lang.org/std/sync/struct.Condvar.html#method.wait. So in the meantime, the flush thread may have finished flushing and removed it from the list.
When it wakes up, how come the slot in
contended_cache_flush_slotsis not found in contended_slots
So the flush thread is the thread removing the slots from contended_slots, which is the shared variable under the lock.
The contended_cache_flush_slots is a local snapshot of the slots in contended_slots at the time we checked it at the beginning of the function, and is never added to again after we filter from it, so it's always shrinking after each iteration of the retain above. We only add a slot to the contended_slots throughcontended_slots.insert(*flushing_slot) after we've confirmed the retain will remove the slot because !is_being_flushed is true.
There was a problem hiding this comment.
Thanks explaining. Still have a question. In line 3445, contended_cache_flush_slots only has the slots which is in the also in contended_slots. Doesn't line 3471 re-add it again?
There was a problem hiding this comment.
ha yeah it does, I don't think it's strictly necessary, but I wanted to uphold the invariant that any slot that is either undergoing flush or remove_unrooted_slots are both present in contended_slots, so we always have the working set of slots observable by any thread.
| } | ||
| }; | ||
| if !is_being_purged { | ||
| let flush_stats = self.do_flush_slot_cache(slot, &slot_cache, should_flush_f); |
There was a problem hiding this comment.
This might be racy. In the above section lock was taken on slots_under_contention, and then dropped. Cannot the other thread running remove_unrooted_slots goes in and actually purge it?
There was a problem hiding this comment.
so we held the lock until we inserted here: remove_unrooted_slots.insert(slot);, then we dropped the lock.
The thread running the remove_unrooted_slots() function will block/be stuck looping on the condition variable until this flush thread removes the slot from the contended_slots list: https://github.com/solana-labs/solana/pull/17269/files#diff-1090394420d51617f3233275c2b65ed706b35b53b115fe65f82c682af8134a6fR3468
There was a problem hiding this comment.
Got it. Maybe rename let mut remove_unrooted_slots to let mut slots_under_contention make it more clear.
lijunwangs
left a comment
There was a problem hiding this comment.
Looks good to me. Thanks for answering my questions.
brooksprumo
left a comment
There was a problem hiding this comment.
Looks good! I think the doc comment updates would help future-Brooks too!
| @@ -753,6 +753,13 @@ impl RecycleStores { | |||
| } | |||
| } | |||
|
|
|||
There was a problem hiding this comment.
Maybe some doc comment like this?
| /// Removing unrooted slots in Accounts Background Service needs to be synchronized with flushing slots from the Accounts Cache. This keeps track of those slots and the Mutex + Condvar for synchronization. |
| // Set of slots currently being flushed by `flush_slot_cache()` or removed | ||
| // by `remove_unrooted_slot()`. Used to ensure `remove_unrooted_slots(slots)` | ||
| // can safely clear the set of unrooted slots `slots`. |
There was a problem hiding this comment.
I love the comments. Could they be /// instead?
Pull request has been modified.
| if !rooted_slots.is_empty() { | ||
| panic!( | ||
| "Trying to remove accounts for rooted slots {:?}", | ||
| rooted_slots | ||
| ); | ||
| } |
There was a problem hiding this comment.
nits:
assert!(rooted_slots.is_empty(), "Trying to remove accounts for rooted slots {:?}", rooted_slots);| // Mark that we're about to delete this slot now | ||
| contended_slots.insert(*flushing_slot); | ||
| } | ||
| !is_being_flushed |
There was a problem hiding this comment.
maybe, this condition needs to be flipped?
i mean, we need to retain() elements of contended_cache_flush_slots which have yet to be flushed (i.e. is_being_flushed to re-evaluate its .empty()-ness later in this loop; slots that are still contain()-ed in contended_slots (and will be remove()-ed by the flusher thread later)?
There was a problem hiding this comment.
Yeah, you're right, great catch.
It's bad the test didn't catch this, I'll have to fix that up as well to be more robust
There was a problem hiding this comment.
So the test didn't catch this because the remove_unrooted_slots() thread will not wake up from waiting on the condition variable, until the flush thread has finished flushing and removed the slot from the contended slots, at which point the remove_unrooted_slots() thread will itself add the slot to the contended slots list, and then exit on the next iteration of the loop.
Updated the test to have an extra thread that simulates spurious wake up calls on the condition variable and the test now fails appropriately
| let is_cache_flushing_slot = contended_slots.contains(remove_slot); | ||
| if !is_cache_flushing_slot { | ||
| // Reserve the slots that we want to purge that aren't currently | ||
| // being flushed to prevent cache from flushing those slots in | ||
| // the future | ||
| contended_slots.insert(**remove_slot); |
There was a problem hiding this comment.
this exclusive control-flow logic assumes same (detected as duplicate) remove_slot won't be ever given to remove_unrooted_slots() from multiple replayingstage threads. Is this correct assumption? Maybe replayingstage always process a slot at each time? How about commenting about the underlying assumption somewhere?
There was a problem hiding this comment.
Yeah, there is only one replay thread, and one version of the remove_slot needs to be removed before the next version can be replayed. Added a comment!
|
|
||
| // For each slot the cache flush has finished, mark that we're about to start | ||
| // purging these slots by reserving it in `contended_slots`. | ||
| contended_cache_flush_slots.retain(|flushing_slot| { |
There was a problem hiding this comment.
nit: how about this naming?:
remaining_contended_flush_slots.retain(|flush_slot|remaining_is added to better describe this variable isretained()iteratively until.is_empty().flush_slotis used to be consistent withremove_slotfor english grammar.
| let mut contended_cache_flush_slots: Vec<Slot> = remove_slots | ||
| .iter() | ||
| .filter(|remove_slot| { | ||
| let is_cache_flushing_slot = contended_slots.contains(remove_slot); |
There was a problem hiding this comment.
nits: rename to is_being_flushed to be consistent with https://github.com/solana-labs/solana/pull/17269/files#r641874795 (let is_being_flushed = contended_slots.contains(flushing_slot);) and better contrast with is_being_purged, too :)
imo, semantically same expression should result in same name unless there is good reason (like state transition between the occurences). In this case, I think its usage is same for both occurrences in that it's used to reserve slot with contended_slots for removal.
|
|
||
| { | ||
| // Slots that are currently being flushed by flush_slot_cache() | ||
| let mut contended_slots = slots_under_contention.lock().unwrap(); |
There was a problem hiding this comment.
nits: rename to currently_contended_slots to indicate volatility or mutation of this variable via CondVar?
| struct RemoveUnrootedSlotsSynchronization { | ||
| // slots being flushed from the cache or being purged | ||
| slots_under_contention: Mutex<HashSet<Slot>>, | ||
| signal: Condvar, |
There was a problem hiding this comment.
fyi, our first usage of Condvar in the codebase. :)
There was a problem hiding this comment.
I know. I keep running into situations where this has almost been the piece I needed. Glad to know it is there.
Pull request has been modified.
| // Note that the single replay thread has to remove a specific slot `N` | ||
| // before another version of the same slot can be replayed. This means | ||
| // multiple threads should not call `remove_unrooted_slots()` simultaneously | ||
| // with the same slot. |
ryoqun
left a comment
There was a problem hiding this comment.
lgtm; nice solid work!
I like the logically chunked series of prs towards the duplicate slot implementation. :)
Thanks for the review! |
* Accounts dumping logic * Add test for interaction between cache flush and remove_unrooted_slot() * Update comments * Rename * renaming * Add more comments * Renaming * Fixup test and bad check (cherry picked from commit bbcdf07) # Conflicts: # runtime/src/accounts_db.rs
* Accounts dumping logic * Add test for interaction between cache flush and remove_unrooted_slot() * Update comments * Rename * renaming * Add more comments * Renaming * Fixup test and bad check (cherry picked from commit bbcdf07)
Problem
Follow-up to #17319.
Bad versions of duplicate blocks can be replayed, and thus need to be cleared from Accounts state before replaying the good version.
Summary of Changes
Support safe block dumping by:
RemoveUnrootedSlotssynchronization mechanism which keeps a list of slots being flushed from cache OR being removed by this new dumping path. This dumping path blocks until a flush is completed, whereas the cache flushing path will ignore requests to flush slots that are currently/about to be dumped.purge_slots_from_cache_and_store().Follow-up PR:
Notifying relevant scans that their results may be outdated, and abort those results, noted here:
solana/runtime/src/accounts_db.rs
Line 3411 in 3d0cacc
Fixes #