
Faster status cache #3796

Closed

alessandrod wants to merge 1 commit into anza-xyz:master from alessandrod:status-cache

Conversation


@alessandrod commented Nov 26, 2024

This PR removes the global RwLock around the status cache and introduces more granular RwLocks per blockhash and per slot. Additionally, it changes the internal hash tables from std HashMap to DashMap, so that operations at the blockhash and slot level can be done while holding only read locks.
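
Roughly, the layout being described looks like this (a sketch only; type and field names are illustrative, not the PR's actual code):

    // Sketch: DashMap shards the outer tables, and each blockhash entry sits
    // behind an Arc, so inserts and lookups briefly take a shard lock plus
    // per-entry state, instead of one global RwLock around everything.
    use dashmap::DashMap;
    use std::sync::{atomic::AtomicU64, Arc};

    type Slot = u64;
    type Hash = [u8; 32];
    const CACHED_KEY_SIZE: usize = 20;

    #[derive(Default)]
    struct BlockhashStatuses<T> {
        // Highest slot in which this blockhash was referenced.
        max_slot: AtomicU64,
        // Per-transaction-key list of (fork slot, execution result).
        statuses: DashMap<[u8; CACHED_KEY_SIZE], Vec<(Slot, T)>>,
    }

    struct StatusCacheSketch<T> {
        // blockhash -> statuses; no global RwLock around the whole map.
        cache: DashMap<Hash, Arc<BlockhashStatuses<T>>>,
        // slot -> (blockhash -> tx keys inserted in that slot).
        slot_deltas: DashMap<Slot, Arc<DashMap<Hash, Vec<[u8; CACHED_KEY_SIZE]>>>>,
    }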

This is not the final design of A Performant Status Cache - which, I think, can make check and update go straight to 0 - but it's a good incremental improvement.

Results are pretty good: check_transactions is ~6x faster, and update_transaction_statuses is ~2.5x faster.

[Screenshots: check_transactions and update_transaction_statuses benchmark results]

alessandrod force-pushed the status-cache branch 14 times, most recently from 571fca8 to 2db0759, on November 27, 2024 05:52
alessandrod force-pushed the status-cache branch 2 times, most recently from d41386e to fc67d2f, on December 9, 2024 09:50
alessandrod marked this pull request as ready for review on December 9, 2024 09:50
alessandrod force-pushed the status-cache branch 2 times, most recently from 2ff1d2b to d9fdc54, on December 9, 2024 09:52
alessandrod changed the title from "[WIP] faster status cache" to "Faster status cache" on December 9, 2024
alessandrod force-pushed the status-cache branch 7 times, most recently from 70974b7 to 4661809, on December 9, 2024 11:22

@alessandrod (Author) commented:

bench-tps against a single node

scheduler before and after

[Screenshots: scheduler metrics before and after]

workers

[Screenshots: worker metrics before and after]

@jstarry commented Sep 11, 2025

I strongly think you should consider other options and yield on this approach. Finding others to review and merge this code feels subversive to me. If others are ok with the complexity and comfortable with the idea of needing to debug this code in the future, then feel free to move forward, though. I think the goal here is to reduce the time spent on lock contention and moving to ahash / removing results both help in that regard.

@alessandrod commented Sep 11, 2025

> I strongly think you should consider other options and yield on this approach. Finding others to review and merge this code feels subversive to me.

Why would it be subversive? We don't have to agree on everything.

> If others are ok with the complexity and comfortable with the idea of needing to debug this code in the future, then feel free to move forward, though. I think the goal here is to reduce the time spent on lock contention and moving to ahash / removing results both help in that regard.

That is not the goal. The goal is to remove the contention, not reduce it. I'm fully in favor of doing ahash; I didn't do it in this PR because it's orthogonal. Same with the result.

@alessandrod (Author) commented:

Also, it's worth noting that Frankendancer has been running this on mnb for months, I've run it for months on my node, and Firedancer uses a similar approach.

I acknowledge that the code is more complex, and I'm happy to discuss concerns with it as I've been doing so far, but it's not like I'm being unreasonable or trying to be subversive 😅

@jstarry commented Sep 11, 2025

> That is not the goal. The goal is to remove the contention, not reduce it.

Hmm ok, maybe to rephrase: the goal can be to remove contention entirely, but any progress towards reducing it is also productive. If we get to a point where incrementally reducing contention further (i.e. to zero) results in a lot of complexity and maintenance, then we should be mindful of the tradeoffs.

> I'm fully in favor of doing ahash; I didn't do it in this PR because it's orthogonal. Same with the result.

Now I'm confused. You did apply ahash in this PR, no?

Removing result reduces contention as well because it's one less hash (of the signature) under write lock. I think it's better to do such a change before making the status cache more complex, not after.

@alessandrod (Author) commented:

> Now I'm confused. You did apply ahash in this PR, no?

I really shouldn't be on github from my phone 😅

When I was young and naive and started this change, I had everything nicely split into a bunch of commits, including ahash before the RwLock removal. Then, as I rebased the PR 600 times, I got tired of having to resolve conflicts and ended up squashing everything together.

I'm happy to take ahash out and land it before this change tho, I'll do that next.

Comment thread on runtime/src/shuttle_map.rs (outdated)

        K: Eq + Hash,
        S: BuildHasher,
    {
        shards: Vec<RwLock<HashMap<K, V, S>>>,

A reviewer commented:

Looking at the internals of DashMap (what the previous implementation used), which has:

    shards: Box<[CachePadded<RwLock<HashMap<K, V>>>]>,

Would we benefit from cache-padding the inner write-locks?

I don't think Box vs Vec will change much, the type just guarantees we don't re-size at runtime... but meh, that's obvious from the code imo.
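
For reference, cache-padding the shard locks would look roughly like this (a sketch using crossbeam_utils::CachePadded; not code from this PR):

    // Pad each shard's lock out to its own cache line so lock traffic on one
    // shard doesn't false-share with adjacent shards.
    use crossbeam_utils::CachePadded;
    use std::{collections::HashMap, hash::BuildHasher, sync::RwLock};

    struct PaddedShards<K, V, S: BuildHasher> {
        shards: Box<[CachePadded<RwLock<HashMap<K, V, S>>>]>,
    }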

@alessandrod (Author) replied:

This is only used in tests to provide shuttle with context-switch points; we don't really use it in production, so I didn't spend any time making the code good.

Basically shuttle intercepts anything that may put a thread to sleep or cause thread synchronization, and explicitly handles the state transition according to a given scheduler (right now I'm using the random scheduler in the tests).

This code is needed to simulate a bunch of interleaving sequences in our code and see if anything breaks.
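
A minimal example of the pattern being described (a hypothetical test, not one from this PR):

    #[cfg(feature = "shuttle-test")]
    mod shuttle_example {
        // shuttle swaps in its own sync primitives so its scheduler can explore
        // a different interleaving at every potential context-switch point.
        use shuttle::{sync::Mutex, thread};
        use std::sync::Arc;

        #[test]
        fn random_scheduler_example() {
            // Run the closure 1000 times, each under a randomly chosen schedule.
            shuttle::check_random(
                || {
                    let counter = Arc::new(Mutex::new(0u64));
                    let c = Arc::clone(&counter);
                    let t = thread::spawn(move || *c.lock().unwrap() += 1);
                    *counter.lock().unwrap() += 1;
                    t.join().unwrap();
                    assert_eq!(*counter.lock().unwrap(), 2);
                },
                1000,
            );
        }
    }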

Comment thread on runtime/src/status_cache.rs (outdated)
@@ -1,118 +1,195 @@
// Use shuttle's synchronization primitives when cfg(feature = "shuttle-test") is set, otherwise use std.

A reviewer commented:

Can we put these imports down with the other conditional imports?

It's easier to see what changes based on the shuttle-test flag if they are close to each other.

@alessandrod (Author) replied:

lmao for some reason rustfmt put it there if I didn't add strategic newlines

@alessandrod (Author) added:

nope, looks like rustfmt wants it at the top or CI fails
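
For reference, the conditional imports in question have roughly this shape (simplified; the exact imported items in the PR differ):

    // Use shuttle's synchronization primitives when cfg(feature = "shuttle-test")
    // is set, otherwise use std.
    #[cfg(feature = "shuttle-test")]
    use shuttle::sync::{Arc, RwLock};
    #[cfg(not(feature = "shuttle-test"))]
    use std::sync::{Arc, RwLock};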

(Several more comment threads on runtime/src/shuttle_map.rs and runtime/src/status_cache.rs were resolved as outdated.)

Comment on lines +342 to +345:

    // Safety:
    // - we explicitly check that the blockhash isn't referenced by other threads.
    // Checking Arc::strong_count() is safe because retain() holds a write lock on
    // the shard and get_or_insert_with() calls Arc::clone() holding a read lock.

A reviewer commented:

🤓 there's an edge case we might mention here about Weak.

If another thread had downgraded to a Weak, it's possible that the strong_count below returns 1, they upgrade, and then the entry gets removed anyway.

We're not fucking around with Weak, but we might add a comment here noting that we don't do that.

Sometimes I wish there was a strong-only Arc
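
The invariant under discussion, sketched (a simplified stand-in, not the PR's code):

    use dashmap::DashMap;
    use std::{hash::Hash, sync::Arc};

    // retain() takes each shard's write lock, and readers clone the Arc only
    // while holding the shard's read lock, so strong_count == 1 inside retain
    // proves no other thread holds a reference. This only works because no
    // Weak handles are ever created: a Weak::upgrade() elsewhere could bump
    // the count right after it is read.
    fn drop_unreferenced_empty<K: Eq + Hash, V>(
        map: &DashMap<K, Arc<V>>,
        is_empty: impl Fn(&V) -> bool,
    ) {
        map.retain(|_, v| Arc::strong_count(v) > 1 || !is_empty(&**v));
    }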

@alessandrod (Author) replied:

done

alessandrod force-pushed the status-cache branch 3 times, most recently from 172b77c to 339d8ec, on September 11, 2025 18:37

@alessandrod (Author) commented:

Some notes I took while re-re-re-viewing the code:

clean_slot_entries(slot):

  • called in the replay loop from dump_then_repair_correct_slots -> purge_unconfirmed_duplicate_slot

  • dump_then_repair_correct_slots is called in the ReplayStage main loop

  • ReplayStage is single threaded, dump_then_repair_correct_slots is called
    after replay_active_banks; replay_active_banks waits for the replay
    scheduler to stop so by the time it returns there are no unified scheduler
    threads replaying entries

  • never called on a slot for which the current validator is leader according
    to leader schedule => can never race with banking

  • never cleans a rooted slot (by definition if it's rooted won't be dumped)

  • it can work concurrently with RPC... but I believe that interactions with
    RPC are already broken when we dump slots as per ReplayStage comment:

    // TODO: What about RPC queries that had already cloned the Bank for this slot
    // and are looking up the signature for this slot?
    root_bank.clear_slot_signatures(slot);

So effectively in clean_slot_entries(), the only thing to be mindful of is
that banking might be working with some of the blockhashes in
status_cache.cache that might be referenced by the txs we're removing from
the dumped slot. We use self.cache.remove_if_not_accessed_and(...is_empty()) to remove
a blockhash, which means that if banking has added (not empty) or is adding
(is accessed) txs we won't remove the blockhash.
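
A simplified stand-in for what remove_if_not_accessed_and(...) has to guarantee (the real implementation lives in ReadOptimizedDashMap; this sketch builds on a plain DashMap):

    use dashmap::DashMap;
    use std::{hash::Hash, sync::Arc};

    // Remove `key` only if no other thread currently holds the entry's Arc
    // ("not accessed") AND the caller's predicate passes (e.g. is_empty()).
    // remove_if holds the shard write lock across both checks and the removal,
    // so a concurrent get_or_insert cannot slip in between them.
    fn remove_if_not_accessed_and<K: Eq + Hash, V>(
        map: &DashMap<K, Arc<V>>,
        key: &K,
        pred: impl FnOnce(&V) -> bool,
    ) -> bool {
        map.remove_if(key, |_, v| Arc::strong_count(v) == 1 && pred(&**v))
            .is_some()
    }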

root_slot_deltas():

  • called from replay handle_votable_bank -> check_and_handle_new_root -> ... -> set_bank_forks_root
  • this too can't race with any other entry replay threads for the same reason as clean_slot_entries
  • it has exclusive access to status_cache.slot_delta[the roots it accesses] since roots can't be dumped
  • it does an Arc::clone of the contents (dashmap) of the roots from
    status_cache.slot_delta[the roots] and gives it to the snapshot thread
  • while the snapshot thread does its thing, purge_root() can remove old roots from
    status_cache.slot_delta, but that's ok: it only removes the values, it doesn't change
    the contents of the values at all (which is what the snapshot thread works with)

I don't think that there are any race concerns at all here.

add_root()/purge_roots()

  • called by replay like 3 lines after root_slot_deltas()
  • needs to be careful not to remove blockhashes touched by replay/banking.

Since MAX_PROCESSING_AGE = MAX_RECENT_BLOCKHASHES / 2, by the time we
purge a root all the blockhashes referenced have expired, except nonces. We
must make sure we don't remove nonces that have been used past the root
we're purging, and we do so with
key_in_use || max_slot.load(Ordering::Relaxed) > min
where min is the root slot we're purging and max_slot is the last slot a
blockhash was referenced in.

Additionally, nonces have the nonce account data check in place that prevents replay, so I think we're all good here.
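
In code form, the retention test during purge boils down to this (names approximate the PR's):

    use std::sync::atomic::{AtomicU64, Ordering};

    // When purging roots <= `min`, keep a blockhash entry if some thread is
    // using it right now, or if it was last referenced in a slot newer than
    // the root being purged (the nonce case described above).
    fn keep_on_purge(key_in_use: bool, max_slot: &AtomicU64, min: u64) -> bool {
        key_in_use || max_slot.load(Ordering::Relaxed) > min
    }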

insert(blockhash, tx, slot)

  • called by the entries replay threads and by banking
  • can't run concurrently with clear_slot_entries(), see above
  • can't touch the same status_cache.slot_delta[slot] as purge_roots()
    because roots can't be inserted into
  • does status_cache.slot_delta.get_or_insert(slot).get_or_insert(blockhash).push(tx) (boxcar vec; see the sketch after this list)
  • does status_cache.cache.get_or_insert(blockhash).txs.entry(tx).forks.push(slot) (regular vec)
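
Why the boxcar vec matters here: boxcar::Vec is append-only and push takes &self, so concurrent inserts under the same slot/blockhash need no write lock. A toy demo (not PR code):

    // boxcar::Vec supports concurrent appends through a shared reference.
    fn boxcar_concurrent_push_demo() {
        use std::sync::Arc;

        let v = Arc::new(boxcar::Vec::new());
        let handles: Vec<_> = (0..4u64)
            .map(|i| {
                let v = Arc::clone(&v);
                // push(&self, _) returns the element's index; no &mut needed.
                std::thread::spawn(move || v.push(i))
            })
            .collect();
        for h in handles {
            h.join().unwrap();
        }
        assert_eq!(v.count(), 4);
    }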

So effectively, what can happen concurrently:

  • purge_roots() removing blockhashes that replay/banking are inserting under
    (these can only be nonces, see above)
  • replay/banking inserting concurrently under the same blockhashes (very common)
  • check_transactions() called by scheduler/banking workers/replay threads
    while insert() is in progress (also common)

Because of the serialization forced by account locks, check_transactions(tx) and insert(tx) can
never be called concurrently when it matters (load_and_execute_transactions() in replay or
process_and_record_transactions() in the banking workers).

@jstarry commented Sep 15, 2025

> So effectively in clean_slot_entries(), the only thing to be mindful of is
> that banking might be working with some of the blockhashes in
> status_cache.cache that might be referenced by the txs we're removing from
> the dumped slot. We use self.cache.remove_if_not_accessed_and(...is_empty()) to remove
> a blockhash, which means that if banking has added (not empty) or is adding
> (is accessed) txs we won't remove the blockhash.

Yeah, I think you described this race condition correctly and the new method remove_if_not_accessed_and looks like a solid fix without needing a read guard. Awesome!

github-actions bot commented:

This pull request is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions bot added the stale label on Jan 12, 2026

github-actions bot commented:

This pull request was closed because it has been stale for 7 days with no activity.

github-actions bot closed this on Jan 29, 2026
alessandrod reopened this on Jan 29, 2026
github-actions bot removed the stale label on Jan 30, 2026

Commit message:

Remove the global RwLock around the status cache, and introduce more
granular RwLocks per-blockhash and per-slot. Additionally, change the
internal hash tables from std HashMap to DashMap, so that operations at
the blockhash and slot level can be done while holding only read locks.

Copilot AI left a comment:

Pull request overview

This PR refactors the runtime status cache to reduce contention by removing the global RwLock and introducing concurrent/sharded data structures (DashMap + per-entry Arc-based access), aiming to speed up transaction status checks and updates.

Changes:

  • Replaced the status cache’s global lock + HashMap layout with ReadOptimizedDashMap/DashMap/DashSet and per-blockhash/per-slot granularity.
  • Updated snapshot (de)serialization paths and bank call sites to use the new non-RwLock status cache API.
  • Introduced a concurrent append-only vector (ConcurrentVec via boxcar) and added boxcar/DashMap serde feature to dependencies.

Reviewed changes

Copilot reviewed 11 out of 13 changed files in this pull request and generated 5 comments.

Summary per file:

  • runtime/src/status_cache.rs - Core refactor: concurrent cache + slot delta structures, new insertion/clearing/purging logic, new ConcurrentVec.
  • runtime/src/bank.rs - Bank now holds Arc<BankStatusCache> directly (no RwLock); updated status-cache call sites.
  • runtime/src/bank/check_transactions.rs - Uses the new status cache API directly (no read lock).
  • runtime/src/snapshot_controller.rs - Uses the new status cache API when preparing snapshot requests.
  • runtime/src/snapshot_bank_utils.rs - Uses the new status cache API for snapshot restore/create (some call sites still need updating).
  • runtime/src/serde_snapshot/status_cache.rs - Adjusted snapshot serialization/deserialization for the new Status representation.
  • runtime/src/read_optimized_dashmap.rs - Tweaks to the wrapper; adds a test-only helper around retain behavior.
  • core/tests/snapshots.rs - Updated one status cache usage to the new API (other call sites still need updating).
  • runtime/benches/status_cache.rs - Bench updates for the new API surface (no lock guard).
  • runtime/Cargo.toml - Adds the boxcar dependency to the runtime crate.
  • Cargo.toml - Adds boxcar to workspace deps; enables the DashMap serde feature.
  • Cargo.lock - Lockfile update for the new dependency.
  • programs/sbf/Cargo.lock - Lockfile update for the new dependency in the SBF workspace.


Comment thread on core/tests/snapshots.rs
Comment on lines 316 to 318:

        .status_cache
        .read()
        .unwrap()
        .roots()
        .iter()
        .cloned()
        .sorted();

Copilot AI commented Feb 23, 2026:

There is still a bank.status_cache.read().unwrap().root_slot_deltas() usage later in this file (in the fastboot snapshot test). With Bank::status_cache no longer behind an RwLock, this remaining call site should be updated to bank.status_cache.root_slot_deltas() to keep the test compiling.
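
i.e., the change at that call site is roughly:

  - bank.status_cache.read().unwrap().root_slot_deltas()
  + bank.status_cache.root_slot_deltas()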

Comment on lines +962 to +965:

    // #[test]
    // fn test_shuttle_purge_nonce_overlap_dfs() {
    //     shuttle::check_dfs(do_test_shuttle_purge_nonce_overlap, PURGE_DFS_ITERATIONS);
    // }

Copilot AI commented Feb 23, 2026:

The DFS variant of the purge nonce overlap shuttle test has been commented out. Since this PR changes status cache concurrency behavior, disabling this stress test reduces coverage for race conditions around add_root()/purging. Consider keeping it enabled (or gating it behind an opt-in flag) and tracking the shuttle runtime crash separately.

Suggested change:

  - // #[test]
  - // fn test_shuttle_purge_nonce_overlap_dfs() {
  - //     shuttle::check_dfs(do_test_shuttle_purge_nonce_overlap, PURGE_DFS_ITERATIONS);
  - // }
  + #[cfg(feature = "shuttle-dfs-tests")]
  + #[test]
  + fn test_shuttle_purge_nonce_overlap_dfs() {
  +     shuttle::check_dfs(do_test_shuttle_purge_nonce_overlap, PURGE_DFS_ITERATIONS);
  + }

Comment on lines +302 to +305:

    // Safety: we just initialized the whole key_slice above
    let mut forks = txs.entry(unsafe { key_slice.assume_init() }).or_default();
    forks.push((slot, res.clone()));

Copilot AI commented Feb 23, 2026:

key_slice.assume_init() is called here and then called again later when passing key_slice into add_to_slot_delta(). assume_init() moves the [u8; CACHED_KEY_SIZE] out of the MaybeUninit, so using it twice is undefined behavior (second call reads moved/uninitialized memory). Fix by materializing the initialized array exactly once (e.g., assign to a local let key_slice = ...;) and then reuse/copy it for both insertions.

Comment on lines +124 to +127:

    // FIXME: this can happen with RPC doing get_status or whatever
    panic!(
        "slot {slot} is being cleared while another thread is inserting new entries"
    );

Copilot AI commented Feb 23, 2026:

Panicking on remove_if_not_accessed() failure will crash the node whenever another thread is holding an Arc to this slot's delta map (this includes readers, not just inserters). Since concurrent access is possible in production (e.g., snapshot/RPC readers), this should be handled gracefully (retry, skip, or defer clearing) instead of panic!.

Suggested change:

  - // FIXME: this can happen with RPC doing get_status or whatever
  - panic!(
  -     "slot {slot} is being cleared while another thread is inserting new entries"
  - );
  + // FIXME: this can happen with RPC doing get_status or whatever. Another thread may
  + // be holding an `Arc` to this slot's delta map (including read-only access).
  + // In that case, we skip clearing this slot for now to avoid crashing the node.
  + #[cfg(test)]
  + panic!(
  +     "slot {slot} is being cleared while another thread is inserting or accessing entries"
  + );
  + #[cfg(not(test))]
  + return;

    verify_slot_deltas(slot_deltas.as_slice(), &bank)?;

  - bank.status_cache.write().unwrap().append(&slot_deltas);
  + bank.status_cache.append(&slot_deltas);

Copilot AI commented Feb 23, 2026:

This module still contains at least one bank.status_cache.read().unwrap() call in the test helper create_bank_snapshot_from_bank() (later in this file). Since Bank::status_cache is now Arc<BankStatusCache> (not an RwLock), that call site needs to be updated to use bank.status_cache.root_slot_deltas() directly to avoid compilation failures.

github-actions bot commented:

This pull request is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions bot added the stale label on Apr 24, 2026

github-actions bot commented May 4, 2026:

This pull request was closed because it has been stale for 7 days with no activity.

github-actions bot closed this on May 4, 2026

Successfully merging this pull request may close these issues: Performance: Status Cache

9 participants