accounts-db: Benchmark cache evictions #4045
base: master
Conversation
Force-pushed from 100c57d to 8206e16.
Looks great! Left some comments
@@ -137,7 +137,7 @@ impl ReadOnlyAccountsCache {
        }
    }

-    pub(crate) fn load(&self, pubkey: Pubkey, slot: Slot) -> Option<AccountSharedData> {
+    pub fn load(&self, pubkey: Pubkey, slot: Slot) -> Option<AccountSharedData> {
nit: here and in store, make these public only under the dev-context-only-utils
feature. When reading code there's value in knowing whether something is
actually used by the validator or if it's being exported for tests/benches.
Honest dumb question - how?
If this was a regular function, I could simply do something like:
#[cfg(feature = "dev-context-only-utils")]
pub use load;
But for the methods, the only way I can imagine is:
impl ReadOnlyAccountsCache {
    #[cfg(feature = "dev-context-only-utils")]
    pub fn load(&self, pubkey: Pubkey, slot: Slot) -> Option<AccountSharedData> {
        self._load(pubkey, slot)
    }
    #[cfg(not(feature = "dev-context-only-utils"))]
    fn load(&self, pubkey: Pubkey, slot: Slot) -> Option<AccountSharedData> {
        self._load(pubkey, slot)
    }
    fn _load(&self, pubkey: Pubkey, slot: Slot) -> Option<AccountSharedData> {
        // do stuff
    }
}
Is there any other less dumb trick I'm missing?
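One less verbose alternative is a sketch along these lines, assuming the `qualifier_attr` crate as a dependency; its `qualifiers` attribute rewrites an item's visibility and composes with `cfg_attr`:

```rust
use qualifier_attr::qualifiers;

impl ReadOnlyAccountsCache {
    // Becomes `pub` only when the feature is enabled; otherwise the
    // method keeps its default private visibility. No wrapper needed.
    #[cfg_attr(feature = "dev-context-only-utils", qualifiers(pub))]
    fn load(&self, pubkey: Pubkey, slot: Slot) -> Option<AccountSharedData> {
        // do stuff
        todo!()
    }
}
```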
@@ -0,0 +1,208 @@
#![feature(test)]
nit: some benches have the bench_ prefix in the filename, some don't. I'd drop it.
));

// Prepare accounts for the cache fillup.
let pubkeys: Vec<_> = std::iter::repeat_with(solana_sdk::pubkey::new_rand)
nit: add `use` imports for solana_sdk and std at the top, don't use full paths
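A sketch of what the nit asks for (`num_accounts` is just an illustrative binding):

```rust
use {solana_sdk::pubkey, std::iter};

let num_accounts = 1000;
let pubkeys: Vec<_> = iter::repeat_with(pubkey::new_rand)
    .take(num_accounts)
    .collect();
```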
/// Benchmarks the read-only cache eviction mechanism. It does so by performing
/// multithreaded reads and writes on a full cache. Each write triggers
/// eviction. Background reads add more contention.
This is a great bench, but it tests the worst case right? Do we have benches for the cases when we have no evictions, and also for when we have a "normal" amount of evictions? Normal being X every 500ms or so? Something that approximates the current eviction rate on mnb.
We want to test the worst case, and we want to improve perf with higher throughput than current mnb, but we also want to ensure we don't accidentally regress with current load.
> This is a great bench, but it tests the worst case right? Do we have benches for the cases when we have no evictions, and also for when we have a "normal" amount of evictions? Normal being X every 500ms or so? Something that approximates the current eviction rate on mnb.
No, we don't. This is the only bench which triggers any eviction at all. I will add a case closer to the current mnb numbers.
Well, the "no eviction" case is handled here -
agave/accounts-db/benches/accounts.rs, lines 126 to 181 in 8c3a6bb:
fn store_accounts_with_possible_contention<F>(bench_name: &str, bencher: &mut Bencher, reader_f: F)
where
    F: Fn(&Accounts, &[Pubkey]) + Send + Copy + 'static,
{
    let num_readers = 5;
    let accounts_db = new_accounts_db(vec![PathBuf::from(
        std::env::var("FARF_DIR").unwrap_or_else(|_| "farf".to_string()),
    )
    .join(bench_name)]);
    let accounts = Arc::new(Accounts::new(Arc::new(accounts_db)));
    let num_keys = 1000;
    let slot = 0;
    let pubkeys: Vec<_> = std::iter::repeat_with(solana_sdk::pubkey::new_rand)
        .take(num_keys)
        .collect();
    let accounts_data: Vec<_> = std::iter::repeat(
        Account {
            lamports: 1,
            ..Default::default()
        }
        .to_account_shared_data(),
    )
    .take(num_keys)
    .collect();
    let storable_accounts: Vec<_> = pubkeys.iter().zip(accounts_data.iter()).collect();
    accounts.store_accounts_cached((slot, storable_accounts.as_slice()));
    accounts.add_root(slot);
    accounts
        .accounts_db
        .flush_accounts_cache_slot_for_tests(slot);
    let pubkeys = Arc::new(pubkeys);
    for i in 0..num_readers {
        let accounts = accounts.clone();
        let pubkeys = pubkeys.clone();
        Builder::new()
            .name(format!("reader{i:02}"))
            .spawn(move || {
                reader_f(&accounts, &pubkeys);
            })
            .unwrap();
    }
    let num_new_keys = 1000;
    bencher.iter(|| {
        let new_pubkeys: Vec<_> = std::iter::repeat_with(solana_sdk::pubkey::new_rand)
            .take(num_new_keys)
            .collect();
        let new_storable_accounts: Vec<_> = new_pubkeys.iter().zip(accounts_data.iter()).collect();
        // Write to a different slot than the one being read from. Because
        // there's a new account pubkey being written to every time, will
        // compete for the accounts index lock on every store
        accounts.store_accounts_cached((slot + 1, new_storable_accounts.as_slice()));
    });
}
But I'm not a big fan of it, because:
- It does `iter().collect()` inside the bench.
- It adds only 1000 accounts.
- It doesn't use criterion. 😛

So I think it makes sense to bench that case in this file.
I don't have opinions here, up to you and accounts-db peeps to decide whether to improve this bench or consolidate with the new ones you're writing!
Account {
    lamports: 1,
    // 1 MiB
    data: vec![1; 1024 * 1024],
I think you want to do these allocations outside the bench, otherwise this is what takes the most time.
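A sketch of the suggested shape (`cache`, `pubkey`, and `slot` follow the names in the surrounding bench, and the `store` call is assumed to mirror the `load` signature shown earlier; `AccountSharedData` keeps its data behind an `Arc`, so the clone inside the closure doesn't reallocate the 1 MiB buffer):

```rust
// Build the large account once, outside the measured closure.
let account = AccountSharedData::from(Account {
    lamports: 1,
    // 1 MiB
    data: vec![1; 1024 * 1024],
    ..Default::default()
});
bencher.iter(|| {
    // Cheap Arc clone; the allocation no longer dominates the measurement.
    cache.store(pubkey, slot, account.clone());
});
```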
Also, do they have to be 1MB? This is a contention bench, right? The size of the accounts shouldn't matter.
Good point with the allocation, will move it outside the bench.
Regarding the size - the size itself doesn't matter, but a bigger size triggers eviction faster. The default eviction threshold is 410 * 1024 * 1024 (410 MiB), and CACHE_ENTRY_SIZE is 144. If you're benching without account data, you need 2985529 accounts to fill up the cache. This is the reason why the currently existing bench isn't anywhere close to filling up the cache - it creates 1000 accounts without data.
I'm open to picking another size which is closer to the average on mnb, I'll think about it today.
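A quick check of that arithmetic (constant names are illustrative; values as quoted above):

```rust
// 410 MiB eviction threshold = 429_916_160 bytes.
const MAX_CACHE_SIZE: u64 = 410 * 1024 * 1024;
// Per-entry overhead of a zero-data account.
const CACHE_ENTRY_SIZE: u64 = 144;

fn main() {
    // 429_916_160 / 144 ≈ 2_985_528.9, rounded up to 2_985_529 accounts.
    assert_eq!(MAX_CACHE_SIZE.div_ceil(CACHE_ENTRY_SIZE), 2_985_529);
}
```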
OK, the average size calculated in the most silly way would be:

>>> 265633981325 / 744678683
356.7095277319762

356 bytes, lol. That makes sense, given that most accounts on mnb are token accounts. In that case, I'm not sure whether we should go with some size like 256 or 512 bytes, or just skip the size altogether and bench with >2985529 empty accounts. Leaning towards the latter.
We should also bench the 0 data case, as that gives us the worst case in terms of [number of things we store]. So I'd say 0 and something reasonable, 512b or 1K.
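A sketch of how both sizes could live in one parameterized criterion bench (the bench name and the measured body are placeholders; the real bench would store into the cache, as discussed above):

```rust
use {
    criterion::{black_box, criterion_group, criterion_main, Criterion},
    solana_sdk::account::{Account, AccountSharedData},
};

fn bench_data_sizes(c: &mut Criterion) {
    // 0 bytes maximizes the number of entries stored; 512 bytes approximates mnb.
    for data_size in [0usize, 512] {
        // Allocate outside the timed closure, per the earlier comment.
        let account = AccountSharedData::from(Account {
            lamports: 1,
            data: vec![1; data_size],
            ..Default::default()
        });
        c.bench_function(&format!("eviction_data_size_{data_size}"), |b| {
            // Placeholder body: stand-in for storing into the cache.
            b.iter(|| black_box(account.clone()))
        });
    }
}

criterion_group!(benches, bench_data_sizes);
criterion_main!(benches);
```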
let cache = cache.clone();
let new_pubkeys = new_pubkeys[i].clone();

Builder::new()
I think you should pre-create the writers outside the bench, and make them sleep on a https://doc.rust-lang.org/stable/std/sync/struct.Barrier.html
Then in the bench you should trigger the barrier, then wait for all the writers to terminate. Otherwise you're adding thread creation time to your bench, which will get slower the more threads you add.
I actually don't think you can use the barrier, because iter() will want to repeat this many times. You'll need to get a little creative :P
ah no nevermind, barriers can be used more than once, nice
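A sketch of the pre-created-writers shape that came out of this thread (names are illustrative; `bencher` is the test-crate bencher this file already uses). Two reusable barriers: one releases the writers, the other waits for them:

```rust
use std::{
    hint::black_box,
    sync::{Arc, Barrier},
    thread,
};

let num_writers = 4;
// One extra slot on each barrier for the benchmark thread itself.
let start = Arc::new(Barrier::new(num_writers + 1));
let done = Arc::new(Barrier::new(num_writers + 1));

// Writers are spawned once and live for the whole bench run.
for _ in 0..num_writers {
    let start = Arc::clone(&start);
    let done = Arc::clone(&done);
    thread::spawn(move || loop {
        start.wait(); // sleep until the next bench iteration releases us
        // Stand-in for the real work: store accounts, triggering evictions.
        black_box(());
        done.wait(); // report that this iteration's writes are finished
    });
}

bencher.iter(|| {
    start.wait(); // release all pre-created writers at once
    done.wait(); // wait for every writer to finish this round
});
```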
Force-pushed from 8206e16 to 2766162.
The already existing `concurrent_{read,scan}_write` benchmarks are not sufficient for benchmarking eviction and evaluating which eviction policy performs best, because they don't fill up the cache, so eviction never happens. Add a new benchmark, which starts measuring the concurrent reads and writes only once the cache is full.
Force-pushed from 2766162 to 42ca375.
The already existing
concurrent_{read,scan}_write
benchmarks are not sufficient for benchmarking the eviction and evaluating what kind of eviction policy performs the best, because they don't fill up the cache, so eviction never happens.Add a new benchmark, which starts measuring the concurrent reads and writes on a full cache.