
prioritization fee cache: remove lru crate #30

Merged
CriesofCarrots merged 1 commit into anza-xyz:master from fanatid:getRPF-opt
Mar 27, 2024

Conversation


@fanatid fanatid commented Mar 3, 2024

Moved from solana-labs#35228

Problem

lru::LruCache requires write lock for any action.
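For illustration only (not code from this PR): lru::LruCache::get takes &mut self because it updates the recency order, so even a read-only lookup through an RwLock must acquire the write lock. A minimal sketch, assuming the external lru crate's API:

```rust
use std::{num::NonZeroUsize, sync::RwLock};
use lru::LruCache; // external `lru` crate

// Illustrative only: LruCache::get takes &mut self (it updates recency
// order), so even a read-only lookup must acquire the write lock.
fn lookup(cache: &RwLock<LruCache<u64, u64>>, slot: u64) -> Option<u64> {
    cache.write().unwrap().get(&slot).copied()
}

fn main() {
    let cache = RwLock::new(LruCache::new(NonZeroUsize::new(4).unwrap()));
    cache.write().unwrap().put(1u64, 42u64);
    assert_eq!(lookup(&cache, 1), Some(42));
}
```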

Summary of Changes

Use a BTreeMap instead of LruCache. The write lock is then required only when finalizing a slot, and the BTreeMap's key ordering makes it easy to remove the oldest slot.
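As a rough sketch of the idea (hypothetical types and names, not the PR's actual implementation): a slot-keyed BTreeMap whose smallest key is always the oldest slot, so capacity control amounts to popping the first entry.

```rust
use std::collections::BTreeMap;

type Slot = u64;

/// Hypothetical sketch: a slot-keyed cache with manual capacity control.
/// Keys are ordered, so the oldest slot is always the first entry.
struct SlotCache<T> {
    entries: BTreeMap<Slot, T>,
    capacity: usize,
}

impl<T> SlotCache<T> {
    fn new(capacity: usize) -> Self {
        Self { entries: BTreeMap::new(), capacity }
    }

    /// Insert a finalized slot; evict the oldest slot(s) if over capacity.
    fn insert(&mut self, slot: Slot, value: T) {
        self.entries.insert(slot, value);
        while self.entries.len() > self.capacity {
            // pop_first removes the smallest (oldest) slot.
            let _evicted = self.entries.pop_first();
        }
    }

    /// Reads need no LRU bookkeeping, so they can sit behind a read lock.
    fn get(&self, slot: Slot) -> Option<&T> {
        self.entries.get(&slot)
    }
}

fn main() {
    let mut cache = SlotCache::new(2);
    cache.insert(10, "fees for slot 10");
    cache.insert(11, "fees for slot 11");
    cache.insert(12, "fees for slot 12"); // evicts slot 10
    assert!(cache.get(10).is_none());
    assert!(cache.get(12).is_some());
}
```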

@mergify mergify Bot requested a review from a team March 3, 2024 17:46
@CriesofCarrots CriesofCarrots added the CI Pull Request is ready to enter CI label Mar 13, 2024
@anza-team anza-team removed the CI Pull Request is ready to enter CI label Mar 13, 2024
@CriesofCarrots

Looks like this needs a ./cargo-for-all-lock-files.sh tree run to get further in CI

Author

fanatid commented Mar 13, 2024

Updated

@CriesofCarrots CriesofCarrots added the CI Pull Request is ready to enter CI label Mar 13, 2024
@anza-team anza-team removed the CI Pull Request is ready to enter CI label Mar 13, 2024
@CriesofCarrots CriesofCarrots added the CI Pull Request is ready to enter CI label Mar 13, 2024
@anza-team anza-team removed the CI Pull Request is ready to enter CI label Mar 13, 2024
@tao-stones

Can you benchmark before/after the change for comparison?

This would run the existing bench; if you find these benches are outdated, please feel free to update them 😄
./cargo nightly bench --manifest-path runtime/Cargo.toml -- -Z unstable-options bench_process_transactions --ignored

Author

fanatid commented Mar 14, 2024

master 151675b

running 2 tests
test bench_process_transactions_multiple_slots ... bench:  13,967,987 ns/iter (+/- 1,213,641)
test bench_process_transactions_single_slot    ... bench:   2,059,505 ns/iter (+/- 411,210)

PR 3030611

running 2 tests
test bench_process_transactions_multiple_slots ... bench:  12,907,941 ns/iter (+/- 1,044,988)
test bench_process_transactions_single_slot    ... bench:   1,339,063 ns/iter (+/- 239,231)

But I do not think this shows a real improvement, because we only parse transactions and send the results to the background update thread.
I am not sure how to benchmark the background thread, so I added std output (where cache_lock_time is a few times lower).

master

running 1 test
slot_finalize_time 128950us, cache_lock_time 7069us
test bench_process_transactions_single_slot    ... bench:   1,704,378 ns/iter (+/- 186,777)

PR

running 1 test
slot_finalize_time 119721us, cache_lock_time 1765us
test bench_process_transactions_single_slot    ... bench:   1,592,675 ns/iter (+/- 249,831)

diff:

diff --git a/runtime/benches/prioritization_fee_cache.rs b/runtime/benches/prioritization_fee_cache.rs
index 8c6bf1fe0a..f97691625d 100644
--- a/runtime/benches/prioritization_fee_cache.rs
+++ b/runtime/benches/prioritization_fee_cache.rs
@@ -44,22 +44,29 @@ fn build_sanitized_transaction(
 fn bench_process_transactions_single_slot(bencher: &mut Bencher) {
     let prioritization_fee_cache = PrioritizationFeeCache::default();
 
-    let bank = Arc::new(Bank::default_for_tests());
+    let GenesisConfigInfo { genesis_config, .. } = create_genesis_config(10_000);
+    let bank0 = Bank::new_for_benches(&genesis_config);
+    let bank_forks = BankForks::new_rw_arc(bank0);
+    let bank = bank_forks.read().unwrap().working_bank();
+    let collector = solana_sdk::pubkey::new_rand();
+    let mut n = 0;
+    bencher.iter(move || {
+        n += 1;
+        let bank = Bank::new_from_parent(bank.clone(), &collector, n);
 
-    // build test transactions
-    let transactions: Vec<_> = (0..5000)
-        .map(|n| {
-            let compute_unit_price = n % 7;
-            build_sanitized_transaction(
-                compute_unit_price,
-                &Pubkey::new_unique(),
-                &Pubkey::new_unique(),
-            )
-        })
-        .collect();
+        let transactions = (0..500)
+            .map(|n| {
+                let compute_unit_price = n % 7;
+                build_sanitized_transaction(
+                    compute_unit_price,
+                    &Pubkey::new_unique(),
+                    &Pubkey::new_unique(),
+                )
+            })
+            .collect::<Vec<_>>();
 
-    bencher.iter(|| {
         prioritization_fee_cache.update(&bank, transactions.iter());
+        prioritization_fee_cache.finalize_priority_fee(bank.slot(), bank.bank_id());
     });
 }
 
diff --git a/runtime/src/prioritization_fee_cache.rs b/runtime/src/prioritization_fee_cache.rs
index 0490f59445..176a7ba405 100644
--- a/runtime/src/prioritization_fee_cache.rs
+++ b/runtime/src/prioritization_fee_cache.rs
@@ -44,6 +44,9 @@ struct PrioritizationFeeCacheMetrics {
 
     // Accumulated time spent on finalizing block prioritization fees.
     total_block_finalize_elapsed_us: AtomicU64,
+
+    finalized: AtomicU64,
+    cache: AtomicU64,
 }
 
 impl PrioritizationFeeCacheMetrics {
@@ -65,6 +68,7 @@ impl PrioritizationFeeCacheMetrics {
     fn accumulate_total_cache_lock_elapsed_us(&self, val: u64) {
         self.total_cache_lock_elapsed_us
             .fetch_add(val, Ordering::Relaxed);
+        self.cache.fetch_add(val, Ordering::Relaxed);
     }
 
     fn accumulate_total_entry_update_elapsed_us(&self, val: u64) {
@@ -75,6 +79,7 @@ impl PrioritizationFeeCacheMetrics {
     fn accumulate_total_block_finalize_elapsed_us(&self, val: u64) {
         self.total_block_finalize_elapsed_us
             .fetch_add(val, Ordering::Relaxed);
+        self.finalized.fetch_add(val, Ordering::Relaxed);
     }
 
     fn report(&self, slot: Slot) {
@@ -161,6 +166,7 @@ impl Drop for PrioritizationFeeCache {
             .unwrap()
             .join()
             .expect("Prioritization fee cache servicing thread failed to join");
+        println!("slot_finalize_time {}us, cache_lock_time {}us", self.metrics.finalized.load(Ordering::Relaxed), self.metrics.cache.load(Ordering::Relaxed));
     }
 }


@tao-stones tao-stones left a comment


Overall a good change!

Comment thread runtime/src/prioritization_fee_cache.rs Outdated
Comment thread runtime/src/prioritization_fee.rs
Comment thread runtime/src/prioritization_fee_cache.rs Outdated
Comment thread runtime/src/prioritization_fee_cache.rs Outdated
Comment thread runtime/src/prioritization_fee_cache.rs
Comment thread runtime/src/prioritization_fee_cache.rs Outdated
@tao-stones

so I added std output (where cache_lock_time is a few times lower)

Great to see a significant decrease in cache_lock_time!

Do you mind adding entry_update_time to the std output along with cache_lock_time and slot_finalize_time, to get a better picture? Thank you.

Author

fanatid commented Mar 14, 2024

master:

running 1 test
cache 7620us, update: 158405us, finalize 128463us
test bench_process_transactions_single_slot    ... bench:   1,747,303 ns/iter (+/- 323,183)

PR

running 1 test
cache 1527us, update: 392736us, finalize 116258us
test bench_process_transactions_single_slot    ... bench:   1,533,384 ns/iter (+/- 213,986)

PR after bf3cae6 (update per tx instead of per bank)

running 1 test
cache 1785us, update: 149256us, finalize 107472us
test bench_process_transactions_single_slot    ... bench:   1,674,978 ns/iter (+/- 414,933)

but honestly entry_update_time is very unstable here; I have seen the numbers change too much for both master and the updated PR, sometimes by a factor of 2:

running 1 test
cache 1785us, update: 149256us, finalize 107472us
test bench_process_transactions_single_slot    ... bench:   1,674,978 ns/iter (+/- 414,933)
running 1 test
cache 3929us, update: 318381us, finalize 243182us
test bench_process_transactions_single_slot    ... bench:   1,690,419 ns/iter (+/- 191,294)

I'm not sure whether it would be better to send an update per tx or to collect and send per bank.


@t-nelson t-nelson left a comment


this seems to be making several independent changes simultaneously. we generally reject these in favor of one pr per change. i'd highly recommend breaking it up

Comment thread runtime/src/prioritization_fee_cache.rs Outdated
@tao-stones

this seems to be making several independent changes simultaneously. we generally reject these in favor of one pr per change. i'd highly recommend breaking it up

Yeah, I agree it'd be better (at least easier to review) if this can be broken into separate PRs, perhaps:

  • replace LruCache with BTreeMap with manual capacity control
  • introduce an unfinalized cache in the receiver thread
  • batch updates

Author

fanatid commented Mar 16, 2024

LruCache and the unfinalized cache are related to each other; the code would need to be adjusted again after each PR. While it's possible, and I moved unfinalized to #272, it doesn't make much sense; I already split the getRPF changes into 2 PRs (this and #217).

The batch update was removed after the benchmark.

BTreeMap::retain doesn't make sense; it would be better to switch to a HashMap in that case. BTreeMap was used because we iterate in order and can break early; otherwise a HashMap would be faster. I made that change in #272.

@CriesofCarrots

BTreeMap::retain doesn't make sense; it would be better to switch to a HashMap in that case. BTreeMap was used because we iterate in order and can break early; otherwise a HashMap would be faster. I made that change in #272.

Did you read the entirety of my comment? split_off would take advantage of the ordering and not require visiting every element.
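For reference, a minimal sketch of what split_off-based pruning could look like (hypothetical function, not code from this PR): because the keys are ordered, everything at or above a cutoff slot can be detached in one call, dropping the older entries without visiting every element.

```rust
use std::collections::BTreeMap;

type Slot = u64;

// Hypothetical sketch: prune all slots older than `cutoff` by exploiting
// BTreeMap's key ordering instead of scanning every entry with retain.
fn prune_old_slots<T>(cache: &mut BTreeMap<Slot, T>, cutoff: Slot) {
    // split_off returns the entries with keys >= cutoff; replacing the
    // map drops the older entries that were left behind.
    *cache = cache.split_off(&cutoff);
}

fn main() {
    let mut cache: BTreeMap<Slot, &str> = (0..10).map(|s| (s, "fees")).collect();
    prune_old_slots(&mut cache, 7);
    assert_eq!(cache.keys().copied().collect::<Vec<_>>(), vec![7, 8, 9]);
}
```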

@fanatid fanatid changed the title from "Optimize prioritization fee cache" to "prioritization fee cache: remove lru crate" Mar 27, 2024
Author

fanatid commented Mar 27, 2024

@tao-stones updated

@CriesofCarrots

Can you please rebase on master and clean up the commit history? We are generally opposed to merge commits. Also, it looks like the description needs updating.

@codecov-commenter

Codecov Report

Attention: Patch coverage is 95.34884% with 2 lines in your changes missing coverage. Please review.

Project coverage is 81.9%. Comparing base (80d3200) to head (f815a5c).

Additional details and impacted files
@@           Coverage Diff           @@
##           master      #30   +/-   ##
=======================================
  Coverage    81.8%    81.9%           
=======================================
  Files         841      841           
  Lines      228242   228246    +4     
=======================================
+ Hits       186923   186940   +17     
+ Misses      41319    41306   -13     

@CriesofCarrots CriesofCarrots merged commit ba9c25c into anza-xyz:master Mar 27, 2024
@fanatid fanatid deleted the getRPF-opt branch March 27, 2024 19:07
OliverNChalk pushed a commit to OliverNChalk/agave that referenced this pull request Nov 11, 2025
