-
Notifications
You must be signed in to change notification settings - Fork 201
perf(l1): add bloom filter to skip trie seeks for non-existent storage slots #6288
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
f3b6e77
1629d73
15133f8
f4b9bce
1088c5b
b722ab1
b19f9a6
46110af
cb95ce9
713517a
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,137 @@ | ||
| use std::fmt; | ||
| use std::sync::OnceLock; | ||
| use std::sync::atomic::{AtomicBool, Ordering}; | ||
|
|
||
| use ethrex_common::{Address, H256}; | ||
| use fastbloom::AtomicBloomFilter; | ||
| use rustc_hash::FxBuildHasher; | ||
|
|
||
| const FALSE_POSITIVE_RATE: f64 = 0.01; | ||
|
|
||
| /// Bloom filter that tracks which (address, storage_key) pairs have non-zero | ||
| /// storage values. Used to skip expensive trie lookups for slots that were | ||
| /// never written to. | ||
| /// | ||
| /// The filter is allocated lazily on first `insert()` to avoid ~240MB of | ||
| /// upfront memory when the bloom is never used (e.g., dev mode, testnets). | ||
| pub struct StorageBloomFilter { | ||
| filter: OnceLock<AtomicBloomFilter<FxBuildHasher>>, | ||
| capacity: usize, | ||
| enabled: AtomicBool, | ||
| } | ||
|
|
||
| impl fmt::Debug for StorageBloomFilter { | ||
| fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { | ||
| f.debug_struct("StorageBloomFilter").finish() | ||
| } | ||
| } | ||
|
|
||
| impl StorageBloomFilter { | ||
| pub fn new(capacity: usize) -> Self { | ||
| Self { | ||
| filter: OnceLock::new(), | ||
| capacity, | ||
| enabled: AtomicBool::new(false), | ||
| } | ||
| } | ||
|
|
||
| /// Activate the bloom filter after it has been populated. | ||
| /// Before this is called, `might_contain` always returns `true` (pass-through). | ||
| /// | ||
| /// # Precondition | ||
| /// | ||
| /// The filter MUST have been fully populated (via `insert`) for ALL | ||
| /// (address, storage_key) pairs that exist in the trie before this is | ||
| /// called. This includes genesis slots, snap-synced data, and all slots | ||
| /// written during block processing. Calling `enable()` prematurely will | ||
| /// cause false negatives that silently corrupt storage reads. | ||
| #[allow(dead_code)] | ||
| pub fn enable(&self) { | ||
| self.enabled.store(true, Ordering::Release); | ||
| } | ||
|
Comment on lines
+49
to
+51
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Consider adding a doc-comment warning directly on /// Activate the bloom filter.
///
/// # Safety / correctness
/// The filter MUST have been fully populated (via `insert`) for all
/// (address, storage_key) pairs that exist in the trie before this is called.
/// Calling `enable()` on a partially-populated filter will cause
/// `might_contain` to return `false` for real slots, producing silent
/// incorrect `None` results from `get_storage_at_root`.
pub fn enable(&self) {
self.enabled.store(true, Ordering::Release);
}Prompt To Fix With AIThis is a comment left during a code review.
Path: crates/storage/bloom.rs
Line: 36-38
Comment:
**`enable()` is never called — silent correctness risk for future callers**
`enable()` has no callers in the repository today. When someone eventually wires it up (the "Future work" mentioned in the PR), they must ensure the filter has been **fully populated** from existing trie storage before calling it; calling `enable()` prematurely on a partially-populated filter would cause `might_contain` to return `false` for slots that genuinely have a value, silently returning `Ok(None)` instead of the correct stored value.
Consider adding a doc-comment warning directly on `enable()` about this requirement so the footgun is visible at the call site:
```rust
/// Activate the bloom filter.
///
/// # Safety / correctness
/// The filter MUST have been fully populated (via `insert`) for all
/// (address, storage_key) pairs that exist in the trie before this is called.
/// Calling `enable()` on a partially-populated filter will cause
/// `might_contain` to return `false` for real slots, producing silent
/// incorrect `None` results from `get_storage_at_root`.
pub fn enable(&self) {
self.enabled.store(true, Ordering::Release);
}
```
How can I resolve this? If you propose a fix, please make it concise.
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. |
||
|
|
||
| /// Record that a non-zero value exists at (address, key). | ||
| /// | ||
| /// Called unconditionally on every non-zero storage write, even while the | ||
| /// filter is disabled. This is intentional warm-up: the filter is populated | ||
| /// in the background so it is ready when `enable()` is eventually called. | ||
| pub fn insert(&self, address: Address, key: H256) { | ||
| let bloom_key = Self::make_key(address, key); | ||
| self.filter().insert(&bloom_key); | ||
| } | ||
|
|
||
| /// Returns `true` if the slot *might* contain a non-zero value. | ||
| /// Returns `false` if the slot was definitely never written. | ||
| /// When the filter is not yet enabled, always returns `true` (pass-through). | ||
| pub fn might_contain(&self, address: Address, key: H256) -> bool { | ||
| if !self.enabled.load(Ordering::Acquire) { | ||
| return true; | ||
| } | ||
| let bloom_key = Self::make_key(address, key); | ||
| self.filter().contains(&bloom_key) | ||
| } | ||
|
|
||
| fn filter(&self) -> &AtomicBloomFilter<FxBuildHasher> { | ||
| self.filter.get_or_init(|| { | ||
| AtomicBloomFilter::with_false_pos(FALSE_POSITIVE_RATE) | ||
| .hasher(FxBuildHasher) | ||
| .expected_items(self.capacity) | ||
| }) | ||
| } | ||
|
|
||
| fn make_key(address: Address, key: H256) -> [u8; 52] { | ||
| let mut buf = [0u8; 52]; | ||
| buf[..20].copy_from_slice(address.as_bytes()); | ||
| buf[20..].copy_from_slice(key.as_bytes()); | ||
| buf | ||
| } | ||
| } | ||
|
Comment on lines
+1
to
+88
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. No unit tests for the new module
Suggested minimal test cases:
Given that a correctness regression here (false negatives while enabled) would silently return Prompt To Fix With AIThis is a comment left during a code review.
Path: crates/storage/bloom.rs
Line: 1-63
Comment:
**No unit tests for the new module**
`bloom.rs` introduces a `StorageBloomFilter` with non-trivial behaviour: a disabled filter must be a transparent pass-through, and an enabled filter must never produce false negatives. Neither property is covered by a test.
Suggested minimal test cases:
- A newly created filter (disabled) always returns `true` from `might_contain`.
- After `enable()`, a key that was `insert()`-ed still returns `true`.
- After `enable()`, a key that was **never** inserted returns `false` (no false negatives for a key set that is small relative to capacity).
- `make_key` produces distinct bytes for `(a, k₁)` vs `(a, k₂)` and `(a₁, k)` vs `(a₂, k)`.
Given that a correctness regression here (false negatives while enabled) would silently return `None` for slots that actually have a value, tests are especially important.
How can I resolve this? If you propose a fix, please make it concise.
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. |
||
|
|
||
| #[cfg(test)] | ||
| mod tests { | ||
| use super::*; | ||
|
|
||
| fn addr(n: u8) -> Address { | ||
| Address::from([n; 20]) | ||
| } | ||
|
|
||
| fn key(n: u8) -> H256 { | ||
| H256::from([n; 32]) | ||
| } | ||
|
|
||
| #[test] | ||
| fn disabled_is_pass_through() { | ||
| let bloom = StorageBloomFilter::new(1000); | ||
| // Before enable, might_contain always returns true | ||
| assert!(bloom.might_contain(addr(1), key(1))); | ||
| assert!(bloom.might_contain(addr(99), key(255))); | ||
| } | ||
|
|
||
| #[test] | ||
| fn no_false_negatives_after_enable() { | ||
| let bloom = StorageBloomFilter::new(1000); | ||
| bloom.insert(addr(1), key(10)); | ||
| bloom.insert(addr(2), key(20)); | ||
| bloom.enable(); | ||
| // Inserted keys must always return true | ||
| assert!(bloom.might_contain(addr(1), key(10))); | ||
| assert!(bloom.might_contain(addr(2), key(20))); | ||
| } | ||
|
|
||
| #[test] | ||
| fn rejects_unknown_after_enable() { | ||
| let bloom = StorageBloomFilter::new(1000); | ||
| bloom.insert(addr(1), key(10)); | ||
| bloom.enable(); | ||
| // A never-inserted key should return false (with high probability) | ||
| assert!(!bloom.might_contain(addr(99), key(99))); | ||
| } | ||
|
|
||
| #[test] | ||
| fn make_key_distinctness() { | ||
| // Different (address, key) pairs must produce different bloom keys | ||
| let k1 = StorageBloomFilter::make_key(addr(1), key(2)); | ||
| let k2 = StorageBloomFilter::make_key(addr(2), key(1)); | ||
| assert_ne!(k1, k2); | ||
| } | ||
| } | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -66,6 +66,7 @@ | |
|
|
||
| pub mod api; | ||
| pub mod backend; | ||
| mod bloom; | ||
| pub mod error; | ||
| mod layering; | ||
| pub mod rlp; | ||
|
|
||
| Original file line number | Diff line number | Diff line change | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|
| @@ -1,5 +1,6 @@ | ||||||||||
| #[cfg(feature = "rocksdb")] | ||||||||||
| use crate::backend::rocksdb::RocksDBBackend; | ||||||||||
| use crate::bloom::StorageBloomFilter; | ||||||||||
| use crate::{ | ||||||||||
| STORE_METADATA_FILENAME, STORE_SCHEMA_VERSION, | ||||||||||
| api::{ | ||||||||||
|
|
@@ -187,6 +188,10 @@ pub struct Store { | |||||||||
| /// Uses FxHashMap for efficient lookups, much smaller than code cache. | ||||||||||
| code_metadata_cache: Arc<Mutex<rustc_hash::FxHashMap<H256, CodeMetadata>>>, | ||||||||||
|
|
||||||||||
| /// Bloom filter tracking (address, storage_key) pairs with non-zero values. | ||||||||||
| /// Used to skip trie lookups for storage slots that were never written. | ||||||||||
| storage_bloom: Arc<StorageBloomFilter>, | ||||||||||
|
|
||||||||||
| background_threads: Arc<ThreadList>, | ||||||||||
| } | ||||||||||
|
|
||||||||||
|
|
@@ -1164,6 +1169,10 @@ impl Store { | |||||||||
|
|
||||||||||
| /// CAUTION: This method writes directly to the underlying database, bypassing any caching layer. | ||||||||||
| /// For updating the state after block execution, use [`Self::store_block_updates`]. | ||||||||||
| /// | ||||||||||
| /// NOTE: This method does not update the storage bloom filter. Slots written | ||||||||||
| /// through this path (e.g., snap sync) will be invisible to `might_contain` | ||||||||||
| /// after `enable()`. A backfill step is needed before enabling the bloom. | ||||||||||
| pub async fn write_storage_trie_nodes_batch( | ||||||||||
| &self, | ||||||||||
| storage_trie_nodes: StorageUpdates, | ||||||||||
|
|
@@ -1494,6 +1503,7 @@ impl Store { | |||||||||
| last_computed_flatkeyvalue: Arc::new(RwLock::new(last_written)), | ||||||||||
| account_code_cache: Arc::new(Mutex::new(CodeCache::default())), | ||||||||||
| code_metadata_cache: Arc::new(Mutex::new(rustc_hash::FxHashMap::default())), | ||||||||||
| storage_bloom: Arc::new(StorageBloomFilter::new(200_000_000)), | ||||||||||
|
||||||||||
| storage_bloom: Arc::new(StorageBloomFilter::new(200_000_000)), | |
| storage_bloom: Arc::new(StorageBloomFilter::new(2_000_000)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
~240 MB allocated at startup before the filter is useful
StorageBloomFilter::new(200_000_000) with a 1 % false-positive rate requires roughly m = -n·ln(p) / ln(2)² bits ≈ 1.92 billion bits ≈ 240 MB of RAM. This allocation happens unconditionally at node startup, even though:
enable()is never called anywhere in the codebase, somight_containalways returnstrue(pass-through).- The "Future work" item (scan existing trie on startup then call
enable()) has not yet been implemented.
The net effect today is a large fixed memory cost with zero benefit — every trie lookup still executes.
Consider either (a) deferring the allocation until enable() is about to be called, (b) making capacity configurable/feature-gated, or (c) landing this PR together with the startup-scan + enable() call so the optimisation is actually active from the first merge.
Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/storage/store.rs
Line: 1491
Comment:
**~240 MB allocated at startup before the filter is useful**
`StorageBloomFilter::new(200_000_000)` with a 1 % false-positive rate requires roughly `m = -n·ln(p) / ln(2)²` bits ≈ **1.92 billion bits ≈ 240 MB** of RAM. This allocation happens unconditionally at node startup, even though:
1. `enable()` is never called anywhere in the codebase, so `might_contain` always returns `true` (pass-through).
2. The "Future work" item (scan existing trie on startup then call `enable()`) has not yet been implemented.
The net effect today is a large fixed memory cost with **zero benefit** — every trie lookup still executes.
Consider either (a) deferring the allocation until `enable()` is about to be called, (b) making `capacity` configurable/feature-gated, or (c) landing this PR together with the startup-scan + `enable()` call so the optimisation is actually active from the first merge.
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot
AI
Mar 3, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR description says the bloom is “disabled by default (pass-through), so behavior is identical”, but insert() is still called on every non-zero storage write even while disabled. That adds hashing + atomic updates to the write path with no read-side benefit until enable() is wired up. If the intent is truly zero-overhead until enabled, consider gating insert() behind a separate runtime/feature flag or deferring population until you’re ready to enable the filter.
| self.storage_bloom.insert(update.address, *storage_key); | |
| if self.storage_bloom.is_enabled() { | |
| self.storage_bloom.insert(update.address, *storage_key); | |
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
insert() incurs hashing cost even while the filter is disabled
insert() always computes make_key and hashes into the underlying AtomicBloomFilter, even when enabled == false. Since might_contain short-circuits to true while disabled, these inserts represent pure overhead on every non-zero storage write until enable() is eventually called.
If the motivation for inserting unconditionally is to "warm up" the filter so it's ready when enable() is called, that is a reasonable design — but it should be explicitly documented with a comment. Otherwise, a future reader may add an early-return guard in insert() that would actually break the warm-up intent.
The same applies to the second insert site at line 1830.
Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/storage/store.rs
Line: 1738
Comment:
**`insert()` incurs hashing cost even while the filter is disabled**
`insert()` always computes `make_key` and hashes into the underlying `AtomicBloomFilter`, even when `enabled == false`. Since `might_contain` short-circuits to `true` while disabled, these inserts represent **pure overhead** on every non-zero storage write until `enable()` is eventually called.
If the motivation for inserting unconditionally is to "warm up" the filter so it's ready when `enable()` is called, that is a reasonable design — but it should be explicitly documented with a comment. Otherwise, a future reader may add an early-return guard in `insert()` that would actually break the warm-up intent.
The same applies to the second insert site at line 1830.
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot
AI
Mar 3, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
storage_bloom is populated here for added_storage, but there are other code paths that write non-zero values into storage tries without calling storage_bloom.insert (e.g. setup_genesis_state_trie inserts directly into a storage trie). If enable() is ever called without a full scan/population step that covers those paths, this can introduce false negatives and incorrect get_storage_at_root results. Consider centralizing storage-trie writes or ensuring all non-zero writes go through a helper that also updates the bloom.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is the feature never enabled?