Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .github/scripts/check-hive-results.sh
Original file line number Diff line number Diff line change
Expand Up @@ -63,6 +63,7 @@ KNOWN_FLAKY_TESTS=(
"Invalid Missing Ancestor Syncing ReOrg, Timestamp, EmptyTxs=False, CanonicalReOrg=False, Invalid P8"
"Invalid Missing Ancestor Syncing ReOrg, Timestamp, EmptyTxs=False, CanonicalReOrg=True, Invalid P8"
"Invalid Missing Ancestor Syncing ReOrg, Transaction Value, EmptyTxs=False, CanonicalReOrg=False, Invalid P9"
"Invalid Missing Ancestor Syncing ReOrg, Transaction Nonce, EmptyTxs=False, CanonicalReOrg=True, Invalid P9"
)

# Build a jq filter that excludes known-flaky tests.
Expand Down
4 changes: 4 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,10 @@

## Perf

### 2026-03-03

- Add bloom filter to skip trie seeks for non-existent storage slots [#6288](https://github.com/lambdaclass/ethrex/pull/6288)

### 2026-03-02

- SIMD-accelerate trie nibble operations for block execution [#6286](https://github.com/lambdaclass/ethrex/pull/6286)
Expand Down
137 changes: 137 additions & 0 deletions crates/storage/bloom.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,137 @@
use std::fmt;
use std::sync::OnceLock;
use std::sync::atomic::{AtomicBool, Ordering};

use ethrex_common::{Address, H256};
use fastbloom::AtomicBloomFilter;
use rustc_hash::FxBuildHasher;

const FALSE_POSITIVE_RATE: f64 = 0.01;

/// Bloom filter that tracks which (address, storage_key) pairs have non-zero
/// storage values. Used to skip expensive trie lookups for slots that were
/// never written to.
///
/// The filter is allocated lazily on first `insert()` to avoid ~240MB of
/// upfront memory when the bloom is never used (e.g., dev mode, testnets).
pub struct StorageBloomFilter {
filter: OnceLock<AtomicBloomFilter<FxBuildHasher>>,
capacity: usize,
enabled: AtomicBool,
}

impl fmt::Debug for StorageBloomFilter {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
f.debug_struct("StorageBloomFilter").finish()
}
}

impl StorageBloomFilter {
pub fn new(capacity: usize) -> Self {
Self {
filter: OnceLock::new(),
capacity,
enabled: AtomicBool::new(false),
}
}

/// Activate the bloom filter after it has been populated.
/// Before this is called, `might_contain` always returns `true` (pass-through).
///
/// # Precondition
///
/// The filter MUST have been fully populated (via `insert`) for ALL
/// (address, storage_key) pairs that exist in the trie before this is
/// called. This includes genesis slots, snap-synced data, and all slots
/// written during block processing. Calling `enable()` prematurely will
/// cause false negatives that silently corrupt storage reads.
#[allow(dead_code)]
pub fn enable(&self) {
Comment on lines +48 to +49
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the feature never enabled?

self.enabled.store(true, Ordering::Release);
}
Comment on lines +49 to +51
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

enable() is never called — silent correctness risk for future callers

enable() has no callers in the repository today. When someone eventually wires it up (the "Future work" mentioned in the PR), they must ensure the filter has been fully populated from existing trie storage before calling it; calling enable() prematurely on a partially-populated filter would cause might_contain to return false for slots that genuinely have a value, silently returning Ok(None) instead of the correct stored value.

Consider adding a doc-comment warning directly on enable() about this requirement so the footgun is visible at the call site:

/// Activate the bloom filter.
///
/// # Safety / correctness
/// The filter MUST have been fully populated (via `insert`) for all
/// (address, storage_key) pairs that exist in the trie before this is called.
/// Calling `enable()` on a partially-populated filter will cause
/// `might_contain` to return `false` for real slots, producing silent
/// incorrect `None` results from `get_storage_at_root`.
pub fn enable(&self) {
    self.enabled.store(true, Ordering::Release);
}
Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/storage/bloom.rs
Line: 36-38

Comment:
**`enable()` is never called — silent correctness risk for future callers**

`enable()` has no callers in the repository today. When someone eventually wires it up (the "Future work" mentioned in the PR), they must ensure the filter has been **fully populated** from existing trie storage before calling it; calling `enable()` prematurely on a partially-populated filter would cause `might_contain` to return `false` for slots that genuinely have a value, silently returning `Ok(None)` instead of the correct stored value.

Consider adding a doc-comment warning directly on `enable()` about this requirement so the footgun is visible at the call site:

```rust
/// Activate the bloom filter.
///
/// # Safety / correctness
/// The filter MUST have been fully populated (via `insert`) for all
/// (address, storage_key) pairs that exist in the trie before this is called.
/// Calling `enable()` on a partially-populated filter will cause
/// `might_contain` to return `false` for real slots, producing silent
/// incorrect `None` results from `get_storage_at_root`.
pub fn enable(&self) {
    self.enabled.store(true, Ordering::Release);
}
```

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


/// Record that a non-zero value exists at (address, key).
///
/// Called unconditionally on every non-zero storage write, even while the
/// filter is disabled. This is intentional warm-up: the filter is populated
/// in the background so it is ready when `enable()` is eventually called.
pub fn insert(&self, address: Address, key: H256) {
let bloom_key = Self::make_key(address, key);
self.filter().insert(&bloom_key);
}

/// Returns `true` if the slot *might* contain a non-zero value.
/// Returns `false` if the slot was definitely never written.
/// When the filter is not yet enabled, always returns `true` (pass-through).
pub fn might_contain(&self, address: Address, key: H256) -> bool {
if !self.enabled.load(Ordering::Acquire) {
return true;
}
let bloom_key = Self::make_key(address, key);
self.filter().contains(&bloom_key)
}

fn filter(&self) -> &AtomicBloomFilter<FxBuildHasher> {
self.filter.get_or_init(|| {
AtomicBloomFilter::with_false_pos(FALSE_POSITIVE_RATE)
.hasher(FxBuildHasher)
.expected_items(self.capacity)
})
}

fn make_key(address: Address, key: H256) -> [u8; 52] {
let mut buf = [0u8; 52];
buf[..20].copy_from_slice(address.as_bytes());
buf[20..].copy_from_slice(key.as_bytes());
buf
}
}
Comment on lines +1 to +88
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No unit tests for the new module

bloom.rs introduces a StorageBloomFilter with non-trivial behaviour: a disabled filter must be a transparent pass-through, and an enabled filter must never produce false negatives. Neither property is covered by a test.

Suggested minimal test cases:

  • A newly created filter (disabled) always returns true from might_contain.
  • After enable(), a key that was insert()-ed still returns true.
  • After enable(), a key that was never inserted returns false (no false negatives for a key set that is small relative to capacity).
  • make_key produces distinct bytes for (a, k₁) vs (a, k₂) and (a₁, k) vs (a₂, k).

Given that a correctness regression here (false negatives while enabled) would silently return None for slots that actually have a value, tests are especially important.

Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/storage/bloom.rs
Line: 1-63

Comment:
**No unit tests for the new module**

`bloom.rs` introduces a `StorageBloomFilter` with non-trivial behaviour: a disabled filter must be a transparent pass-through, and an enabled filter must never produce false negatives. Neither property is covered by a test.

Suggested minimal test cases:
- A newly created filter (disabled) always returns `true` from `might_contain`.
- After `enable()`, a key that was `insert()`-ed still returns `true`.
- After `enable()`, a key that was **never** inserted returns `false` (no false negatives for a key set that is small relative to capacity).
- `make_key` produces distinct bytes for `(a, k₁)` vs `(a, k₂)` and `(a₁, k)` vs `(a₂, k)`.

Given that a correctness regression here (false negatives while enabled) would silently return `None` for slots that actually have a value, tests are especially important.

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


#[cfg(test)]
mod tests {
use super::*;

fn addr(n: u8) -> Address {
Address::from([n; 20])
}

fn key(n: u8) -> H256 {
H256::from([n; 32])
}

#[test]
fn disabled_is_pass_through() {
let bloom = StorageBloomFilter::new(1000);
// Before enable, might_contain always returns true
assert!(bloom.might_contain(addr(1), key(1)));
assert!(bloom.might_contain(addr(99), key(255)));
}

#[test]
fn no_false_negatives_after_enable() {
let bloom = StorageBloomFilter::new(1000);
bloom.insert(addr(1), key(10));
bloom.insert(addr(2), key(20));
bloom.enable();
// Inserted keys must always return true
assert!(bloom.might_contain(addr(1), key(10)));
assert!(bloom.might_contain(addr(2), key(20)));
}

#[test]
fn rejects_unknown_after_enable() {
let bloom = StorageBloomFilter::new(1000);
bloom.insert(addr(1), key(10));
bloom.enable();
// A never-inserted key should return false (with high probability)
assert!(!bloom.might_contain(addr(99), key(99)));
}

#[test]
fn make_key_distinctness() {
// Different (address, key) pairs must produce different bloom keys
let k1 = StorageBloomFilter::make_key(addr(1), key(2));
let k2 = StorageBloomFilter::make_key(addr(2), key(1));
assert_ne!(k1, k2);
}
}
1 change: 1 addition & 0 deletions crates/storage/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -66,6 +66,7 @@

pub mod api;
pub mod backend;
mod bloom;
pub mod error;
mod layering;
pub mod rlp;
Expand Down
23 changes: 23 additions & 0 deletions crates/storage/store.rs
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
#[cfg(feature = "rocksdb")]
use crate::backend::rocksdb::RocksDBBackend;
use crate::bloom::StorageBloomFilter;
use crate::{
STORE_METADATA_FILENAME, STORE_SCHEMA_VERSION,
api::{
Expand Down Expand Up @@ -187,6 +188,10 @@ pub struct Store {
/// Uses FxHashMap for efficient lookups, much smaller than code cache.
code_metadata_cache: Arc<Mutex<rustc_hash::FxHashMap<H256, CodeMetadata>>>,

/// Bloom filter tracking (address, storage_key) pairs with non-zero values.
/// Used to skip trie lookups for storage slots that were never written.
storage_bloom: Arc<StorageBloomFilter>,

background_threads: Arc<ThreadList>,
}

Expand Down Expand Up @@ -1164,6 +1169,10 @@ impl Store {

/// CAUTION: This method writes directly to the underlying database, bypassing any caching layer.
/// For updating the state after block execution, use [`Self::store_block_updates`].
///
/// NOTE: This method does not update the storage bloom filter. Slots written
/// through this path (e.g., snap sync) will be invisible to `might_contain`
/// after `enable()`. A backfill step is needed before enabling the bloom.
pub async fn write_storage_trie_nodes_batch(
&self,
storage_trie_nodes: StorageUpdates,
Expand Down Expand Up @@ -1494,6 +1503,7 @@ impl Store {
last_computed_flatkeyvalue: Arc::new(RwLock::new(last_written)),
account_code_cache: Arc::new(Mutex::new(CodeCache::default())),
code_metadata_cache: Arc::new(Mutex::new(rustc_hash::FxHashMap::default())),
storage_bloom: Arc::new(StorageBloomFilter::new(200_000_000)),
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StorageBloomFilter::new(200_000_000) is created unconditionally during Store construction. For a 1% false-positive rate, a bloom sized for 200M items is on the order of hundreds of MB of resident memory, which can cause OOM/regressions (notably for the many EngineType::InMemory test stores). Consider making this capacity configurable, drastically smaller by default, or lazily allocating the backing filter only when the bloom is actually going to be used/populated.

Suggested change
storage_bloom: Arc::new(StorageBloomFilter::new(200_000_000)),
storage_bloom: Arc::new(StorageBloomFilter::new(2_000_000)),

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

~240 MB allocated at startup before the filter is useful

StorageBloomFilter::new(200_000_000) with a 1 % false-positive rate requires roughly m = -n·ln(p) / ln(2)² bits ≈ 1.92 billion bits ≈ 240 MB of RAM. This allocation happens unconditionally at node startup, even though:

  1. enable() is never called anywhere in the codebase, so might_contain always returns true (pass-through).
  2. The "Future work" item (scan existing trie on startup then call enable()) has not yet been implemented.

The net effect today is a large fixed memory cost with zero benefit — every trie lookup still executes.

Consider either (a) deferring the allocation until enable() is about to be called, (b) making capacity configurable/feature-gated, or (c) landing this PR together with the startup-scan + enable() call so the optimisation is actually active from the first merge.

Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/storage/store.rs
Line: 1491

Comment:
**~240 MB allocated at startup before the filter is useful**

`StorageBloomFilter::new(200_000_000)` with a 1 % false-positive rate requires roughly `m = -n·ln(p) / ln(2)²` bits ≈ **1.92 billion bits ≈ 240 MB** of RAM. This allocation happens unconditionally at node startup, even though:

1. `enable()` is never called anywhere in the codebase, so `might_contain` always returns `true` (pass-through).
2. The "Future work" item (scan existing trie on startup then call `enable()`) has not yet been implemented.

The net effect today is a large fixed memory cost with **zero benefit** — every trie lookup still executes.

Consider either (a) deferring the allocation until `enable()` is about to be called, (b) making `capacity` configurable/feature-gated, or (c) landing this PR together with the startup-scan + `enable()` call so the optimisation is actually active from the first merge.

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment thread
ElFantasma marked this conversation as resolved.
background_threads: Default::default(),
};
let backend_clone = store.backend.clone();
Expand Down Expand Up @@ -1740,6 +1750,7 @@ impl Store {
if storage_value.is_zero() {
storage_trie.remove(&hashed_key)?;
} else {
self.storage_bloom.insert(update.address, *storage_key);
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description says the bloom is “disabled by default (pass-through), so behavior is identical”, but insert() is still called on every non-zero storage write even while disabled. That adds hashing + atomic updates to the write path with no read-side benefit until enable() is wired up. If the intent is truly zero-overhead until enabled, consider gating insert() behind a separate runtime/feature flag or deferring population until you’re ready to enable the filter.

Suggested change
self.storage_bloom.insert(update.address, *storage_key);
if self.storage_bloom.is_enabled() {
self.storage_bloom.insert(update.address, *storage_key);
}

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

insert() incurs hashing cost even while the filter is disabled

insert() always computes make_key and hashes into the underlying AtomicBloomFilter, even when enabled == false. Since might_contain short-circuits to true while disabled, these inserts represent pure overhead on every non-zero storage write until enable() is eventually called.

If the motivation for inserting unconditionally is to "warm up" the filter so it's ready when enable() is called, that is a reasonable design — but it should be explicitly documented with a comment. Otherwise, a future reader may add an early-return guard in insert() that would actually break the warm-up intent.

The same applies to the second insert site at line 1830.

Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/storage/store.rs
Line: 1738

Comment:
**`insert()` incurs hashing cost even while the filter is disabled**

`insert()` always computes `make_key` and hashes into the underlying `AtomicBloomFilter`, even when `enabled == false`. Since `might_contain` short-circuits to `true` while disabled, these inserts represent **pure overhead** on every non-zero storage write until `enable()` is eventually called.

If the motivation for inserting unconditionally is to "warm up" the filter so it's ready when `enable()` is called, that is a reasonable design — but it should be explicitly documented with a comment. Otherwise, a future reader may add an early-return guard in `insert()` that would actually break the warm-up intent.

The same applies to the second insert site at line 1830.

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

storage_trie.insert(hashed_key, storage_value.encode_to_vec())?;
}
Comment on lines +1753 to 1755
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

storage_bloom is populated here for added_storage, but there are other code paths that write non-zero values into storage tries without calling storage_bloom.insert (e.g. setup_genesis_state_trie inserts directly into a storage trie). If enable() is ever called without a full scan/population step that covers those paths, this can introduce false negatives and incorrect get_storage_at_root results. Consider centralizing storage-trie writes or ensuring all non-zero writes go through a helper that also updates the bloom.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

}
Expand Down Expand Up @@ -1831,6 +1842,7 @@ impl Store {
if storage_value.is_zero() {
storage_trie.remove(&hashed_key)?;
} else {
self.storage_bloom.insert(update.address, *storage_key);
storage_trie.insert(hashed_key, storage_value.encode_to_vec())?;
}
}
Expand Down Expand Up @@ -1881,6 +1893,8 @@ impl Store {
if !storage_value.is_zero() {
let hashed_key = hash_key(&H256(storage_key.to_big_endian()));
storage_trie.insert(hashed_key, storage_value.encode_to_vec())?;
// TODO: call storage_bloom.insert(address, storage_key) here when
// bloom is wired up, otherwise genesis-only slots become false negatives.
}
}

Expand Down Expand Up @@ -2118,6 +2132,15 @@ impl Store {
address: Address,
storage_key: H256,
) -> Result<Option<U256>, StoreError> {
// Fast path: if the bloom filter says this slot was never written, skip the trie.
// NOTE: The bloom only tracks writes during the current process lifetime.
// For historical state_root queries (RPC), a slot that was non-zero in older
// states but later zeroed won't be in the filter. When the bloom is enabled,
// this check may need to be limited to latest-state lookups only.
if !self.storage_bloom.might_contain(address, storage_key) {
Comment thread
ElFantasma marked this conversation as resolved.
return Ok(None);
}

let account_hash = hash_address_fixed(&address);

// Pre-acquire shared resources once for both trie opens
Expand Down
Loading