perf(l1): lazy BAL cursor for per-tx parallel execution by edg-l · Pull Request #6669 · lambdaclass/ethrex

edg-l · 2026-05-18T11:55:49Z

Summary

Replaces eager per-tx BAL prefix materialization inside execute_block_parallel with an on-read LazyBalCursor installed on each per-tx GeneralizedDatabase (LEVM's in-memory state cache that the EVM reads accounts/slots through during execution). Each tx materializes only the accounts/slots it actually touches instead of the full BAL prefix.

The two outer sequential seed_db_from_bal callers (system-call recovery, post-tx outer seed) are unchanged; the cursor is per-tx only.
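To make the eager-versus-lazy distinction concrete, here is a minimal sketch. It is not the ethrex code: `TxCache`, its fields, and the plain `u64` addresses and balances are hypothetical stand-ins for the per-tx GeneralizedDatabase, the BAL prefix, and the store.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a per-tx state cache (not ethrex's GeneralizedDatabase).
struct TxCache<'a> {
    cache: HashMap<u64, u64>,          // address -> balance, filled as the tx runs
    bal_prefix: &'a HashMap<u64, u64>, // values written by earlier txs (assumed shape)
    store: &'a HashMap<u64, u64>,      // pre-block state
    materialized: usize,               // how many entries were copied into `cache`
}

impl<'a> TxCache<'a> {
    /// Eager seeding: copy the whole prefix up front, touched or not.
    fn eager(bal_prefix: &'a HashMap<u64, u64>, store: &'a HashMap<u64, u64>) -> Self {
        TxCache {
            cache: bal_prefix.clone(),
            bal_prefix,
            store,
            materialized: bal_prefix.len(),
        }
    }

    /// Lazy cursor: start empty and resolve entries on first read.
    fn lazy(bal_prefix: &'a HashMap<u64, u64>, store: &'a HashMap<u64, u64>) -> Self {
        TxCache { cache: HashMap::new(), bal_prefix, store, materialized: 0 }
    }

    /// Read path: cache first, then the BAL prefix, then the store.
    fn balance(&mut self, addr: u64) -> u64 {
        if let Some(v) = self.cache.get(&addr) {
            return *v;
        }
        let v = self
            .bal_prefix
            .get(&addr)
            .or_else(|| self.store.get(&addr))
            .copied()
            .unwrap_or(0);
        self.cache.insert(addr, v);
        self.materialized += 1;
        v
    }
}
```

In this model a tx that reads a single address copies one entry under `lazy`, while `eager` copies every prefix entry regardless of what the tx touches.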

Benchmark

Fixture: bal-devnet-7-mainnet-mix-460 (460 blocks, ~30 Ggas, transfer/EVM-mix). Single run, release-with-debug profile, import-bench --with-bal.

metric	baseline (parallel, eager seed)	this PR (lazy cursor)	delta
wall time	8.58 s	6.85 s	-1.73 s (-20.2%)
agg Ggas/s	3.90	5.02	+28.7%
avg ms / block	16.88	13.11	-3.77 ms (-22.3%)
p95 ms / block	17.73	14.09	-3.64 ms (-20.5%)
max ms / block	90.06	52.39	-37.67 ms
exec avg ms	15.57	11.89	-3.68 ms (-23.7%)
merkle avg ms	0.48	0.44	-0.04 ms
store avg ms	0.67	0.63	-0.04 ms
warmer avg ms	1.37	1.37	flat

Win is concentrated in exec, which is exactly what the cursor targets; merkle/store/warmer barely move, so the gain is not a measurement shift.

Changes

Extract seed_one_address_info_from_bal and seed_one_storage_slot_from_bal from seed_db_from_bal as reusable helpers in ethrex-levm. seed_db_from_bal becomes a thin loop over these helpers (behavior-preserving).
Add Clone on BalAddressIndex.
Add lazy_bal: Option<LazyBalCursor> field on GeneralizedDatabase. LazyBalCursor holds Arc<BlockAccessList>, bal_index: u32, Arc<BalAddressIndex>.
load_account consults the cursor for account info (balance, nonce, code_hash) on cache miss before falling through to the store. Does not inject account.storage.
get_storage_value consults the cursor per-slot on cache miss.
execute_block_parallel sets tx_db.lazy_bal = Some(...) per tx instead of calling seed_db_from_bal eagerly.
Per-tx GeneralizedDatabase capacity hint drops from bal_account_count to 32.
code_from_bal deduplicated into gen_db.rs.

Invariants

Cursor bal_index = tx_idx + 1; effective cutoff is bal_index.saturating_sub(1), matching the existing seed_db_from_bal's max_idx = tx_idx. debug_assert!(bal_index >= 1).
load_account only injects account-info fields, never account.storage. Storage stays lazy through get_storage_value.
In seed_one_address_info_from_bal, code_update is computed before the &mut LevmAccount borrow; db.codes.entry().or_insert() runs after the borrow is released.
In get_storage_value, the cursor result is copied to a local before taking &mut current_accounts_state.
load_account .take()s the cursor before calling the helper (whose partial-coverage path calls db.get_account internally) and restores it after; prevents re-entry into the lazy hook.
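The cutoff invariant above can be sketched as follows. The `Change` struct and both function names are hypothetical simplifications; only the `bal_index = tx_idx + 1` rule, the `saturating_sub(1)` cutoff, and the use of `partition_point` come from this PR's description and review comments.

```rust
/// Hypothetical, simplified change record; the real BAL types carry more fields.
#[derive(Clone, Copy)]
struct Change {
    block_access_index: u32,
    post_value: u64,
}

/// Cutoff as described above: the cursor stores tx_idx + 1 and reads up to bal_index - 1.
fn cutoff_for_tx(tx_idx: u32) -> u32 {
    let bal_index = tx_idx + 1;
    debug_assert!(bal_index >= 1);
    bal_index.saturating_sub(1)
}

/// Latest post_value whose block_access_index is <= max_idx.
/// Assumes `changes` is sorted ascending by block_access_index.
fn latest_value_at_or_before(changes: &[Change], max_idx: u32) -> Option<u64> {
    // Number of leading entries that are visible at this cutoff.
    let pos = changes.partition_point(|c| c.block_access_index <= max_idx);
    // pos == 0 means nothing is visible yet, so there is no value to return.
    (pos > 0).then(|| changes[pos - 1].post_value)
}
```

Under these assumptions, a write recorded at block_access_index 1 (the first tx) is invisible to tx 0 and visible to tx 1, which is the boundary that tx1_sees_tx0_write is named after.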

Tests

test/tests/levm/bal_view_tests.rs:

tx1_sees_tx0_write ; off-by-one boundary
load_account_does_not_inject_storage ; no storage injection
sstore_sees_prior_write ; SSTORE pre-image flows through cursor
lazy_load_account_partial_coverage_does_not_recurse ; .take() guard

Test plan

cargo test -p ethrex-test --features rayon bal_view_tests (4/4)
cargo test -p ethrex-vm -p ethrex-levm -p ethrex-blockchain
make lint
cargo fmt --all --check
make -C tooling/ef_tests/state test
make -C tooling/ef_tests/blockchain test

github-actions · 2026-05-18T11:56:01Z

⚠️ Known Issues — intentionally skipped tests

Source: docs/known_issues.md

Known Issues

Tests intentionally excluded from CI. Source of truth for the Known
Issues section the L1 workflow appends to each ef-tests job summary
and posts as a sticky PR comment.

EF Tests — Stateless coverage narrowed to EIP-8025 optional-proofs

make -C tooling/ef_tests/blockchain test calls test-stateless-zkevm
instead of test-stateless. The zkevm@v0.3.3 fixtures are filled against
bal@v5.6.1, out of sync with current bal spec; the broad target trips ~549
fixtures. Re-broaden once the zkevm bundle is regenerated.

Why and resolution path

PR #6527 broadened
test-stateless to extract the entire for_amsterdam/ tree from the
zkevm bundle and run all of it under --features stateless; combined with
this branch's bal-devnet-7 semantics that scope produces ~549
GasUsedMismatch / ReceiptsRootMismatch /
BlockAccessListHashMismatch failures.

test-stateless-zkevm filters cargo to the eip8025_optional_proofs
suite, which still validates the stateless harness without the bal-version
mismatch.

Re-broaden by switching test: back to test-stateless in
tooling/ef_tests/blockchain/Makefile once the zkevm bundle is regenerated
against the current bal spec.

github-actions · 2026-05-18T11:58:55Z

Lines of code report

Total lines added: 212
Total lines removed: 61
Total lines changed: 273

Detailed view

+-------------------------------------------------+-------+------+
| File                                            | Lines | Diff |
+-------------------------------------------------+-------+------+
| ethrex/crates/common/types/block_access_list.rs | 1163  | +11  |
+-------------------------------------------------+-------+------+
| ethrex/crates/vm/backends/levm/mod.rs           | 2387  | -61  |
+-------------------------------------------------+-------+------+
| ethrex/crates/vm/levm/src/db/gen_db.rs          | 762   | +201 |
+-------------------------------------------------+-------+------+

github-actions · 2026-05-18T12:13:00Z

Benchmark Results Comparison

No significant difference was registered for any benchmark run.

Detailed Results

Benchmark Results: BubbleSort

Command	Mean [s]	Min [s]	Max [s]	Relative
`main_revm_BubbleSort`	2.972 ± 0.019	2.945	2.991	1.07 ± 0.01
`main_levm_BubbleSort`	2.917 ± 0.289	2.758	3.682	1.05 ± 0.10
`pr_revm_BubbleSort`	2.962 ± 0.042	2.918	3.053	1.07 ± 0.02
`pr_levm_BubbleSort`	2.778 ± 0.020	2.749	2.811	1.00

Benchmark Results: ERC20Approval

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_ERC20Approval`	989.4 ± 5.8	982.4	999.1	1.02 ± 0.01
`main_levm_ERC20Approval`	1059.6 ± 10.3	1039.4	1076.0	1.10 ± 0.01
`pr_revm_ERC20Approval`	966.4 ± 3.7	959.5	971.9	1.00
`pr_levm_ERC20Approval`	1052.5 ± 8.3	1037.3	1062.7	1.09 ± 0.01

Benchmark Results: ERC20Mint

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_ERC20Mint`	136.1 ± 2.0	134.3	140.3	1.02 ± 0.02
`main_levm_ERC20Mint`	156.3 ± 1.0	154.9	157.8	1.17 ± 0.02
`pr_revm_ERC20Mint`	133.3 ± 1.6	130.6	135.7	1.00
`pr_levm_ERC20Mint`	154.9 ± 1.0	153.9	156.9	1.16 ± 0.02

Benchmark Results: ERC20Transfer

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_ERC20Transfer`	237.6 ± 1.3	236.1	239.3	1.03 ± 0.01
`main_levm_ERC20Transfer`	262.5 ± 2.3	259.3	266.6	1.14 ± 0.01
`pr_revm_ERC20Transfer`	230.7 ± 1.8	229.0	233.8	1.00
`pr_levm_ERC20Transfer`	260.8 ± 1.2	259.0	262.8	1.13 ± 0.01

Benchmark Results: Factorial

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_Factorial`	227.1 ± 1.7	224.5	229.7	1.00
`main_levm_Factorial`	270.6 ± 4.2	266.4	278.6	1.19 ± 0.02
`pr_revm_Factorial`	227.5 ± 3.2	220.2	233.1	1.00 ± 0.02
`pr_levm_Factorial`	268.2 ± 1.5	266.6	270.4	1.18 ± 0.01

Benchmark Results: FactorialRecursive

Command	Mean [s]	Min [s]	Max [s]	Relative
`main_revm_FactorialRecursive`	1.726 ± 0.038	1.657	1.793	1.06 ± 0.02
`main_levm_FactorialRecursive`	1.641 ± 0.019	1.612	1.665	1.01 ± 0.01
`pr_revm_FactorialRecursive`	1.711 ± 0.033	1.651	1.752	1.05 ± 0.02
`pr_levm_FactorialRecursive`	1.628 ± 0.010	1.612	1.647	1.00

Benchmark Results: Fibonacci

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_Fibonacci`	206.1 ± 2.1	204.0	210.2	1.01 ± 0.01
`main_levm_Fibonacci`	254.1 ± 3.6	249.3	259.0	1.25 ± 0.02
`pr_revm_Fibonacci`	203.2 ± 1.2	201.7	205.2	1.00
`pr_levm_Fibonacci`	249.4 ± 1.5	247.3	253.0	1.23 ± 0.01

Benchmark Results: FibonacciRecursive

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_FibonacciRecursive`	911.1 ± 7.7	895.7	922.4	1.25 ± 0.03
`main_levm_FibonacciRecursive`	730.5 ± 26.9	714.5	804.9	1.01 ± 0.04
`pr_revm_FibonacciRecursive`	907.3 ± 11.8	889.5	933.1	1.25 ± 0.03
`pr_levm_FibonacciRecursive`	726.3 ± 15.7	712.0	765.5	1.00

Benchmark Results: ManyHashes

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_ManyHashes`	8.5 ± 0.2	8.4	9.0	1.01 ± 0.02
`main_levm_ManyHashes`	9.9 ± 0.1	9.9	10.1	1.18 ± 0.02
`pr_revm_ManyHashes`	8.4 ± 0.1	8.3	8.6	1.00
`pr_levm_ManyHashes`	10.0 ± 0.2	9.8	10.5	1.19 ± 0.03

Benchmark Results: MstoreBench

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_MstoreBench`	260.1 ± 5.7	255.9	274.7	1.14 ± 0.03
`main_levm_MstoreBench`	229.1 ± 1.2	227.5	230.8	1.00
`pr_revm_MstoreBench`	263.3 ± 8.9	255.6	276.3	1.15 ± 0.04
`pr_levm_MstoreBench`	236.7 ± 4.5	232.7	248.3	1.03 ± 0.02

Benchmark Results: Push

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_Push`	291.0 ± 2.2	288.4	295.1	1.00 ± 0.01
`main_levm_Push`	295.0 ± 1.5	293.2	297.4	1.02 ± 0.01
`pr_revm_Push`	290.5 ± 1.2	288.6	292.4	1.00
`pr_levm_Push`	293.6 ± 1.3	292.2	296.7	1.01 ± 0.01

Benchmark Results: SstoreBench_no_opt

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_SstoreBench_no_opt`	169.9 ± 6.0	165.6	183.1	1.67 ± 0.06
`main_levm_SstoreBench_no_opt`	101.8 ± 0.9	100.5	103.2	1.00 ± 0.02
`pr_revm_SstoreBench_no_opt`	165.6 ± 2.6	162.7	170.7	1.63 ± 0.03
`pr_levm_SstoreBench_no_opt`	101.5 ± 1.3	100.0	104.8	1.00

github-actions · 2026-05-19T08:33:46Z

Benchmark Block Execution Results Comparison Against Main

Command	Mean [s]	Min [s]	Max [s]	Relative
`base`	66.767 ± 0.209	66.480	67.095	1.00
`head`	67.136 ± 0.264	66.650	67.444	1.01 ± 0.01

github-actions · 2026-05-19T08:38:37Z

🤖 Codex Code Review

High: shared_base now masks the BAL prefix, so later txs can execute against stale pre-state for any account already loaded during prepare_block. In crates/vm/levm/src/db/gen_db.rs:346- :401, load_account() returns the shared_base snapshot before consulting lazy_bal. Before this PR, seed_db_from_bal() overlaid prior-tx BAL changes on top of that snapshot. After this change, an account touched by system-contract setup and then modified by tx 0 will still be seen by tx 1 as the post-system-call version, not the post-tx-0 version. The same stale-state problem also applies to storage slots already present in the cloned shared-base account, because get_storage_value() returns cached storage before the lazy BAL hook at :957. This is consensus-critical.
Medium: the new lazy storage lookup is linear in the number of changed slots for that account. seed_one_storage_slot_from_bal() does .iter().find(...) over storage_changes at :163- :166, and that helper is now on the SLOAD/SSTORE miss path via get_storage_value(). Since BAL storage entries are slot-sorted, this should stay binary-search-based; otherwise storage-heavy txs regress toward quadratic behavior across many distinct slot reads.
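As an illustration of the Medium finding, the sketch below contrasts a linear scan with a binary search over slot-sorted entries. `SlotChanges` and both function names are hypothetical, and slots are modelled as `u64` rather than the real key type.

```rust
/// Hypothetical, simplified per-slot entry; assumed sorted ascending by `slot`.
struct SlotChanges {
    slot: u64,
    post_value: u64,
}

/// Linear scan: O(n) per lookup, the shape the review flags.
fn find_slot_linear(sorted: &[SlotChanges], slot: u64) -> Option<&SlotChanges> {
    sorted.iter().find(|sc| sc.slot == slot)
}

/// Binary search: O(log n) per lookup, valid only because the input is sorted by slot.
fn find_slot_binary(sorted: &[SlotChanges], slot: u64) -> Option<&SlotChanges> {
    sorted
        .binary_search_by_key(&slot, |sc| sc.slot)
        .ok()
        .map(|i| &sorted[i])
}
```

Both functions return the same entry on sorted, duplicate-free input; they differ only in how many entries they inspect per miss.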

The .take() recursion fix itself looks sound, but I would not merge this until the shared_base/lazy_bal ordering bug is fixed. I could not run cargo test in this environment because rustup cannot create temp files on the read-only filesystem.

Automated review by OpenAI Codex · gpt-5.4 · custom prompt

github-actions · 2026-05-19T08:42:53Z

🤖 Claude Code Review

Review: `perf(l1): lazy BAL cursor for per-tx parallel execution`

Summary

The design is sound and the performance numbers are compelling (+28.7% Ggas/s). The .take() anti-recursion guard is correct, the borrow-checker concerns are handled properly, and the off-by-one semantics (bal_index = tx_idx + 1 → max_idx = tx_idx) match the previous eager-seed behaviour. The tests cover the key regression surfaces. Several issues below range from a potential correctness bug to minor nits.

Potential Correctness Bug: `get_storage_value` hook fires before any "already cached" check

crates/vm/levm/src/db/gen_db.rs, hunk @@ -725,6 +954

The BAL hook is placed immediately after the contains_key(address) guard, but there is no check that the slot is already in account.storage before the hook fires:

// Lazy-BAL hook — no prior "is key already in account.storage?" guard
let bal_hit = self.db.lazy_bal.as_ref().and_then(|cursor| {
    seed_one_storage_slot_from_bal(&cursor.bal, acct_idx, key, max_idx)
});
if let Some(value) = bal_hit {
    account.storage.insert(key, value);   // could silently overwrite a live SSTORE
    return Ok(value);
}

If get_storage_value is called for a slot that the current transaction has already written via SSTORE (so the slot IS in account.storage), and the BAL also covers that slot, the hook would overwrite the live written value with the BAL's pre-tx view and return the stale value. Whether this actually happens depends on whether the SLOAD/SSTORE path has an account.storage.get(&key) fast-return before calling get_storage_value — that path is not visible in this diff. Please confirm (and ideally add a test) that get_storage_value is never called for a slot already present in account.storage, or add the guard here.
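A minimal sketch of the guard being asked for, under the assumption that the tx-local cache and the BAL view can be modelled as plain maps. `read_slot` and its parameters are hypothetical names, not the signature of get_storage_value.

```rust
use std::collections::HashMap;

/// Returns the slot value, letting an already-cached (possibly SSTORE'd) value
/// win over the BAL's pre-tx view. Hypothetical model, not the ethrex function.
fn read_slot(
    cache: &mut HashMap<u64, u64>,  // slots this tx has already loaded or written
    bal_view: &HashMap<u64, u64>,   // values visible from the BAL prefix
    key: u64,
) -> u64 {
    // Guard: a cached slot is returned as-is and never overwritten by the hook.
    if let Some(v) = cache.get(&key) {
        return *v;
    }
    // Cache miss: consult the BAL view, defaulting to zero in this toy model.
    let v = bal_view.get(&key).copied().unwrap_or(0);
    cache.insert(key, v);
    v
}
```

With the guard in place, a slot written earlier in the same transaction keeps its live value even when the BAL view also covers that slot.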

O(n²) Storage Seeding in `seed_db_from_bal`

crates/vm/backends/levm/mod.rs, around the new storage loop

for sc in &acct_changes.storage_changes {
    let key = ethrex_common::utils::u256_to_h256(sc.slot);
    if let Some(value) = seed_one_storage_slot_from_bal(bal, acct_idx, key, max_idx) {
        acc.storage.insert(key, value);
    }
}

seed_one_storage_slot_from_bal accepts a key: H256 and re-scans acct_changes.storage_changes via .find(|sc| u256_to_h256(sc.slot) == key) for every call. The calling loop is also iterating over the same storage_changes slice, so each account with N storage slots costs O(N²) for this outer path.

The outer seed_db_from_bal is only called twice per block (system-call recovery and post-tx seed), so the blast radius is limited, but for accounts with many touched slots this is a clear regression from the O(N log M) old code. The simplest fix is to inline the partition_point logic directly in the loop (as the old code did) rather than delegating to the helper:

for sc in &acct_changes.storage_changes {
    let pos = sc.slot_changes.partition_point(|c| c.block_access_index <= max_idx);
    if pos > 0 {
        let key = ethrex_common::utils::u256_to_h256(sc.slot);
        acc.storage.insert(key, sc.slot_changes[pos - 1].post_value);
    }
}

Double Hash-Map Lookup in `load_account` Fast Path

crates/vm/levm/src/db/gen_db.rs, load_account rewrite

if self.current_accounts_state.contains_key(&address) {
    return self
        .current_accounts_state
        .get_mut(&address)           // second lookup
        .ok_or(InternalError::AccountNotFound);
}

Two map lookups for the common case. The old Entry::Occupied pattern was a single lookup. Use:

if let Entry::Occupied(e) = self.current_accounts_state.entry(address) {
    return Ok(e.into_mut());
}

The same double-lookup pattern appears for initial_accounts_state and the shared_base path. The ok_or(AccountNotFound) after get_mut is also unreachable (we just confirmed presence with contains_key), which would be eliminated by using the Entry API.

`helper_result` Discards the Semantic Bool from `seed_one_address_info_from_bal`

crates/vm/levm/src/db/gen_db.rs, load_account lazy-BAL block

Some(
    seed_one_address_info_from_bal(self, &cursor.bal, acct_idx, max_idx)
        .map(|_| true),   // bool return is thrown away
)

The function returns Ok(false) to signal "no fields applied" and Ok(true) to signal "at least one field applied", but that meaning is erased here by .map(|_| true). The subsequent check relies on contains_key as the real signal — which works, but obscures intent. Either use the returned bool to decide whether to fall through, or change the return type to Result<(), InternalError> and use the contains_key check explicitly with a comment explaining why.

Unnecessary `saturating_sub(1)` after Proven-Non-Zero Guards

crates/vm/levm/src/db/gen_db.rs, seed_one_address_info_from_bal

// inside `if code_pos > 0 { ... }`
let entry = acct_changes
    .code_changes
    .get(code_pos.saturating_sub(1))     // code_pos > 0, so this is code_pos - 1
    .ok_or(InternalError::AccountNotFound)?;

Inside if code_pos > 0, saturating_sub(1) is identical to - 1. Same pattern appears for balance_pos and nonce_pos in the has_all_info branch. Just use code_pos - 1; the intent is clearer and the ok_or is also unreachable (partition_point guarantees pos - 1 < len when pos > 0).

Confusing `filter(|_| pos > 0)` Pattern in `seed_one_storage_slot_from_bal`

crates/vm/levm/src/db/gen_db.rs

sc.slot_changes
    .get(pos.saturating_sub(1))   // returns Some(...) when pos == 0 if slice non-empty
    .filter(|_| pos > 0)          // then immediately discards it
    .map(|c| c.post_value)

When pos == 0, get(0) may return Some even though there is nothing to return. The .filter corrects this, but the sequence is needlessly opaque. Prefer:

(pos > 0).then(|| sc.slot_changes[pos - 1].post_value)

Stale Line-Number References in Doc Comment

crates/vm/backends/levm/mod.rs, seed_db_from_bal doc

/// Eager BAL prefix seed — used only by the outer DB path (system-call recovery
/// at `:440` and post-tx outer seed at `:463`).

Absolute line numbers in comments rot immediately. Replace with function/variable names or conceptual labels.

Minor: `Arc::new(bal.clone())` Adds a Whole-BAL Clone Per Block

crates/vm/backends/levm/mod.rs, execute_block_parallel

let arc_bal = Arc::new(bal.clone());

bal is already available as a shared reference. Unless the caller owns it and wrapping in Arc without cloning is impossible, consider whether the BAL could be passed as Arc<BlockAccessList> further up the call chain to eliminate this clone. This is a lower-priority concern given it's one clone per block, but for very large BALs it contributes to peak memory.
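For reference, the sketch below shows why sharing through `Arc` is cheap once the BAL is wrapped: each per-tx cursor receives a pointer copy, not a data copy. `BalSketch`, `CursorSketch`, and `cursors_for_block` are hypothetical names; only the `Arc` calls are real standard-library API.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for the block access list payload.
struct BalSketch {
    entries: Vec<u64>,
}

/// Hypothetical per-tx cursor holding a shared handle to the BAL.
struct CursorSketch {
    bal: Arc<BalSketch>,
    bal_index: u32,
}

/// Builds one cursor per tx. Arc::clone bumps a reference count;
/// the underlying BalSketch is never duplicated.
fn cursors_for_block(bal: Arc<BalSketch>, tx_count: u32) -> Vec<CursorSketch> {
    (0..tx_count)
        .map(|tx_idx| CursorSketch {
            bal: Arc::clone(&bal),
            bal_index: tx_idx + 1,
        })
        .collect()
}
```

If the caller already owns an `Arc`, the only whole-BAL copy left is the one made when the `Arc` is first created, which is the clone this comment suggests hoisting further up the call chain.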

Test Coverage Gap

There is no test exercising the get_storage_value BAL hook (bal_hit path) through the full EVM/VM machinery. The existing tests cover seed_one_storage_slot_from_bal and seed_one_address_info_from_bal directly, plus the recursion guard. A test that constructs a minimal VM with lazy_bal = Some(...), calls SLOAD for a slot present in the BAL, and verifies the returned value would close the coverage gap for the hook that carries the most correctness risk.

Positive Notes

The .take() + restore pattern to break the load_account → seed_one_address_info_from_bal → get_account → load_account recursion cycle is correct and the explanation in the comment is excellent.
Copying bal_hit to a local before the &mut current_accounts_state borrow is the right fix for the borrow-checker tension in get_storage_value.
Using Arc<BlockAccessList> / Arc<BalAddressIndex> cloned cheaply per-tx (rather than cloning data per-tx) is the right pattern.
debug_assert!(cursor.bal_index >= 1) at both hook sites is good defensive practice.
The four regression tests are well-targeted: off-by-one boundary, storage-non-injection, multi-write boundaries, and recursion prevention.
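The `.take()` + restore pattern praised above can be sketched in isolation. `Db`, `lazy_hook`, and `load` are hypothetical names for a toy model in which the helper's inner call is represented by a direct recursive call.

```rust
/// Toy model of a database with an optional lazy hook (hypothetical names).
struct Db {
    lazy_hook: Option<u32>, // stands in for Option<LazyBalCursor>
    hook_calls: u32,        // counts how often the hook path ran
}

impl Db {
    /// Returns the recursion depth reached, to make re-entry observable.
    fn load(&mut self, depth: u32) -> u32 {
        // Take the hook out so any nested load sees None and skips the hook path.
        let Some(cutoff) = self.lazy_hook.take() else {
            return depth; // plain path: no hook installed, or it is currently taken
        };
        self.hook_calls += 1;
        // The helper calls back into load(); with the hook removed this
        // lands on the plain path instead of recursing indefinitely.
        let result = self.load(depth + 1);
        // Restore unconditionally so later loads still see the hook.
        self.lazy_hook = Some(cutoff);
        result
    }
}
```

In this model each outer `load` runs the hook exactly once and the nested call stops at depth 1, which is the property the recursion-guard test is described as checking.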

Automated review by Claude (Anthropic) · sonnet · custom prompt

greptile-apps · 2026-05-19T08:43:31Z

Greptile Summary

This PR replaces the eager per-tx BAL prefix materialization in execute_block_parallel with an on-read LazyBalCursor installed on each per-tx GeneralizedDatabase, so only the accounts and slots actually touched during a transaction are materialized rather than the entire BAL prefix upfront. The benchmark shows a ~20% wall-time improvement concentrated entirely in the exec phase, with merkle/store/warmer metrics flat, validating the targeted scope of the change.

LazyBalCursor is added to GeneralizedDatabase; load_account and get_storage_value consult it on cache-miss before falling back to the store, with a .take()/restore guard in load_account to prevent infinite re-entry through partial-coverage accounts that call db.get_account internally.
seed_db_from_bal is refactored into reusable seed_one_address_info_from_bal and seed_one_storage_slot_from_bal helpers; the outer eager-seed callers (system-call recovery, post-tx outer seed) are unchanged and the lazy cursor is per-tx only.
Four unit tests in bal_view_tests.rs cover the off-by-one BAL boundary, no-storage-injection invariant, multi-write cursor semantics, and the recursion guard.

Confidence Score: 4/5

The parallel per-tx path is safe — the lazy cursor correctly replicates the semantics of the eager seed with a well-documented recursion guard. The two outer eager-seed callers are untouched.

The core lazy-cursor implementation in gen_db.rs is carefully structured and the off-by-one invariants, borrow-split patterns, and anti-recursion guard are all correct. The regression lives in the outer seed_db_from_bal storage loop in mod.rs, where the new code delegates to seed_one_storage_slot_from_bal (which re-searches storage_changes by key) while iterating storage_changes itself — trading O(n) for O(n²) per account. This path runs only twice per block, limiting impact, but it is a clear regression relative to the old code.

crates/vm/backends/levm/mod.rs — specifically the refactored storage inner-loop inside seed_db_from_bal

Important Files Changed

Filename	Overview
crates/vm/levm/src/db/gen_db.rs	Core of the PR: adds LazyBalCursor struct, seed_one_address_info_from_bal/seed_one_storage_slot_from_bal helpers, lazy_bal field on GeneralizedDatabase, and hooks into load_account and get_storage_value. The recursion guard (take/restore of cursor) and borrow-split patterns are correctly implemented.
crates/vm/backends/levm/mod.rs	seed_db_from_bal refactored to delegate info-seeding to seed_one_address_info_from_bal; execute_block_parallel switches from eager seed to lazy cursor. The storage inner-loop introduces an O(n²) scan for the outer eager seed path.
crates/common/types/block_access_list.rs	Adds #[derive(Clone)] to BalAddressIndex to allow Arc wrapping; minimal, safe change.
test/tests/levm/bal_view_tests.rs	Adds four unit tests: off-by-one boundary, no-storage-injection invariant, multi-write cursor semantics, and recursion guard. Good coverage of the non-trivial edge cases.

Sequence Diagram

sequenceDiagram
    participant EP as execute_block_parallel
    participant TxDB as per-tx GeneralizedDatabase
    participant Cursor as LazyBalCursor
    participant BAL as BlockAccessList
    participant Store as backing Store

    EP->>TxDB: "set lazy_bal = Some(LazyBalCursor)"
    Note over EP,TxDB: replaces eager seed_db_from_bal call

    TxDB->>TxDB: load_account(addr) — cache miss
    TxDB->>Cursor: take() cursor (anti-recursion guard)
    Cursor->>BAL: seed_one_address_info_from_bal(addr, max_idx)
    alt has_all_info
        BAL-->>TxDB: insert LevmAccount with BAL fields
    else partial coverage
        TxDB->>Store: get_account_state(addr)
        Store-->>TxDB: base account + overlay BAL fields
    else not in BAL
        TxDB->>Store: get_account_state(addr)
        Store-->>TxDB: account
    end
    TxDB->>Cursor: restore cursor unconditionally

    TxDB->>TxDB: get_storage_value(addr, key) — slot not cached
    TxDB->>Cursor: as_ref() read only
    Cursor->>BAL: seed_one_storage_slot_from_bal(acct_idx, key, max_idx)
    alt slot in BAL
        BAL-->>TxDB: post_value → cache → return
    else not in BAL
        TxDB->>Store: get_value_from_database
        Store-->>TxDB: value → cache → return
    end

Prompt To Fix All With AI

Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
crates/vm/backends/levm/mod.rs:921-929
The refactored storage loop reintroduces an O(n²) cost in the outer eager `seed_db_from_bal` path. For each `sc` yielded by the outer `for sc in &acct_changes.storage_changes` iteration, `seed_one_storage_slot_from_bal` re-scans `storage_changes` via `iter().find()` to locate the *same* `sc` by key. With n storage entries per account this is O(n²), whereas the old code used the already-available iterator value and did a single O(log k) `partition_point` call — net O(n) per account. For a contract with hundreds of BAL-tracked slots this compounds noticeably in both the system-call-recovery path and the post-tx outer seed.

```suggestion
            let acc = db
                .get_account_mut(addr)
                .map_err(|e| EvmError::Custom(format!("seed storage mut: {e}")))?;
            for sc in &acct_changes.storage_changes {
                let pos = sc
                    .slot_changes
                    .partition_point(|c| c.block_access_index <= max_idx);
                if pos > 0 {
                    let key = ethrex_common::utils::u256_to_h256(sc.slot);
                    acc.storage.insert(key, sc.slot_changes[pos - 1].post_value);
                }
            }
```

### Issue 2 of 2
crates/vm/levm/src/db/gen_db.rs:407-414
**Linear slot scan on every storage cache-miss**

`seed_one_storage_slot_from_bal` uses `iter().find()` over the full `storage_changes` slice on every call from the lazy-cursor hook in `get_storage_value`. For a contract whose BAL entry has many storage-change records the cost is O(n_storage_changes_in_bal) per cold-slot read. The `BalAddressIndex` gives O(1) address lookup, but there is no equivalent per-account slot index. For the described workload (transfer / EVM-mix) the average account has few BAL storage slots and this is fine; however, a DeFi block dominated by high-storage-turnover contracts (e.g., AMM pools with many SSTORE'd ticks) could see the lazy path regress relative to the old eager seed. A `FxHashMap<H256, usize>` slot-to-position index on `LazyBalCursor`, built when the cursor is constructed, would restore O(1) slot lookup without increasing peak memory meaningfully.

Reviews (1): Last reviewed commit: "docs(changelog): add lazy BAL cursor per..."

Address greptile findings on PR #6669:

- seed_db_from_bal eager loop walked storage_changes, then seed_one_storage_slot_from_bal re-found the same sc by slot key. Use the outer sc directly via a new post_value_at_or_before helper.
- seed_one_storage_slot_from_bal (lazy cursor) did iter().find() over storage_changes on every cache miss. Resolve slot in O(1) via a new per-account slot_idx_by_account map on BalAddressIndex, built once per block in build_validation_index. Safe under EIP-7928: canonical-ordering validation enforces strictly ascending unique slots per account, so map insert order matches the former find() semantics.

Verified clean: 8721 + 93 ef-tests pass on a clean vectors checkout.
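A per-account slot index of the kind this commit message describes might look like the sketch below. `SlotEntry`, `build_slot_index`, and `lookup_slot` are hypothetical, and the real slot_idx_by_account map may be shaped differently; slots are modelled as `u64`.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified storage-change entry for one account.
struct SlotEntry {
    slot: u64,
    post_value: u64,
}

/// Built once per account: slot -> position in `storage_changes`.
/// With unique slots (as the commit message says validation enforces),
/// this returns the same entry a first-match linear find() would.
fn build_slot_index(storage_changes: &[SlotEntry]) -> HashMap<u64, usize> {
    storage_changes
        .iter()
        .enumerate()
        .map(|(i, sc)| (sc.slot, i))
        .collect()
}

/// Average O(1) lookup on a cache miss instead of scanning the slice.
fn lookup_slot(
    storage_changes: &[SlotEntry],
    index: &HashMap<u64, usize>,
    slot: u64,
) -> Option<u64> {
    index.get(&slot).map(|&i| storage_changes[i].post_value)
}
```

The index costs one pass over the entries when it is built, so it pays off when the same account sees many cold-slot reads within a block.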

ElFantasma

Three inline findings, all minor — none blocking.

Replaces eager per-tx BAL prefix materialization inside execute_block_parallel with an on-read LazyBalCursor installed on each per-tx GeneralizedDatabase. load_account consults the cursor for account info only; get_storage_value consults it per-slot. Each tx now materializes only what it actually touches instead of the full BAL prefix. The two outer sequential seed_db_from_bal callers (system-call recovery, post-tx outer seed) remain untouched.

- Extract seed_one_address_info_from_bal + seed_one_storage_slot_from_bal from seed_db_from_bal as reusable helpers in ethrex-levm
- Add Clone to BalAddressIndex so it can be Arc-wrapped once per block
- Add lazy_bal: Option<LazyBalCursor> on GeneralizedDatabase
- Hook load_account and get_storage_value with explicit borrow-ordering
- Switch execute_block_parallel to set tx_db.lazy_bal instead of seeding
- Drop per-tx DB capacity hint from bal_account_count to 32

Tests in test/tests/levm/bal_view_tests.rs cover:

- T1 off-by-one cutoff (tx1_sees_tx0_write)
- T2 no storage injection in load_account
- T3 SSTORE pre-image flows through cursor
- T4 partial-coverage load_account does not recurse (cursor .take() guard)

The per-tx GeneralizedDatabase in execute_block_parallel is configured with both a shared_base (pre-block snapshot of system-touched addresses, captured from initial_accounts_state after prepare_block) and a LazyBalCursor that materialises the BAL prefix on cache-miss. load_account previously consulted shared_base before the cursor, so any address present in both would short-circuit to the pre-block balance / nonce / code and miss the BAL overlay. For a predeploy touched by prepare_block (e.g. the withdrawal / consolidation request contracts) whose info is then mutated by a prior tx in the same block, a later tx reading that info via BALANCE / EXTCODE* would observe the stale pre-block value. Storage reads are unaffected because shared_base accounts are cloned with empty .storage and slot reads go through the lazy_bal hook in get_storage_value.

Reorder load_account: lazy_bal hook runs first, falling back to shared_base only when the cursor has no entry for the address. The .take() guard already prevents the partial-coverage recursion through db.get_account; the inner call now lands on shared_base (or store), then the outer overlays BAL info.

Regression test in test/tests/levm/bal_view_tests.rs constructs a per-tx db with a shared_base balance of 0 and a BAL balance_change of 42_000 at block_access_index 1, and asserts load_account returns the BAL value.

Verified clean: full blockchain ef-tests (8721 + 93 = 8814 tests, 0 failed) on a freshly downloaded amsterdam fixtures bundle.
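The reordering described in this commit message can be modelled with three maps consulted in priority order. `Info`, `load_info`, and the map parameters are hypothetical simplifications; the real load_account also handles partial coverage by overlaying BAL fields on a base account, which this sketch omits.

```rust
use std::collections::HashMap;

/// Hypothetical, reduced account info (the real one also has nonce and code hash).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Info {
    balance: u64,
}

/// Lookup order after the fix: BAL overlay first, then shared_base, then the store.
/// Consulting shared_base first would hide writes made by earlier txs in the block.
fn load_info(
    addr: u64,
    bal_overlay: &HashMap<u64, Info>,
    shared_base: &HashMap<u64, Info>,
    store: &HashMap<u64, Info>,
) -> Option<Info> {
    bal_overlay
        .get(&addr)
        .or_else(|| shared_base.get(&addr))
        .or_else(|| store.get(&addr))
        .copied()
}
```

Mirroring the regression test described above, an address with a shared_base balance of 0 and a BAL balance of 42_000 resolves to 42_000 in this model, while an address known only to shared_base still resolves from there.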

Resolves a conflict in crates/vm/backends/levm/mod.rs introduced by #6669 (lazy BAL cursor) and #6655 (BAL optimistic merkleization), which rewrote the same lines this branch un-gated. Took main's version of the file wholesale, then re-stripped the rayon/eip-8025 cfg gates — keeping main's is_amsterdam correctness guard and gen_db refactor. The merge also pulled in new rayon/eip-8025 gates from #6669 in files that did not conflict (auto-merged): crates/vm/levm/src/db/gen_db.rs, test/Cargo.toml, and test/tests/levm/bal_view_tests.rs. Stripped those too, so the only remaining eip-8025 gates are the four guest binary main.rs files and no rayon feature gates remain. The bal_view_tests now run unconditionally.

github-actions Bot assigned edg-l May 18, 2026

github-actions Bot added the L1 (Ethereum client) and performance (Block execution throughput and performance in general) labels May 18, 2026

github-project-automation Bot added this to ethrex_l1 May 18, 2026

github-project-automation Bot moved this to Todo in ethrex_performance May 18, 2026

github-project-automation Bot added this to ethrex_performance May 18, 2026

edg-l force-pushed the perf/bal-lazy-cursor branch from 7abf051 to 01ecf18 Compare May 19, 2026 07:30

edg-l marked this pull request as ready for review May 19, 2026 08:35

edg-l requested a review from a team as a code owner May 19, 2026 08:35

ethrex-project-sync Bot moved this to In Review in ethrex_l1 May 19, 2026

greptile-apps Bot reviewed May 19, 2026

Comment thread crates/vm/backends/levm/mod.rs

Comment thread crates/vm/levm/src/db/gen_db.rs

edg-l mentioned this pull request May 19, 2026

perf(l1): reduce BAL parallel-path overhead #6639

Closed

5 tasks

azteca1998 approved these changes May 19, 2026

iovoid reviewed May 19, 2026

Comment thread crates/vm/levm/src/db/gen_db.rs Outdated

edg-l force-pushed the perf/bal-lazy-cursor branch from 91648bb to 4d26edc Compare May 20, 2026 10:54

edg-l requested a review from iovoid May 20, 2026 11:23

ElFantasma approved these changes May 20, 2026

Comment thread crates/vm/levm/src/db/gen_db.rs

Comment thread crates/vm/levm/src/db/gen_db.rs

Comment thread crates/vm/levm/src/db/gen_db.rs

iovoid reviewed May 21, 2026

Comment thread crates/vm/levm/src/db/gen_db.rs

Comment thread crates/vm/levm/src/db/gen_db.rs

Comment thread crates/vm/backends/levm/mod.rs Outdated

edg-l force-pushed the perf/bal-lazy-cursor branch from 4839027 to a57c161 Compare May 21, 2026 14:07

iovoid approved these changes May 21, 2026

edg-l added 8 commits May 21, 2026 16:47

docs(changelog): add lazy BAL cursor perf entry

5b2815b

fix(l1): cfg-gate SlotChange import in gen_db

df05e95

style(l1): cargo fmt SlotChange import ordering

7efae28

docs(levm): note storage-lazy invariant on BAL has_all_info shortcut

8c5a220

refactor(l1): address iovoid review on bal-lazy-cursor

df9a356

edg-l force-pushed the perf/bal-lazy-cursor branch from a57c161 to df9a356 Compare May 21, 2026 14:47

ilitteri added this pull request to the merge queue May 21, 2026

ilitteri removed this pull request from the merge queue due to a manual request May 21, 2026

edg-l added this pull request to the merge queue May 22, 2026

Merged via the queue into main with commit 17c3d14 May 22, 2026
70 checks passed

edg-l deleted the perf/bal-lazy-cursor branch May 22, 2026 06:37

github-project-automation Bot moved this from In Review to Done in ethrex_l1 May 22, 2026

github-project-automation Bot moved this from Todo to Done in ethrex_performance May 22, 2026

This was referenced May 22, 2026

perf(l1): move per-tx BAL validation into the par_iter closure #6677

Open

tracking: bal-devnet-7 alignment #6583

Open

perf(l1): BAL warmer prefetch experiments; bytecode batch wins, parallel phases don't #6729

Open

Conversation

edg-l commented May 18, 2026 • edited

Summary

Benchmark

Changes

Invariants

Tests

Test plan

github-actions Bot commented May 18, 2026 • edited

⚠️ Known Issues — intentionally skipped tests

Known Issues

EF Tests — Stateless coverage narrowed to EIP-8025 optional-proofs

github-actions Bot commented May 18, 2026 • edited

Lines of code report

github-actions Bot commented May 18, 2026 • edited

Benchmark Results Comparison

Benchmark Results: BubbleSort

Benchmark Results: ERC20Approval

Benchmark Results: ERC20Mint

Benchmark Results: ERC20Transfer

Benchmark Results: Factorial

Benchmark Results: FactorialRecursive

Benchmark Results: Fibonacci

Benchmark Results: FibonacciRecursive

Benchmark Results: ManyHashes

Benchmark Results: MstoreBench

Benchmark Results: Push

Benchmark Results: SstoreBench_no_opt

github-actions Bot commented May 19, 2026 • edited

Benchmark Block Execution Results Comparison Against Main

github-actions Bot commented May 19, 2026

🤖 Codex Code Review

github-actions Bot commented May 19, 2026

🤖 Claude Code Review

Review: perf(l1): lazy BAL cursor for per-tx parallel execution

Summary

Potential Correctness Bug: get_storage_value hook fires before any "already cached" check

O(n²) Storage Seeding in seed_db_from_bal

Double Hash-Map Lookup in load_account Fast Path

helper_result Discards the Semantic Bool from seed_one_address_info_from_bal

Unnecessary saturating_sub(1) after Proven-Non-Zero Guards

Confusing filter(|_| pos > 0) Pattern in seed_one_storage_slot_from_bal

Stale Line-Number References in Doc Comment

Minor: Arc::new(bal.clone()) Adds a Whole-BAL Clone Per Block

Test Coverage Gap

Positive Notes

greptile-apps Bot commented May 19, 2026

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Sequence Diagram

ElFantasma left a comment
