perf(l1): lazy BAL cursor for per-tx parallel execution#6669
Conversation
|
Lines of code reportTotal lines added: Detailed view |
Benchmark Results ComparisonNo significant difference was registered for any benchmark run. Detailed ResultsBenchmark Results: BubbleSort
Benchmark Results: ERC20Approval
Benchmark Results: ERC20Mint
Benchmark Results: ERC20Transfer
Benchmark Results: Factorial
Benchmark Results: FactorialRecursive
Benchmark Results: Fibonacci
Benchmark Results: FibonacciRecursive
Benchmark Results: ManyHashes
Benchmark Results: MstoreBench
Benchmark Results: Push
Benchmark Results: SstoreBench_no_opt
|
7abf051 to
01ecf18
Compare
Benchmark Block Execution Results Comparison Against Main
|
🤖 Codex Code Review
The Automated review by OpenAI Codex · gpt-5.4 · custom prompt |
🤖 Claude Code ReviewNow I have everything I need to write a detailed review. Review:
|
Greptile SummaryThis PR replaces the eager per-tx BAL prefix materialization in
Confidence Score: 4/5The parallel per-tx path is safe — the lazy cursor correctly replicates the semantics of the eager seed with a well-documented recursion guard. The two outer eager-seed callers are untouched. The core lazy-cursor implementation in gen_db.rs is carefully structured and the off-by-one invariants, borrow-split patterns, and anti-recursion guard are all correct. The regression lives in the outer seed_db_from_bal storage loop in mod.rs, where the new code delegates to seed_one_storage_slot_from_bal (which re-searches storage_changes by key) while iterating storage_changes itself — trading O(n) for O(n²) per account. This path runs only twice per block, limiting impact, but it is a clear regression relative to the old code. crates/vm/backends/levm/mod.rs — specifically the refactored storage inner-loop inside seed_db_from_bal
|
| Filename | Overview |
|---|---|
| crates/vm/levm/src/db/gen_db.rs | Core of the PR: adds LazyBalCursor struct, seed_one_address_info_from_bal/seed_one_storage_slot_from_bal helpers, lazy_bal field on GeneralizedDatabase, and hooks into load_account and get_storage_value. The recursion guard (take/restore of cursor) and borrow-split patterns are correctly implemented. |
| crates/vm/backends/levm/mod.rs | seed_db_from_bal refactored to delegate info-seeding to seed_one_address_info_from_bal; execute_block_parallel switches from eager seed to lazy cursor. The storage inner-loop introduces an O(n²) scan for the outer eager seed path. |
| crates/common/types/block_access_list.rs | Adds #[derive(Clone)] to BalAddressIndex to allow Arc wrapping; minimal, safe change. |
| test/tests/levm/bal_view_tests.rs | Adds four unit tests: off-by-one boundary, no-storage-injection invariant, multi-write cursor semantics, and recursion guard. Good coverage of the non-trivial edge cases. |
Sequence Diagram
sequenceDiagram
participant EP as execute_block_parallel
participant TxDB as per-tx GeneralizedDatabase
participant Cursor as LazyBalCursor
participant BAL as BlockAccessList
participant Store as backing Store
EP->>TxDB: "set lazy_bal = Some(LazyBalCursor)"
Note over EP,TxDB: replaces eager seed_db_from_bal call
TxDB->>TxDB: load_account(addr) — cache miss
TxDB->>Cursor: take() cursor (anti-recursion guard)
Cursor->>BAL: seed_one_address_info_from_bal(addr, max_idx)
alt has_all_info
BAL-->>TxDB: insert LevmAccount with BAL fields
else partial coverage
TxDB->>Store: get_account_state(addr)
Store-->>TxDB: base account + overlay BAL fields
else not in BAL
TxDB->>Store: get_account_state(addr)
Store-->>TxDB: account
end
TxDB->>Cursor: restore cursor unconditionally
TxDB->>TxDB: get_storage_value(addr, key) — slot not cached
TxDB->>Cursor: as_ref() read only
Cursor->>BAL: seed_one_storage_slot_from_bal(acct_idx, key, max_idx)
alt slot in BAL
BAL-->>TxDB: post_value → cache → return
else not in BAL
TxDB->>Store: get_value_from_database
Store-->>TxDB: value → cache → return
end
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 2
crates/vm/backends/levm/mod.rs:921-929
The refactored storage loop reintroduces an O(n²) cost in the outer eager `seed_db_from_bal` path. For each `sc` yielded by the outer `for sc in &acct_changes.storage_changes` iteration, `seed_one_storage_slot_from_bal` re-scans `storage_changes` via `iter().find()` to locate the *same* `sc` by key. With n storage entries per account this is O(n²), whereas the old code used the already-available iterator value and did a single O(log k) `partition_point` call — net O(n) per account. For a contract with hundreds of BAL-tracked slots this compounds noticeably in both the system-call-recovery path and the post-tx outer seed.
```suggestion
let acc = db
.get_account_mut(addr)
.map_err(|e| EvmError::Custom(format!("seed storage mut: {e}")))?;
for sc in &acct_changes.storage_changes {
let pos = sc
.slot_changes
.partition_point(|c| c.block_access_index <= max_idx);
if pos > 0 {
let key = ethrex_common::utils::u256_to_h256(sc.slot);
acc.storage.insert(key, sc.slot_changes[pos - 1].post_value);
}
}
```
### Issue 2 of 2
crates/vm/levm/src/db/gen_db.rs:407-414
**Linear slot scan on every storage cache-miss**
`seed_one_storage_slot_from_bal` uses `iter().find()` over the full `storage_changes` slice on every call from the lazy-cursor hook in `get_storage_value`. For a contract whose BAL entry has many storage-change records the cost is O(n_storage_changes_in_bal) per cold-slot read. The `BalAddressIndex` gives O(1) address lookup, but there is no equivalent per-account slot index. For the described workload (transfer / EVM-mix) the average account has few BAL storage slots and this is fine; however, a DeFi block dominated by high-storage-turnover contracts (e.g., AMM pools with many SSTORE'd ticks) could see the lazy path regress relative to the old eager seed. A `FxHashMap<H256, usize>` slot-to-position index on `LazyBalCursor`, built when the cursor is constructed, would restore O(1) slot lookup without increasing peak memory meaningfully.
Reviews (1): Last reviewed commit: "docs(changelog): add lazy BAL cursor per..." | Re-trigger Greptile
91648bb to
4d26edc
Compare
Address greptile findings on PR #6669: - seed_db_from_bal eager loop walked storage_changes, then seed_one_storage_slot_from_bal re-found the same sc by slot key. Use the outer sc directly via a new post_value_at_or_before helper. - seed_one_storage_slot_from_bal (lazy cursor) did iter().find() over storage_changes on every cache miss. Resolve slot in O(1) via a new per-account slot_idx_by_account map on BalAddressIndex, built once per block in build_validation_index. Safe under EIP-7928: canonical-ordering validation enforces strictly ascending unique slots per account, so map insert order matches the former find() semantics. Verified clean: 8721 + 93 ef-tests pass on a clean vectors checkout.
Address greptile findings on PR #6669: - seed_db_from_bal eager loop walked storage_changes, then seed_one_storage_slot_from_bal re-found the same sc by slot key. Use the outer sc directly via a new post_value_at_or_before helper. - seed_one_storage_slot_from_bal (lazy cursor) did iter().find() over storage_changes on every cache miss. Resolve slot in O(1) via a new per-account slot_idx_by_account map on BalAddressIndex, built once per block in build_validation_index. Safe under EIP-7928: canonical-ordering validation enforces strictly ascending unique slots per account, so map insert order matches the former find() semantics. Verified clean: 8721 + 93 ef-tests pass on a clean vectors checkout.
Address greptile findings on PR #6669: - seed_db_from_bal eager loop walked storage_changes, then seed_one_storage_slot_from_bal re-found the same sc by slot key. Use the outer sc directly via a new post_value_at_or_before helper. - seed_one_storage_slot_from_bal (lazy cursor) did iter().find() over storage_changes on every cache miss. Resolve slot in O(1) via a new per-account slot_idx_by_account map on BalAddressIndex, built once per block in build_validation_index. Safe under EIP-7928: canonical-ordering validation enforces strictly ascending unique slots per account, so map insert order matches the former find() semantics. Verified clean: 8721 + 93 ef-tests pass on a clean vectors checkout.
ElFantasma
left a comment
There was a problem hiding this comment.
Three inline findings, all minor — none blocking.
Address greptile findings on PR #6669: - seed_db_from_bal eager loop walked storage_changes, then seed_one_storage_slot_from_bal re-found the same sc by slot key. Use the outer sc directly via a new post_value_at_or_before helper. - seed_one_storage_slot_from_bal (lazy cursor) did iter().find() over storage_changes on every cache miss. Resolve slot in O(1) via a new per-account slot_idx_by_account map on BalAddressIndex, built once per block in build_validation_index. Safe under EIP-7928: canonical-ordering validation enforces strictly ascending unique slots per account, so map insert order matches the former find() semantics. Verified clean: 8721 + 93 ef-tests pass on a clean vectors checkout.
4839027 to
a57c161
Compare
Replaces eager per-tx BAL prefix materialization inside execute_block_parallel with an on-read LazyBalCursor installed on each per-tx GeneralizedDatabase. load_account consults the cursor for account info only; get_storage_value consults it per-slot. Each tx now materializes only what it actually touches instead of the full BAL prefix. The two outer sequential seed_db_from_bal callers (system-call recovery, post-tx outer seed) remain untouched. - Extract seed_one_address_info_from_bal + seed_one_storage_slot_from_bal from seed_db_from_bal as reusable helpers in ethrex-levm - Add Clone to BalAddressIndex so it can be Arc-wrapped once per block - Add lazy_bal: Option<LazyBalCursor> on GeneralizedDatabase - Hook load_account and get_storage_value with explicit borrow-ordering - Switch execute_block_parallel to set tx_db.lazy_bal instead of seeding - Drop per-tx DB capacity hint from bal_account_count to 32 Tests in test/tests/levm/bal_view_tests.rs cover: - T1 off-by-one cutoff (tx1_sees_tx0_write) - T2 no storage injection in load_account - T3 SSTORE pre-image flows through cursor - T4 partial-coverage load_account does not recurse (cursor .take() guard)
The per-tx GeneralizedDatabase in execute_block_parallel is configured with both a shared_base (pre-block snapshot of system-touched addresses, captured from initial_accounts_state after prepare_block) and a LazyBalCursor that materialises the BAL prefix on cache-miss. load_account previously consulted shared_base before the cursor, so any address present in both would short- circuit to the pre-block balance / nonce / code and miss the BAL overlay. For a predeploy touched by prepare_block (e.g. the withdrawal / consolidation request contracts) whose info is then mutated by a prior tx in the same block, a later tx reading that info via BALANCE / EXTCODE* would observe the stale pre-block value. Storage reads are unaffected because shared_base accounts are cloned with empty .storage and slot reads go through the lazy_bal hook in get_storage_value. Reorder load_account: lazy_bal hook runs first, falling back to shared_base only when the cursor has no entry for the address. The .take() guard already prevents the partial-coverage recursion through db.get_account; the inner call now lands on shared_base (or store), then the outer overlays BAL info. Regression test in test/tests/levm/bal_view_tests.rs constructs a per-tx db with a shared_base balance of 0 and a BAL balance_change of 42_000 at block_access_index 1, and asserts load_account returns the BAL value. Verified clean: full blockchain ef-tests (8721 + 93 = 8814 tests, 0 failed) on a freshly downloaded amsterdam fixtures bundle.
Address greptile findings on PR #6669: - seed_db_from_bal eager loop walked storage_changes, then seed_one_storage_slot_from_bal re-found the same sc by slot key. Use the outer sc directly via a new post_value_at_or_before helper. - seed_one_storage_slot_from_bal (lazy cursor) did iter().find() over storage_changes on every cache miss. Resolve slot in O(1) via a new per-account slot_idx_by_account map on BalAddressIndex, built once per block in build_validation_index. Safe under EIP-7928: canonical-ordering validation enforces strictly ascending unique slots per account, so map insert order matches the former find() semantics. Verified clean: 8721 + 93 ef-tests pass on a clean vectors checkout.
L2 lint (no rayon feature) flagged unused import: SlotChange, since post_value_at_or_before is rayon-gated.
Replace fragile line-number references in seed_db_from_bal doc with descriptive context.
a57c161 to
df9a356
Compare
Resolves a conflict in crates/vm/backends/levm/mod.rs introduced by #6669 (lazy BAL cursor) and #6655 (BAL optimistic merkleization), which rewrote the same lines this branch un-gated. Took main's version of the file wholesale, then re-stripped the rayon/eip-8025 cfg gates — keeping main's is_amsterdam correctness guard and gen_db refactor. The merge also pulled in new rayon/eip-8025 gates from #6669 in files that did not conflict (auto-merged): crates/vm/levm/src/db/gen_db.rs, test/Cargo.toml, and test/tests/levm/bal_view_tests.rs. Stripped those too, so the only remaining eip-8025 gates are the four guest binary main.rs files and no rayon feature gates remain. The bal_view_tests now run unconditionally.
Summary
Replaces eager per-tx BAL prefix materialization inside
execute_block_parallelwith an on-readLazyBalCursorinstalled on each per-txGeneralizedDatabase(LEVM's in-memory state cache that the EVM reads accounts/slots through during execution). Each tx materializes only the accounts/slots it actually touches instead of the full BAL prefix.The two outer sequential
seed_db_from_balcallers (system-call recovery, post-tx outer seed) are unchanged; the cursor is per-tx only.Benchmark
Fixture:
bal-devnet-7-mainnet-mix-460(460 blocks, ~30 Ggas, transfer/EVM-mix). Single run,release-with-debugprofile,import-bench --with-bal.Win is concentrated in
exec, which is exactly what the cursor targets; merkle/store/warmer barely move, so the gain is not a measurement shift.Changes
seed_one_address_info_from_balandseed_one_storage_slot_from_balfromseed_db_from_balas reusable helpers inethrex-levm.seed_db_from_balbecomes a thin loop over these helpers (behavior-preserving).CloneonBalAddressIndex.lazy_bal: Option<LazyBalCursor>field onGeneralizedDatabase.LazyBalCursorholdsArc<BlockAccessList>,bal_index: u32,Arc<BalAddressIndex>.load_accountconsults the cursor for account info (balance, nonce, code_hash) on cache miss before falling through to the store. Does not injectaccount.storage.get_storage_valueconsults the cursor per-slot on cache miss.execute_block_parallelsetstx_db.lazy_bal = Some(...)per tx instead of callingseed_db_from_baleagerly.GeneralizedDatabasecapacity hint drops frombal_account_countto32.code_from_baldeduplicated intogen_db.rs.Invariants
bal_index = tx_idx + 1; effective cutoff isbal_index.saturating_sub(1), matching the existingseed_db_from_bal'smax_idx = tx_idx.debug_assert!(bal_index >= 1).load_accountonly injects account-info fields, neveraccount.storage. Storage stays lazy throughget_storage_value.seed_one_address_info_from_bal,code_updateis computed before the&mut LevmAccountborrow;db.codes.entry().or_insert()runs after the borrow is released.get_storage_value, the cursor result is copied to a local before taking&mut current_accounts_state.load_account.take()s the cursor before calling the helper (whose partial-coverage path callsdb.get_accountinternally) and restores it after; prevents re-entry into the lazy hook.Tests
test/tests/levm/bal_view_tests.rs:tx1_sees_tx0_write; off-by-one boundaryload_account_does_not_inject_storage; no storage injectionsstore_sees_prior_write; SSTORE pre-image flows through cursorlazy_load_account_partial_coverage_does_not_recurse;.take()guardTest plan
cargo test -p ethrex-test --features rayon bal_view_tests(4/4)cargo test -p ethrex-vm -p ethrex-levm -p ethrex-blockchainmake lintcargo fmt --all --checkmake -C tooling/ef_tests/state testmake -C tooling/ef_tests/blockchain test