
perf(levm): replace BTreeMap/BTreeSet with FxHashMap/FxHashSet for accessed storage slots #6185

Closed
ilitteri wants to merge 2 commits into main from perf/flatten-accessed-storage-slots


@ilitteri
Collaborator

@ilitteri ilitteri commented Feb 11, 2026

Motivation

Profile analysis (Rank 7 in the optimization plan) shows that access_storage_slot accounts for 2.37s cumulative (4.76%) in the gigablocks scenario. The BTreeMap<Address, BTreeSet<H256>> data structure for accessed_storage_slots in the EVM Substate adds O(log n) overhead to every SLOAD/SSTORE warm/cold check.

Description

Replace BTreeMap<Address, BTreeSet<H256>> with FxHashMap<Address, FxHashSet<H256>> for the accessed_storage_slots field in Substate. This gives O(1) amortized lookup and insert for the hot path (add_accessed_slot, is_slot_accessed).
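A minimal sketch of the two hot-path methods named above. `Address` and `H256` are stand-in byte arrays, the `Substate` here is reduced to a single field, and the return value of `add_accessed_slot` is an assumption made for illustration. The standard library `HashMap`/`HashSet` are used so the sketch builds without external crates; `FxHashMap` and `FxHashSet` from `rustc-hash` are the same collections with a different hasher.

```rust
use std::collections::{HashMap, HashSet};

// Stand-ins for the real Address/H256 types (assumption).
type Address = [u8; 20];
type H256 = [u8; 32];

/// Simplified substate that only tracks warm storage slots.
#[derive(Default)]
struct Substate {
    // The PR uses FxHashMap<Address, FxHashSet<H256>> here.
    accessed_storage_slots: HashMap<Address, HashSet<H256>>,
}

impl Substate {
    /// Marks a slot as warm. Returns true if it was already warm
    /// (the return convention is hypothetical).
    fn add_accessed_slot(&mut self, address: Address, key: H256) -> bool {
        !self
            .accessed_storage_slots
            .entry(address)
            .or_default()
            .insert(key)
    }

    /// Returns true if the slot has already been accessed.
    fn is_slot_accessed(&self, address: &Address, key: &H256) -> bool {
        self.accessed_storage_slots
            .get(address)
            .map_or(false, |slots| slots.contains(key))
    }
}
```

Both methods perform one hash lookup per level (address, then slot), which is where the amortized O(1) behavior comes from.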

The cold-path make_access_list() still constructs a local BTreeMap<Address, BTreeSet<H256>> for deterministic ordering in its output. This function is called at most once per transaction.
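The cold-path conversion could look roughly like the following. This is a sketch under assumptions: the real function returns `Vec<AccessListEntry>` and walks several substates, while this version takes one unordered map and returns plain tuples.

```rust
use std::collections::{BTreeMap, BTreeSet, HashMap, HashSet};

// Stand-ins for the real Address/H256 types (assumption).
type Address = [u8; 20];
type H256 = [u8; 32];

/// Flattens unordered warm-slot data into a list sorted by address,
/// with each address's slots sorted as well.
fn make_access_list(
    accessed: &HashMap<Address, HashSet<H256>>,
) -> Vec<(Address, Vec<H256>)> {
    // The local BTreeMap/BTreeSet imposes a stable order on the output,
    // regardless of the hash map's iteration order.
    let mut entries: BTreeMap<Address, BTreeSet<H256>> = BTreeMap::new();
    for (address, slots) in accessed {
        entries
            .entry(*address)
            .or_default()
            .extend(slots.iter().copied());
    }
    entries
        .into_iter()
        .map(|(address, slots)| (address, slots.into_iter().collect()))
        .collect()
}
```

Because this runs at most once per transaction, the O(n log n) sort cost stays off the per-opcode path.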

Files modified: crates/vm/levm/src/vm.rs

Benchmark Results

Baseline: runs #63/#64/#65 on main at 1a47fb18

| Scenario | Baseline (Mgas/s) | FxHash (Mgas/s) | Change |
|---|---|---|---|
| Fast | 420.48 | 462.18 | +9.9% |
| Gigablocks | 909.63 | 1,044.26 | +14.8% |
| Slow | 811.04 | 870.27 | +7.3% |
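The Change column is the relative throughput difference against the baseline. A small sketch of that arithmetic follows; the helper names `percent_change` and `round1` are illustrative, not from the PR.

```rust
/// Relative change from `baseline` to `new`, in percent.
fn percent_change(baseline: f64, new: f64) -> f64 {
    (new - baseline) / baseline * 100.0
}

/// Rounds to one decimal place, matching the precision used in the table.
fn round1(x: f64) -> f64 {
    (x * 10.0).round() / 10.0
}
```

For example, the Gigablocks row is (1044.26 - 909.63) / 909.63 ≈ 0.148, reported as +14.8%.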

Profile Highlights

Gigablocks (runs #64 vs #80):

| Function | Baseline cum | FxHash cum | Change |
|---|---|---|---|
| VM::execute | 12,370ms | 11,055ms | -10.6% |
| VM::run_execution | 6,776ms | 5,909ms | -12.8% |
| access_storage_slot | 2,372ms | 2,086ms | -12.1% |

Slow (runs #65 vs #81):

| Function | Baseline cum | FxHash cum | Change |
|---|---|---|---|
| VM::execute | 13,244ms | 12,574ms | -5.1% |
| VM::run_execution | 7,874ms | 7,315ms | -7.1% |
| access_storage_slot | 3,275ms | 3,076ms | -6.1% |

RocksDB write path and compaction are essentially unchanged — this optimization is purely EVM-side.

Copilot AI review requested due to automatic review settings February 11, 2026 19:27
@ilitteri ilitteri requested a review from a team as a code owner February 11, 2026 19:27
@github-actions github-actions Bot added the levm (Lambda EVM implementation) and performance (Block execution throughput and performance in general) labels Feb 11, 2026
ilitteri added a commit that referenced this pull request Feb 11, 2026
@github-actions

🤖 Kimi Code Review

Review Summary

The PR replaces BTreeMap/BTreeSet with FxHashMap/FxHashSet for EVM accessed storage slots to improve performance. The change is straightforward and appears correct, but there are a few considerations:

Issues Found

  1. Line 739: The initialization of initial_accessed_storage_slots uses FxHashMap::default() which is correct, but there's a missing space after the colon that affects code formatting consistency.

  2. Line 87: The function signature change from BTreeMap<Address, BTreeSet<H256>> to FxHashMap<Address, FxHashSet<H256>> is a breaking change for any external code that calls Substate::from_accesses(). This needs to be documented as a breaking change.

  3. Line 255: The comment mentions "BAL" which appears to be a typo - it should likely be "BLOB" (referring to EIP-4844 blob transactions).

Security & Consensus Considerations

  • Determinism: The change from BTreeMap/BTreeSet (which maintain sorted order) to FxHashMap/FxHashSet (which don't maintain order) could potentially affect consensus if the order of accessed storage slots is used anywhere in state calculation. However, since these are only used for gas accounting (EIP-2929) and not for state root calculation, this should be safe.

  • Hash DoS: FxHashMap uses a non-cryptographic hash function which is faster but potentially vulnerable to hash collision attacks. However, since the keys here are Address and H256 (both fixed-size cryptographic hashes), this risk is minimal.

Performance Impact

The change should provide measurable performance improvements as FxHashMap/FxHashSet have significantly better lookup and insertion performance compared to BTreeMap/BTreeSet for these use cases.

Suggested Changes

// Line 739: Fix formatting
let mut initial_accessed_storage_slots: FxHashMap<Address, FxHashSet<H256>> = FxHashMap::default();

// Line 255: Fix typo in comment
/// Used by SELFDESTRUCT to record storage reads in BLOB per EIP-7928:

Additional Note

The CHANGELOG.md entry has the wrong year (2026 instead of 2025).


Automated review by Kimi (Moonshot AI) · custom prompt

@github-actions

🤖 Claude Code Review


Code Review: PR #6185 - Replace BTreeMap/BTreeSet with FxHashMap/FxHashSet for accessed storage slots

Summary

This is a well-justified performance optimization that replaces ordered data structures with hash-based ones for EVM warm/cold slot tracking. The change is targeted, minimal, and correctly preserves determinism where needed.


Strengths

  1. Performance rationale is sound - The hot path (SLOAD/SSTORE warm/cold checks) benefits from O(1) hash operations instead of O(log n) tree operations. The 4.76% cumulative time reported in the PR's profiling justifies this optimization.

  2. Determinism correctly preserved - The make_access_list() method at vm.rs:197 creates a local BTreeMap<Address, BTreeSet<H256>> to ensure deterministic ordering for access list output, which is critical for consensus.

  3. Type consistency maintained - All related function signatures updated correctly:

    • from_accesses() at vm.rs:88
    • get_accessed_storage_slots() at vm.rs:255
    • initialize() at vm.rs:742
  4. Merge logic compatible - The commit_backup() method at vm.rs:121-126 correctly extends hash sets, which works identically to the previous BTree approach.
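The merge step described in item 4 can be sketched as a per-address set union. The function name `merge_accessed_slots` and the `AccessedSlots` alias are hypothetical, and the standard `HashMap`/`HashSet` stand in for the Fx variants.

```rust
use std::collections::{HashMap, HashSet};

// Stand-ins for the real Address/H256 types (assumption).
type Address = [u8; 20];
type H256 = [u8; 32];
type AccessedSlots = HashMap<Address, HashSet<H256>>;

/// Merges the slots recorded by an inner call (`delta`) into `current`.
/// The result is a per-address union, so iteration order does not matter.
fn merge_accessed_slots(current: &mut AccessedSlots, delta: AccessedSlots) {
    for (address, slots) in delta {
        current.entry(address).or_default().extend(slots);
    }
}
```

Since `extend` on a set is a union, the outcome is the same whether the underlying collection is ordered or hashed.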


🔍 Detailed Analysis

No functional changes

  • Hot path methods (add_accessed_slot, is_slot_accessed) work identically with hash-based collections
  • Parent traversal logic unchanged
  • The SELFDESTRUCT EIP-7928 compliance (get_accessed_storage_slots) works correctly with FxHashSet

Access list construction (vm.rs:196-221)

The critical determinism requirement is met:

let mut entries = BTreeMap::<Address, BTreeSet<H256>>::new();

This local conversion ensures the output Vec<AccessListEntry> is deterministic, which is essential since access lists affect:

  • Transaction replay
  • RPC responses
  • Consensus if used in block production

Hash function safety

FxHashMap/FxHashSet use a non-cryptographic hash (FxHash) which is appropriate here because:

  • Storage slot keys (H256) have sufficient entropy
  • No DoS risk from hash collision attacks (slots come from contract execution, not external input)
  • Performance is more important than DoS resistance in this internal data structure

⚠️ Potential Concerns

Minor: Non-deterministic iteration order (Low severity)

While the code correctly converts to BTreeMap for make_access_list(), any future code that iterates directly over accessed_storage_slots would get non-deterministic ordering. This is acceptable for internal use but worth noting.

Recommendation: Consider adding a documentation comment at vm.rs:74:

/// Storage slots accessed per address (for EIP-2929 warm/cold gas costs).
/// Uses FxHashMap for O(1) lookup performance. Use make_access_list() for deterministic output.
accessed_storage_slots: FxHashMap<Address, FxHashSet<H256>>,

Minor: get_accessed_storage_slots() return type change (Low severity)

The function at vm.rs:255 now returns FxHashSet<H256> instead of BTreeSet<H256>. The caller at crates/vm/levm/src/opcode_handlers/system.rs:624 uses this for SELFDESTRUCT BAL recording. Since BAL recording likely needs determinism, verify that the consumer handles this correctly.

Action item: Check if the BAL (Block Access List) recording needs deterministic slot ordering. If so, the consumer should sort the slots.
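If the consumer does turn out to need a stable order, sorting at the call site is cheap. A minimal sketch, with `sorted_slots` as a hypothetical helper and `H256` as a stand-in byte array:

```rust
use std::collections::HashSet;

// Stand-in for the real H256 type (assumption).
type H256 = [u8; 32];

/// Returns the slots in ascending byte order, independent of the
/// hash set's iteration order.
fn sorted_slots(slots: &HashSet<H256>) -> Vec<H256> {
    let mut out: Vec<H256> = slots.iter().copied().collect();
    out.sort_unstable();
    out
}
```

This keeps the hot-path set unordered and pays the sort only where a deterministic sequence is actually consumed.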


🎯 Verdict

LGTM with minor documentation suggestion

The change is:

  • ✅ Correct
  • ✅ Safe (no security issues)
  • ✅ Well-motivated (measurable performance impact)
  • ✅ Minimal scope
  • ✅ Preserves determinism where needed

Suggested improvements:

  1. Add a comment at vm.rs:74 noting the hash map is for performance and make_access_list() should be used for deterministic output
  2. Verify the SELFDESTRUCT BAL recording consumer can handle non-deterministic FxHashSet iteration (or document that determinism isn't required there)

The optimization is solid and the implementation correctly handles the consensus-critical determinism requirement.


Automated review by Claude (Anthropic) · custom prompt

@github-actions

🤖 Codex Code Review

Findings

  • Potential hash-collision DoS risk: switching accessed_storage_slots to FxHashMap/FxHashSet uses a non‑cryptographic hasher on attacker-controlled keys (storage slots). A malicious contract can potentially craft colliding H256 keys to degrade to worst-case behavior. Consider keeping BTreeMap/BTreeSet here or using a randomized hasher for the map/set (e.g., HashMap with RandomState) if DoS resistance matters. crates/vm/levm/src/vm.rs:74,88,224-236,255-266
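For reference, the randomized-hasher alternative mentioned in this finding might be declared as follows. This is only a sketch: `SeededSlots`, `new_seeded_slots`, and `mark_warm` are hypothetical names, and `Address`/`H256` are stand-in byte arrays. `RandomState` is already the default hasher of the standard `HashMap`; it is spelled out here to make the choice explicit.

```rust
use std::collections::hash_map::RandomState;
use std::collections::{HashMap, HashSet};

// Stand-ins for the real Address/H256 types (assumption).
type Address = [u8; 20];
type H256 = [u8; 32];

/// Warm-slot map with a randomly seeded hasher on both levels.
type SeededSlots = HashMap<Address, HashSet<H256, RandomState>, RandomState>;

/// Creates an empty map whose hasher keys are chosen at runtime.
fn new_seeded_slots() -> SeededSlots {
    HashMap::with_hasher(RandomState::new())
}

/// Marks a slot as warm. Returns true if the slot was cold before the call.
fn mark_warm(slots: &mut SeededSlots, address: Address, key: H256) -> bool {
    slots.entry(address).or_default().insert(key)
}
```

The trade-off is that a seeded hasher makes precomputed collisions impractical at a higher per-lookup hashing cost, which is the cost the PR is trying to remove.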

Notes

  • Determinism looks preserved for BAL output because record_storage_read stores into BTreeSet and RLPEncode sorts; iteration order from the hash set should not affect consensus outputs.

If you’ve already assessed collision risk as acceptable for this hot path, I’m fine with the change.


Automated review by OpenAI Codex · custom prompt

@greptile-apps

greptile-apps Bot commented Feb 11, 2026

Greptile Overview

Greptile Summary

This PR optimizes LEVM’s EIP-2929 warm/cold slot tracking by switching Substate.accessed_storage_slots from BTreeMap<Address, BTreeSet<H256>> to FxHashMap<Address, FxHashSet<H256>>, reducing per-SLOAD/SSTORE lookup/insert overhead on the hot path. To preserve deterministic access list output, make_access_list() still constructs a local BTreeMap/BTreeSet when materializing the final Vec<AccessListEntry>.

Key concerns before merge:

Confidence Score: 3/5

  • Mostly safe, but has a correctness/documentation issue to fix before merge.
  • Core map/set swap is localized and make_access_list() preserves deterministic output, but the changelog PR link is wrong and a public API now returns an unordered set which can cause non-deterministic downstream behavior if relied upon.
  • CHANGELOG.md and crates/vm/levm/src/vm.rs (get_accessed_storage_slots return type/usage determinism).

Important Files Changed

| Filename | Overview |
|---|---|
| CHANGELOG.md | Adds a perf changelog entry, but the PR number/link is incorrect (#6184 vs #6185). |
| crates/vm/levm/src/vm.rs | Switches accessed storage slot tracking from BTreeMap/BTreeSet to FxHashMap/FxHashSet and adjusts access list building; note public API now returns an unordered set which can break deterministic downstream uses. |

Sequence Diagram

sequenceDiagram
  participant VM as VM
  participant Sub as Substate
  participant Parent as Parent Substate (checkpoint)

  Note over VM,Sub: Transaction start
  VM->>Sub: initialize(env, tx)
  Sub->>Sub: seed accessed_addresses (origin, coinbase, precompiles)
  Sub->>Sub: seed accessed_storage_slots from tx.access_list

  Note over VM,Sub: During opcode execution (SLOAD/SSTORE)
  VM->>Sub: is_slot_accessed(address, key)
  Sub->>Sub: accessed_storage_slots.get(address).contains(key)
  alt not in current
    Sub->>Parent: is_slot_accessed(address, key) (recursive)
  end

  VM->>Sub: add_accessed_slot(address, key)
  Sub->>Sub: accessed_storage_slots.entry(address).or_default().insert(key)

  Note over VM,Sub: Nested call / checkpointing
  VM->>Sub: push_backup()
  VM->>Sub: (mutations during inner call)
  alt inner call succeeds
    VM->>Sub: commit_backup()
    Sub->>Sub: merge delta.accessed_storage_slots into current (extend sets)
  else inner call fails
    VM->>Sub: revert_backup()
  end

  Note over VM,Sub: Access list generation (once per tx)
  VM->>Sub: make_access_list()
  Sub->>Sub: iterate substates and merge into local BTreeMap/BTreeSet
  Sub-->>VM: Vec<AccessListEntry> (deterministic order)
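The checkpoint and parent-lookup flow in the diagram can be sketched as a chain of layers. This is a simplified model under assumptions: the field `parent`, the `Box`-based chaining, and the stand-in `Address`/`H256` types are illustrative, `commit_backup` is omitted, and the standard collections replace the Fx variants.

```rust
use std::collections::{HashMap, HashSet};

// Stand-ins for the real Address/H256 types (assumption).
type Address = [u8; 20];
type H256 = [u8; 32];

#[derive(Default)]
struct Substate {
    // Hypothetical representation of the checkpointed parent layer.
    parent: Option<Box<Substate>>,
    accessed_storage_slots: HashMap<Address, HashSet<H256>>,
}

impl Substate {
    /// Starts a checkpoint: the current state becomes the parent of a fresh layer.
    fn push_backup(&mut self) {
        let parent = std::mem::take(self);
        self.parent = Some(Box::new(parent));
    }

    /// Records a slot as warm in the current layer only.
    fn add_accessed_slot(&mut self, address: Address, key: H256) {
        self.accessed_storage_slots
            .entry(address)
            .or_default()
            .insert(key);
    }

    /// Checks the current layer first, then recurses into the parent chain.
    fn is_slot_accessed(&self, address: &Address, key: &H256) -> bool {
        let here = self
            .accessed_storage_slots
            .get(address)
            .map_or(false, |slots| slots.contains(key));
        here || self
            .parent
            .as_ref()
            .map_or(false, |p| p.is_slot_accessed(address, key))
    }

    /// Discards the current layer and restores the parent, if any.
    fn revert_backup(&mut self) {
        if let Some(parent) = self.parent.take() {
            *self = *parent;
        }
    }
}
```

A slot warmed inside a reverted inner call becomes cold again, while slots warmed before the checkpoint stay visible through the parent chain.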


@greptile-apps greptile-apps Bot left a comment


2 files reviewed, 2 comments


Comment thread CHANGELOG.md
Comment thread crates/vm/levm/src/vm.rs
Comment on lines 252 to +256
  /// Returns all accessed storage slots for a given address.
  /// Used by SELFDESTRUCT to record storage reads in BAL per EIP-7928:
  /// "SELFDESTRUCT: Include modified/read storage keys as storage_read"
- pub fn get_accessed_storage_slots(&self, address: &Address) -> BTreeSet<H256> {
-     let mut slots = BTreeSet::new();
+ pub fn get_accessed_storage_slots(&self, address: &Address) -> FxHashSet<H256> {
+     let mut slots = FxHashSet::default();

Return type change breaks callers

get_accessed_storage_slots now returns FxHashSet<H256> (was BTreeSet<H256>). Any caller that relied on deterministic ordering (common for receipts/tracing/encoding) will now see non-deterministic iteration order. If the result is later serialized/hashed/compared, this can cause flaky tests or consensus-sensitive differences. Consider either keeping the public API ordered (return BTreeSet/sorted Vec) or sorting at the call site(s) where determinism is required.


Contributor

Copilot AI left a comment


Pull request overview

This PR optimizes LEVM’s EIP-2929 warm/cold storage-slot tracking by switching the in-memory data structure in Substate from ordered trees to faster hash-based collections, while preserving deterministic ordering where it matters (access list output).

Changes:

  • Replace BTreeMap<Address, BTreeSet<H256>> with FxHashMap<Address, FxHashSet<H256>> for Substate::accessed_storage_slots.
  • Update initialization and helper APIs (from_accesses, get_accessed_storage_slots) to use the new hash-based types.
  • Add a changelog entry under Perf documenting the optimization.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
|---|---|
| crates/vm/levm/src/vm.rs | Switch accessed storage-slot tracking to FxHashMap/FxHashSet; keep make_access_list() deterministic via local BTreeMap/BTreeSet. |
| CHANGELOG.md | Document the performance optimization in the Perf section. |


@github-actions

github-actions Bot commented Feb 11, 2026

Benchmark Results Comparison

No significant difference was registered for any benchmark run.

Detailed Results

Benchmark Results: BubbleSort

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|---|---|---|---|---|
| main_revm_BubbleSort | 2.947 ± 0.021 | 2.921 | 2.992 | 1.08 ± 0.01 |
| main_levm_BubbleSort | 2.742 ± 0.020 | 2.719 | 2.787 | 1.00 ± 0.01 |
| pr_revm_BubbleSort | 2.940 ± 0.012 | 2.923 | 2.963 | 1.07 ± 0.01 |
| pr_levm_BubbleSort | 2.739 ± 0.012 | 2.721 | 2.756 | 1.00 |

Benchmark Results: ERC20Approval

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| main_revm_ERC20Approval | 975.4 ± 9.7 | 966.0 | 999.8 | 1.00 |
| main_levm_ERC20Approval | 1044.5 ± 8.4 | 1029.0 | 1056.3 | 1.07 ± 0.01 |
| pr_revm_ERC20Approval | 982.0 ± 7.1 | 976.3 | 995.8 | 1.01 ± 0.01 |
| pr_levm_ERC20Approval | 1039.6 ± 5.1 | 1035.4 | 1051.8 | 1.07 ± 0.01 |

Benchmark Results: ERC20Mint

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| main_revm_ERC20Mint | 135.2 ± 2.6 | 133.0 | 139.9 | 1.00 ± 0.02 |
| main_levm_ERC20Mint | 158.8 ± 2.5 | 155.4 | 163.1 | 1.18 ± 0.02 |
| pr_revm_ERC20Mint | 135.1 ± 1.6 | 133.5 | 138.6 | 1.00 |
| pr_levm_ERC20Mint | 158.9 ± 2.9 | 155.3 | 162.8 | 1.18 ± 0.03 |

Benchmark Results: ERC20Transfer

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| main_revm_ERC20Transfer | 232.8 ± 3.1 | 229.4 | 237.3 | 1.00 |
| main_levm_ERC20Transfer | 270.2 ± 3.2 | 265.9 | 276.3 | 1.16 ± 0.02 |
| pr_revm_ERC20Transfer | 235.1 ± 1.4 | 232.5 | 237.2 | 1.01 ± 0.01 |
| pr_levm_ERC20Transfer | 268.9 ± 3.2 | 263.0 | 273.5 | 1.16 ± 0.02 |

Benchmark Results: Factorial

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| main_revm_Factorial | 231.0 ± 4.4 | 228.1 | 242.8 | 1.01 ± 0.02 |
| main_levm_Factorial | 259.0 ± 5.7 | 254.3 | 272.7 | 1.13 ± 0.03 |
| pr_revm_Factorial | 228.8 ± 1.3 | 226.9 | 231.3 | 1.00 |
| pr_levm_Factorial | 258.6 ± 1.2 | 257.0 | 260.3 | 1.13 ± 0.01 |

Benchmark Results: FactorialRecursive

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|---|---|---|---|---|
| main_revm_FactorialRecursive | 1.659 ± 0.039 | 1.593 | 1.724 | 1.00 |
| main_levm_FactorialRecursive | 8.639 ± 0.059 | 8.546 | 8.706 | 5.21 ± 0.13 |
| pr_revm_FactorialRecursive | 1.693 ± 0.018 | 1.662 | 1.722 | 1.02 ± 0.03 |
| pr_levm_FactorialRecursive | 8.545 ± 0.062 | 8.474 | 8.652 | 5.15 ± 0.13 |

Benchmark Results: Fibonacci

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| main_revm_Fibonacci | 213.7 ± 6.4 | 209.2 | 226.8 | 1.01 ± 0.03 |
| main_levm_Fibonacci | 239.1 ± 8.9 | 231.1 | 258.1 | 1.13 ± 0.04 |
| pr_revm_Fibonacci | 211.3 ± 1.3 | 210.2 | 214.8 | 1.00 |
| pr_levm_Fibonacci | 239.5 ± 11.5 | 231.5 | 263.8 | 1.13 ± 0.05 |

Benchmark Results: FibonacciRecursive

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| main_revm_FibonacciRecursive | 854.0 ± 15.9 | 833.5 | 893.9 | 1.24 ± 0.03 |
| main_levm_FibonacciRecursive | 696.5 ± 9.1 | 685.7 | 719.4 | 1.01 ± 0.02 |
| pr_revm_FibonacciRecursive | 864.0 ± 16.0 | 848.1 | 895.1 | 1.25 ± 0.03 |
| pr_levm_FibonacciRecursive | 690.5 ± 8.6 | 679.2 | 707.6 | 1.00 |

Benchmark Results: ManyHashes

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| main_revm_ManyHashes | 8.4 ± 0.0 | 8.3 | 8.5 | 1.00 |
| main_levm_ManyHashes | 9.7 ± 0.1 | 9.6 | 9.8 | 1.15 ± 0.01 |
| pr_revm_ManyHashes | 8.6 ± 0.4 | 8.3 | 9.3 | 1.02 ± 0.04 |
| pr_levm_ManyHashes | 9.9 ± 0.1 | 9.7 | 10.1 | 1.18 ± 0.02 |

Benchmark Results: MstoreBench

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| main_revm_MstoreBench | 267.5 ± 1.9 | 265.9 | 272.0 | 1.19 ± 0.02 |
| main_levm_MstoreBench | 226.4 ± 3.3 | 220.4 | 232.0 | 1.01 ± 0.02 |
| pr_revm_MstoreBench | 272.0 ± 6.0 | 266.7 | 287.7 | 1.21 ± 0.03 |
| pr_levm_MstoreBench | 225.2 ± 3.2 | 216.9 | 229.5 | 1.00 |

Benchmark Results: Push

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| main_revm_Push | 295.0 ± 0.7 | 293.6 | 296.1 | 1.02 ± 0.01 |
| main_levm_Push | 290.4 ± 2.0 | 287.5 | 292.7 | 1.00 |
| pr_revm_Push | 296.9 ± 1.2 | 294.9 | 298.6 | 1.02 ± 0.01 |
| pr_levm_Push | 292.1 ± 2.9 | 289.2 | 298.8 | 1.01 ± 0.01 |

Benchmark Results: SstoreBench_no_opt

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| main_revm_SstoreBench_no_opt | 168.5 ± 2.9 | 164.5 | 172.0 | 1.56 ± 0.03 |
| main_levm_SstoreBench_no_opt | 108.5 ± 0.6 | 107.5 | 109.0 | 1.00 ± 0.01 |
| pr_revm_SstoreBench_no_opt | 170.4 ± 4.9 | 164.4 | 182.2 | 1.58 ± 0.05 |
| pr_levm_SstoreBench_no_opt | 108.0 ± 0.7 | 107.2 | 109.0 | 1.00 |

@github-actions

github-actions Bot commented Feb 11, 2026

Benchmark Block Execution Results Comparison Against Main

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|---|---|---|---|---|
| base | 65.516 ± 0.544 | 64.902 | 66.283 | 1.01 ± 0.01 |
| head | 65.043 ± 0.414 | 64.606 | 65.886 | 1.00 |

@github-project-automation github-project-automation Bot moved this to In Review in ethrex_l1 Feb 12, 2026
…ge slots

The EVM Substate tracked accessed storage slots (EIP-2929) using
BTreeMap<Address, BTreeSet<H256>>, which has O(log n) lookup and insert.
Since these are called on every SLOAD/SSTORE, replacing with
FxHashMap<Address, FxHashSet<H256>> gives O(1) amortized operations.

The cold-path make_access_list still uses local BTreeMap/BTreeSet
for deterministic ordering in its output.
@ilitteri ilitteri force-pushed the perf/flatten-accessed-storage-slots branch from 9e52403 to 6a5c343 Compare February 23, 2026 20:48
@ilitteri ilitteri closed this Feb 24, 2026
@github-project-automation github-project-automation Bot moved this from In Review to Done in ethrex_l1 Feb 24, 2026
@github-project-automation github-project-automation Bot moved this from Todo to Done in ethrex_performance Feb 24, 2026
