core: extend code read statistics with cache and unique metrics #33522

Closed
CPerezz wants to merge 2 commits into ethereum:master from CPerezz:extended-code-stats

@CPerezz
Contributor

@CPerezz CPerezz commented Jan 3, 2026

Summary

This PR extends the execution statistics infrastructure introduced in #33442 with:

  • Code cache hit/miss tracking (completing parity with account/storage cache stats)
  • Code bytes read metric
  • Unique state access metrics (accounts, storage slots, contracts executed)

Additions

1. Code Cache Hit/Miss Tracking

Tracks code cache efficiency, complementing existing account and storage cache statistics:

  • Added CodeCacheHit and CodeCacheMiss to ReaderStats
  • Reports cache hit rate in slow block logs alongside account/storage stats
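
A minimal sketch of how such a hit rate could be derived from the two counters. The field names CodeCacheHit and CodeCacheMiss come from the description above; the struct layout, the integer types, and the HitRate helper are assumptions made for illustration and are not taken from the geth source.

```go
package main

import "fmt"

// ReaderStats is a simplified, hypothetical stand-in for the stats struct
// named in the PR description; only the code-cache counters are modelled.
type ReaderStats struct {
	CodeCacheHit  int64
	CodeCacheMiss int64
}

// HitRate returns the cache hit rate as a percentage in [0, 100].
// It returns 0 when no lookups were recorded, avoiding a division by zero.
func HitRate(hit, miss int64) float64 {
	total := hit + miss
	if total == 0 {
		return 0
	}
	return float64(hit) / float64(total) * 100
}

func main() {
	s := ReaderStats{CodeCacheHit: 4, CodeCacheMiss: 0}
	fmt.Printf("code: hit: %d, miss: %d, rate: %.2f\n",
		s.CodeCacheHit, s.CodeCacheMiss, HitRate(s.CodeCacheHit, s.CodeCacheMiss))
}
```

With 4 hits and 0 misses this prints a rate of 100.00, the same shape as the code line in the reader statistics of the example output.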

2. Code Bytes Read

Tracks the actual volume of contract bytecode read:

  • Added CodeBytesRead field to track total bytes loaded
  • Enhanced slow block log to show bytes alongside count
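
As a rough illustration of the count-plus-bytes reporting, the sketch below accumulates the length of each bytecode blob that is read and renders the total in binary units. The codeReadStats type, its record method, and the formatBytes helper are hypothetical names chosen for this example; the actual PR may accumulate and format the value differently.

```go
package main

import "fmt"

// codeReadStats is a hypothetical accumulator: Reads counts code loads and
// BytesRead sums the length of every bytecode blob that was loaded.
type codeReadStats struct {
	Reads     int
	BytesRead int
}

// record registers one code read of the given bytecode.
func (s *codeReadStats) record(code []byte) {
	s.Reads++
	s.BytesRead += len(code)
}

// formatBytes renders a byte count using binary units (B, KiB, MiB).
func formatBytes(n int) string {
	switch {
	case n >= 1<<20:
		return fmt.Sprintf("%.2f MiB", float64(n)/(1<<20))
	case n >= 1<<10:
		return fmt.Sprintf("%.2f KiB", float64(n)/(1<<10))
	default:
		return fmt.Sprintf("%d B", n)
	}
}

func main() {
	var s codeReadStats
	s.record(make([]byte, 512))
	s.record(make([]byte, 1536))
	// Mirrors the "(count, size)" suffix used in the slow block log.
	fmt.Printf("Code read: (%d, %s)\n", s.Reads, formatBytes(s.BytesRead))
}
```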

3. Unique State Access Metrics

Tracks the diversity of state access during block execution:

  • UniqueAccountsAccessed - Number of distinct accounts touched
  • UniqueStorageAccessed - Number of distinct storage slots accessed
  • UniqueCodeExecuted - Number of distinct contracts executed
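
One way to count distinct accounts and storage slots is to keep dedicated sets for the duration of a block. The sketch below shows that idea only; the address, hash, slotKey, and uniqueAccess types are hypothetical and self-contained, and it does not reproduce how the PR wires the counters into StateDB.

```go
package main

import "fmt"

// Fixed-size stand-ins for an account address and a 32-byte storage key.
type address [20]byte
type hash [32]byte

// slotKey identifies a storage slot by its owning account and slot key.
type slotKey struct {
	addr address
	slot hash
}

// uniqueAccess is a hypothetical per-block tracker built on two sets.
type uniqueAccess struct {
	accounts map[address]struct{}
	slots    map[slotKey]struct{}
}

func newUniqueAccess() *uniqueAccess {
	return &uniqueAccess{
		accounts: make(map[address]struct{}),
		slots:    make(map[slotKey]struct{}),
	}
}

// touchAccount records an account access; repeated touches are deduplicated.
func (u *uniqueAccess) touchAccount(a address) {
	u.accounts[a] = struct{}{}
}

// touchSlot records a storage slot access, keyed by (account, slot).
func (u *uniqueAccess) touchSlot(a address, s hash) {
	u.slots[slotKey{addr: a, slot: s}] = struct{}{}
}

func main() {
	u := newUniqueAccess()
	a, b := address{1}, address{2}

	u.touchAccount(a)
	u.touchAccount(a) // duplicate, not counted again
	u.touchAccount(b)
	u.touchSlot(a, hash{1})
	u.touchSlot(a, hash{1}) // duplicate, not counted again

	fmt.Println("Unique state access:")
	fmt.Printf("    Accounts: %d\n", len(u.accounts))
	fmt.Printf("    Storage slots: %d\n", len(u.slots))
}
```

Because the sets here are owned by the tracker itself, the counts do not depend on the lifetime of any other per-account object.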

Example Output

Running geth with --debug.logslowblock 1ns shows the new metrics:

INFO ########## SLOW BLOCK #########
INFO "Block: 1 (0x1d5ae6...) txs: 1, mgasps: 1142.80, elapsed: 205.168µs"
INFO "EVM execution: 46.789µs"
INFO "Validation: 5.542µs"
INFO "State read: 16.628µs"
INFO "    Account read: 5.25µs(10)"
INFO "    Storage read: 10.752µs(11)"
INFO     Code read: 626ns(4, 1.07 KiB)              <-- NEW: bytes read
INFO Unique state access:                           <-- NEW SECTION
INFO     Accounts: 6
INFO     Storage slots: 11
INFO     Contracts executed: 5 (4 system)
INFO "State hash: 48.332µs"
INFO "    Account hash: 8.416µs"
INFO "    Storage hash: 11.666µs"
INFO "    Trie commit: 28.25µs"
INFO "DB write: 55.292µs"
INFO "    State write: 25.917µs"
INFO "    Block write: 29.375µs"
INFO Reader statistics
INFO account: hit: 4, miss: 6, rate: 40.00
INFO storage: hit: 0, miss: 11, rate: 0.00
INFO code: hit: 4, miss: 0, rate: 100.00            <-- NEW: code cache stats
INFO ##############################

I've run some tests locally and it seems to work fine. That said, I'm open to suggestions on how to test this better, so that we can be sure what's logged is correct!

@CPerezz CPerezz requested a review from rjl493456442 as a code owner January 3, 2026 22:02
This extends the execution statistics infrastructure from ethereum#33442 with:

- Code cache hit/miss tracking (completing parity with account/storage)
- Code bytes read metric for I/O volume analysis
- Unique state access metrics (accounts, storage slots, contracts)

The slow block log now shows:

    Code read: 626ns(4, 1.07 KiB)

    Unique state access:
        Accounts: 6
        Storage slots: 11
        Contracts executed: 4

    Reader statistics
    account: hit: 4, miss: 6, rate: 40.00
    storage: hit: 0, miss: 11, rate: 0.00
    code: hit: 4, miss: 0, rate: 100.00
@CPerezz CPerezz force-pushed the extended-code-stats branch from a727f31 to be2d430 Compare January 3, 2026 22:19
Extend the unique code execution tracking to distinguish between system
contract calls (EIP-4788 beacon root, EIP-2935 history storage, EIP-7002
withdrawal queue, EIP-7251 consolidation queue) and user contract calls.

The MarkCodeExecuted method now takes an isSystem parameter to track
system contracts separately. This is useful for understanding block
execution statistics, as system contracts are called automatically by
the protocol at the start of each block.

The slow block log now shows "Contracts executed: N (M system)" to
make it clear how many of the executed contracts were system calls.
@CPerezz
Contributor Author

CPerezz commented Jan 3, 2026

b9a50ae might be a bit too invasive. But I thought it was useful to make the distinction between system contracts and regular ones. It can of course be removed.

Comment thread core/vm/evm.go
codeHash := evm.resolveCodeHash(addr)
contract.SetCallCode(codeHash, evm.resolveCode(addr))
// Track unique contract execution for metrics
evm.StateDB.MarkCodeExecuted(codeHash, isSystemCall(caller))
Member

So you want to differentiate between (a) contract code that is loaded and run and (b) contract code that is only read via EXT*?

Any particular reason to capture these statistics?

Contributor Author

@CPerezz CPerezz Jan 4, 2026

Yes, the distinction is intentional:

  • CodeLoaded / CodeReads / CodeBytesRead - Count all code reads, including both execution and EXT* opcodes (EXTCODECOPY, EXTCODESIZE, EXTCODEHASH). These were already in the original PR, core: add code read statistics #33442.
  • UniqueCodeExecuted - Counts only contracts whose code was actually executed via CALL/CALLCODE/DELEGATECALL/STATICCALL. This is the new metric added in this PR.

The reason for tracking unique executed contracts separately:

  • Block complexity analysis - Knowing how many distinct contracts were invoked helps understand block execution complexity. A block with 100 transactions calling 5 different contracts behaves differently than one calling 100 different contracts (more cold code loads, more diverse execution paths).
  • System vs user contract distinction - The SystemCodeExecuted count helps explain why a simple user transaction might show 5 contracts executed (4 system contracts from EIP-4788, EIP-2935, EIP-7002, EIP-7251 + 1 user contract). Without this distinction, the metric could be confusing.
  • Cache effectiveness insight - Combined with code cache hit/miss rates, knowing unique contracts executed helps understand cache pressure.

It also allows a deeper analysis of block execution and the internal DB burden, which is of interest for the repricings.

I also wanted to know if it would be OK to enable exporting these block logs to a file, so that we can collect them and compare across clients.

@rjl493456442
Member

rjl493456442 commented Jan 22, 2026

Check out #33659 as the alternative.

@rjl493456442
Member

I don't like the way UniqueCodeExecuted is implemented. Whether the code is executed or not depends on the consumer side (EVM), rather than the producer side (StateDB).

We can easily track in the EVM how many contracts have been executed and how many of them are system contracts, like this hack:

diff --git a/core/vm/evm.go b/core/vm/evm.go
index 25a3318c02..f39100b764 100644
--- a/core/vm/evm.go
+++ b/core/vm/evm.go
@@ -79,6 +79,11 @@ type TxContext struct {
 	AccessEvents *state.AccessEvents // Capture all state accesses for this tx
 }
 
+type EVMStats struct {
+	ContractExecution       int
+	SystemContractExecution int
+}
+
 // EVM is the Ethereum Virtual Machine base object and provides
 // the necessary tools to run a contract on the given state with
 // the provided context. It should be noted that any error
@@ -130,6 +135,8 @@ type EVM struct {
 
 	readOnly   bool   // Whether to throw on stateful modifications
 	returnData []byte // Last CALL's return data for subsequent reuse
+
+	stats EVMStats
 }
 
 // NewEVM constructs an EVM instance with the supplied block context, state
diff --git a/core/vm/interpreter.go b/core/vm/interpreter.go
index 52dbe83d86..36ca636de7 100644
--- a/core/vm/interpreter.go
+++ b/core/vm/interpreter.go
@@ -114,6 +114,11 @@ func (evm *EVM) Run(contract *Contract, input []byte, readOnly bool) (ret []byte
 	if len(contract.Code) == 0 {
 		return nil, nil
 	}
+	if contract.IsSystemCall {
+		evm.stats.SystemContractExecution++
+	} else {
+		evm.stats.ContractExecution++
+	}
 
 	var (
 		op          OpCode     // current opcode
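
For readers who want to try the idea outside the geth tree, here is a self-contained sketch of the bookkeeping in the diff above. The evmStats fields and the IsSystemCall flag follow the diff; the contract and evm types are minimal stand-ins invented for this example, and run imitates only the counting, not the interpreter loop.

```go
package main

import "fmt"

// evmStats mirrors the EVMStats struct from the diff.
type evmStats struct {
	ContractExecution       int
	SystemContractExecution int
}

// contract and evm are minimal hypothetical stand-ins for the real types.
type contract struct {
	Code         []byte
	IsSystemCall bool
}

type evm struct {
	stats evmStats
}

// run imitates the bookkeeping added to Run in the diff: contracts with
// empty code return early and are not counted at all.
func (e *evm) run(c *contract) {
	if len(c.Code) == 0 {
		return
	}
	if c.IsSystemCall {
		e.stats.SystemContractExecution++
	} else {
		e.stats.ContractExecution++
	}
}

func main() {
	e := &evm{}
	user := &contract{Code: []byte{0x00}}

	e.run(&contract{Code: []byte{0x00}, IsSystemCall: true})
	e.run(user)
	e.run(user)        // same contract again: counted a second time
	e.run(&contract{}) // empty code: not counted

	fmt.Printf("user: %d, system: %d\n",
		e.stats.ContractExecution, e.stats.SystemContractExecution)
}
```

Written this way, the counters record every execution rather than distinct contracts, so a contract called twice contributes two to the total.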

@rjl493456442
Member

UniqueAccountsAccessed - Number of distinct accounts touched
UniqueStorageAccessed - Number of distinct storage slots accessed

Don't they overlap with the counters of account reads and storage reads? Within a single state transition, each unique state is read only once and then cached until the end of the block.

Also, the way you implemented it is wrong. If an account is self-destructed, the object is removed from the stateObject set at the transaction boundary.

@CPerezz
Contributor Author

CPerezz commented Jan 22, 2026

UniqueAccountsAccessed - Number of distinct accounts touched
UniqueStorageAccessed - Number of distinct storage slots accessed

Don't they overlap with the counters of account reads and storage reads? Within a single state transition, each unique state is read only once and then cached until the end of the block.

Also, the way you implemented it is wrong. If an account is self-destructed, the object is removed from the stateObject set at the transaction boundary.

Ahh that's true. My bad. We can close this then in favour of #33659

Thanks for taking care of it!

@CPerezz CPerezz closed this Jan 22, 2026
rjl493456442 added a commit that referenced this pull request Jan 28, 2026
…3655)

Implement standardized JSON format for slow block logging to enable
cross-client performance analysis and protocol research.

This change is part of the Cross-Client Execution Metrics initiative
proposed by Gary Rong: https://hackmd.io/dg7rizTyTXuCf2LSa2LsyQ

The standardized metrics enabled data-driven analysis like the EIP-7907
research: https://ethresear.ch/t/data-driven-analysis-on-eip-7907/23850

JSON format includes:
- block: number, hash, gas_used, tx_count
- timing: execution_ms, total_ms
- throughput: mgas_per_sec
- state_reads: accounts, storage_slots, bytecodes, code_bytes
- state_writes: accounts, storage_slots, bytecodes
- cache: account/storage/code hits, misses, hit_rate


This should come after merging #33522

---------

Co-authored-by: Gary Rong <garyrong0905@gmail.com>
CPerezz added a commit to CPerezz/go-ethereum that referenced this pull request Feb 25, 2026
…hereum#33655)
