core: extend code read statistics with cache and unique metrics #33522
CPerezz wants to merge 2 commits into ethereum:master
This extends the execution statistics infrastructure from ethereum#33442 with:
- Code cache hit/miss tracking (completing parity with account/storage)
- Code bytes read metric for I/O volume analysis
- Unique state access metrics (accounts, storage slots, contracts)

The slow block log now shows:

    Code read: 626ns(4, 1.07 KiB)
    Unique state access:
      Accounts: 6
      Storage slots: 11
      Contracts executed: 4
    Reader statistics
      account: hit: 4, miss: 6, rate: 40.00
      storage: hit: 0, miss: 11, rate: 0.00
      code: hit: 4, miss: 0, rate: 100.00
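The rate column in the reader statistics above is consistent with hits divided by total lookups, expressed as a percentage (4 of 10 gives 40.00). A minimal sketch of that arithmetic follows; `cacheStats` and `Rate` are hypothetical names for illustration, not the types used in geth.

```go
package main

import "fmt"

// cacheStats is a hypothetical pair of counters used only for this sketch;
// it is not the ReaderStats type from the PR.
type cacheStats struct {
	Hit, Miss int64
}

// Rate returns the hit rate as a percentage, or 0 when no lookups happened.
func (s cacheStats) Rate() float64 {
	total := s.Hit + s.Miss
	if total == 0 {
		return 0
	}
	return float64(s.Hit) * 100 / float64(total)
}

func main() {
	// Counter values copied from the sample log in the PR description.
	rows := []struct {
		name string
		s    cacheStats
	}{
		{"account", cacheStats{Hit: 4, Miss: 6}},
		{"storage", cacheStats{Hit: 0, Miss: 11}},
		{"code", cacheStats{Hit: 4, Miss: 0}},
	}
	for _, r := range rows {
		fmt.Printf("%s: hit: %d, miss: %d, rate: %.2f\n", r.name, r.s.Hit, r.s.Miss, r.s.Rate())
	}
}
```

Guarding the zero-lookup case avoids a division by zero for blocks that never touch a given cache.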
Extend the unique code execution tracking to distinguish between system contract calls (EIP-4788 beacon root, EIP-2935 history storage, EIP-7002 withdrawal queue, EIP-7251 consolidation queue) and user contract calls. The MarkCodeExecuted method now takes an isSystem parameter to track system contracts separately. This is useful for understanding block execution statistics, as system contracts are called automatically by the protocol at the start of each block. The slow block log now shows "Contracts executed: N (M system)" to make it clear how many of the executed contracts were system calls.
b9a50ae might be a bit too invasive, but I thought it was useful to distinguish between system contracts and regular ones. It can of course be removed.
    codeHash := evm.resolveCodeHash(addr)
    contract.SetCallCode(codeHash, evm.resolveCode(addr))
    // Track unique contract execution for metrics
    evm.StateDB.MarkCodeExecuted(codeHash, isSystemCall(caller))
So you want to differentiate between (a) loading and running contract code and (b) purely reading contract code via EXT*?
Any particular reason to capture these statistics?
Yes, the distinction is intentional:
- CodeLoaded / CodeReads / CodeBytesRead - Counts all code reads, including both execution and EXT* opcodes (EXTCODECOPY, EXTCODESIZE, EXTCODEHASH). This was already in the original core: add code read statistics #33442.
- UniqueCodeExecuted counts only contracts whose code was actually executed via CALL/CALLCODE/DELEGATECALL/STATICCALL. This is the new metric added in this PR.
The reason for tracking unique executed contracts separately:
- Block complexity analysis - Knowing how many distinct contracts were invoked helps understand block execution complexity. A block with 100 transactions calling 5 different contracts behaves differently than one calling 100 different contracts (more cold code loads, more diverse execution paths).
- System vs user contract distinction - The SystemCodeExecuted count helps explain why a simple user transaction might show 5 contracts executed (4 system contracts from EIP-4788, EIP-2935, EIP-7002, EIP-7251 + 1 user contract). Without this distinction, the metric could be confusing.
- Cache effectiveness insight - Combined with code cache hit/miss rates, knowing unique contracts executed helps understand cache pressure.
It also allows a deeper analysis of the block execution and internal DB burden which is of interest for the repricings.
Beyond that, I also wanted to know whether it would be OK to enable exporting these block logs to a file, so that we can collect them and compare across clients.
Check out #33659 as the alternative.
I don't like the way We can easily track how many codes have been executed and how many of them are system contracts in EVM, like this hack
diff --git a/core/vm/evm.go b/core/vm/evm.go
index 25a3318c02..f39100b764 100644
--- a/core/vm/evm.go
+++ b/core/vm/evm.go
@@ -79,6 +79,11 @@ type TxContext struct {
 	AccessEvents *state.AccessEvents // Capture all state accesses for this tx
 }

+type EVMStats struct {
+	ContractExecution       int
+	SystemContractExecution int
+}
+
 // EVM is the Ethereum Virtual Machine base object and provides
 // the necessary tools to run a contract on the given state with
 // the provided context. It should be noted that any error
@@ -130,6 +135,8 @@ type EVM struct {

 	readOnly   bool   // Whether to throw on stateful modifications
 	returnData []byte // Last CALL's return data for subsequent reuse
+
+	stats EVMStats
 }

 // NewEVM constructs an EVM instance with the supplied block context, state
diff --git a/core/vm/interpreter.go b/core/vm/interpreter.go
index 52dbe83d86..36ca636de7 100644
--- a/core/vm/interpreter.go
+++ b/core/vm/interpreter.go
@@ -114,6 +114,11 @@ func (evm *EVM) Run(contract *Contract, input []byte, readOnly bool) (ret []byte
 	if len(contract.Code) == 0 {
 		return nil, nil
 	}
+	if contract.IsSystemCall {
+		evm.stats.SystemContractExecution++
+	} else {
+		evm.stats.ContractExecution++
+	}

 	var (
 		op OpCode // current opcode
Don't they overlap with the counters of account reads and storage reads? Within a single state transition, each unique state will only be read once and then be cached until the end of the block. Also, the way you implemented it is wrong. If the account is self-destructed, the object will be removed from the
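One way to read the overlap argument above: if every state item is loaded from the backend once and then served from a cache for the rest of the block, the number of backend loads already equals the number of unique items accessed. The sketch below illustrates that with a hypothetical read-through cache; `cachingReader` and its fields are assumptions for illustration and do not model geth's reader or its self-destruct handling.

```go
package main

import "fmt"

// cachingReader is a hypothetical read-through cache that lives for one block.
type cachingReader struct {
	backend map[string]int // stands in for the underlying database
	cache   map[string]int // items already loaded during this block
	Hits    int            // reads served from the cache
	Misses  int            // reads that had to load from the backend
}

func newCachingReader(backend map[string]int) *cachingReader {
	return &cachingReader{backend: backend, cache: make(map[string]int)}
}

// Read returns the value for key, loading it from the backend only the
// first time the key is seen.
func (r *cachingReader) Read(key string) int {
	if v, ok := r.cache[key]; ok {
		r.Hits++
		return v
	}
	r.Misses++
	v := r.backend[key]
	r.cache[key] = v
	return v
}

func main() {
	r := newCachingReader(map[string]int{"a": 1, "b": 2, "c": 3})
	for _, k := range []string{"a", "b", "a", "a", "b", "c"} {
		r.Read(k)
	}
	// Backend loads coincide with the number of distinct keys read.
	fmt.Printf("loads: %d, unique keys: %d, cached reads: %d\n", r.Misses, len(r.cache), r.Hits)
}
```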
Ahh, that's true. My bad. We can close this then in favour of #33659. Thanks for taking care of it!
…3655) Implement standardized JSON format for slow block logging to enable cross-client performance analysis and protocol research.

This change is part of the Cross-Client Execution Metrics initiative proposed by Gary Rong: https://hackmd.io/dg7rizTyTXuCf2LSa2LsyQ

The standardized metrics enabled data-driven analysis like the EIP-7907 research: https://ethresear.ch/t/data-driven-analysis-on-eip-7907/23850

JSON format includes:
- block: number, hash, gas_used, tx_count
- timing: execution_ms, total_ms
- throughput: mgas_per_sec
- state_reads: accounts, storage_slots, bytecodes, code_bytes
- state_writes: accounts, storage_slots, bytecodes
- cache: account/storage/code hits, misses, hit_rate

This should come after merging #33522

Co-authored-by: Gary Rong <garyrong0905@gmail.com>
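The field list in this commit message could map onto Go structs roughly as follows. This is a sketch under assumptions: the key names come from the list above, but the nesting (in particular one object per cache kind), the Go types, and all sample values are guesses for illustration, not the format actually merged.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical struct layout; only the JSON key names come from the commit message.
type blockInfo struct {
	Number  uint64 `json:"number"`
	Hash    string `json:"hash"`
	GasUsed uint64 `json:"gas_used"`
	TxCount int    `json:"tx_count"`
}

type timing struct {
	ExecutionMs float64 `json:"execution_ms"`
	TotalMs     float64 `json:"total_ms"`
}

type throughput struct {
	MgasPerSec float64 `json:"mgas_per_sec"`
}

type stateReads struct {
	Accounts     int `json:"accounts"`
	StorageSlots int `json:"storage_slots"`
	Bytecodes    int `json:"bytecodes"`
	CodeBytes    int `json:"code_bytes"`
}

type stateWrites struct {
	Accounts     int `json:"accounts"`
	StorageSlots int `json:"storage_slots"`
	Bytecodes    int `json:"bytecodes"`
}

type cacheEntry struct {
	Hits    int64   `json:"hits"`
	Misses  int64   `json:"misses"`
	HitRate float64 `json:"hit_rate"`
}

// cacheInfo assumes one nested object per cache kind.
type cacheInfo struct {
	Account cacheEntry `json:"account"`
	Storage cacheEntry `json:"storage"`
	Code    cacheEntry `json:"code"`
}

type slowBlockLog struct {
	Block       blockInfo   `json:"block"`
	Timing      timing      `json:"timing"`
	Throughput  throughput  `json:"throughput"`
	StateReads  stateReads  `json:"state_reads"`
	StateWrites stateWrites `json:"state_writes"`
	Cache       cacheInfo   `json:"cache"`
}

// sampleLog returns made-up placeholder values, not measurements.
func sampleLog() slowBlockLog {
	return slowBlockLog{
		Block:       blockInfo{Number: 1, Hash: "0xabc", GasUsed: 21000, TxCount: 1},
		Timing:      timing{ExecutionMs: 1.5, TotalMs: 2.5},
		Throughput:  throughput{MgasPerSec: 12.5},
		StateReads:  stateReads{Accounts: 6, StorageSlots: 11, Bytecodes: 4, CodeBytes: 1024},
		StateWrites: stateWrites{Accounts: 2, StorageSlots: 3, Bytecodes: 0},
		Cache: cacheInfo{
			Account: cacheEntry{Hits: 4, Misses: 6, HitRate: 40},
			Storage: cacheEntry{Hits: 0, Misses: 11, HitRate: 0},
			Code:    cacheEntry{Hits: 4, Misses: 0, HitRate: 100},
		},
	}
}

// encode serializes one log entry as a single JSON line.
func encode(l slowBlockLog) (string, error) {
	b, err := json.Marshal(l)
	return string(b), err
}

// decode parses a JSON line back into a log entry.
func decode(s string) (slowBlockLog, error) {
	var l slowBlockLog
	err := json.Unmarshal([]byte(s), &l)
	return l, err
}

func main() {
	line, err := encode(sampleLog())
	if err != nil {
		panic(err)
	}
	fmt.Println(line)
}
```

Emitting one JSON object per line keeps the output easy to append to a file and to parse with the same schema across clients.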
Summary
This PR extends the execution statistics infrastructure introduced in #33442 with:
Additions
1. Code Cache Hit/Miss Tracking
Tracks code cache efficiency, complementing existing account and storage cache statistics:
- Adds CodeCacheHit and CodeCacheMiss to ReaderStats
2. Code Bytes Read
Tracks the actual volume of contract bytecode read:
- Adds a CodeBytesRead field to track total bytes loaded
3. Unique State Access Metrics
Tracks the diversity of state access during block execution:
- UniqueAccountsAccessed - Number of distinct accounts touched
- UniqueStorageAccessed - Number of distinct storage slots accessed
- UniqueCodeExecuted - Number of distinct contracts executed
Example Output
Running geth with --debug.logslowblock 1ns shows the new metrics:
I've run some tests locally and it seems to work fine. That said, I'm open to suggestions on how to test this better, so that we can be sure what's logged is correct!