Add cross-client execution metrics and slow block logging #9834
macfarla wants to merge 67 commits
Implements standardized execution metrics collection for block processing performance monitoring, following the cross-client execution metrics specification.
Key features:
- SlowBlockTracer: logs detailed JSON metrics for blocks exceeding a configurable threshold (--slow-block-threshold CLI flag or -Dbesu.execution.slowBlockThresholdMs)
- StateMetricsCollector: instance-based metrics collection threaded through the world state object graph, replacing ThreadLocal-based approaches
- ExecutionStats: tracks timing breakdowns, state access counts, cache performance, and EVM operation counts (SLOAD, SSTORE, CALL, CREATE)
- BlockAwareTracerAggregator: composable tracer pattern for combining multiple block-aware operation tracers
- ExecutionMetricsTracer: EVM-level tracer for opcode counting
- Parallel execution support with metrics aggregation across background threads
Co-authored-by: CPerezz <c@cperezz.dev>
Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>
Pull request overview
This PR implements standardized JSON-format slow block logging to enable cross-client performance analysis and protocol research. It introduces a comprehensive execution metrics system that collects detailed statistics about block execution including timing, state access patterns, cache performance, and EVM operation counts following the cross-client execution metrics specification.
Changes:
- Added ExecutionStats/ExecutionStatsTracer classes to collect block execution metrics across the EVM and state layers
- Introduced SlowBlockTracer for logging slow blocks in standardized JSON format with 38 metric fields
- Extended MetricsConfiguration with executionMetricsEnabled flag and slow block threshold configuration
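As a rough illustration of the slow block threshold check and of two derived fields that appear in the JSON output (mgas_per_sec and hit_rate), here is a minimal sketch. The class name, method names, and formulas are assumptions made for illustration; this is not the PR's SlowBlockTracer code, and the PR may compare a different timing value against the threshold or express hit_rate differently.

```java
import java.util.Locale;

// Hypothetical helper, not part of Besu: illustrates a threshold check and
// two derived fields (mgas_per_sec, hit_rate) under assumed formulas.
final class SlowBlockSketch {
  private final long thresholdMs;

  SlowBlockSketch(final long thresholdMs) {
    this.thresholdMs = thresholdMs;
  }

  // Assumption: a block counts as slow when total time exceeds the threshold.
  boolean isSlow(final long totalMs) {
    return totalMs > thresholdMs;
  }

  // Assumption: throughput is gas used (in millions) divided by execution seconds.
  static double mgasPerSec(final long gasUsed, final long executionMs) {
    if (executionMs <= 0) {
      return 0.0;
    }
    return (gasUsed / 1_000_000.0) / (executionMs / 1000.0);
  }

  // Assumption: hit rate is a percentage, and 0 when there were no lookups.
  static double hitRate(final long hits, final long misses) {
    final long total = hits + misses;
    return total == 0 ? 0.0 : (100.0 * hits) / total;
  }

  // Renders a small subset of the fields as a single JSON line.
  String toJson(
      final long number, final long gasUsed, final long executionMs, final long totalMs) {
    return String.format(
        Locale.ROOT,
        "{\"level\":\"warn\",\"msg\":\"Slow block\","
            + "\"block\":{\"number\":%d,\"gas_used\":%d},"
            + "\"timing\":{\"execution_ms\":%d,\"total_ms\":%d},"
            + "\"throughput\":{\"mgas_per_sec\":%.2f}}",
        number, gasUsed, executionMs, totalMs, mgasPerSec(gasUsed, executionMs));
  }
}
```

A caller would check isSlow(totalMs) once per block and only build the JSON string when it returns true, so fast blocks pay nothing for formatting.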
Reviewed changes
Copilot reviewed 43 out of 43 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| metrics/core/src/main/java/org/hyperledger/besu/metrics/prometheus/MetricsConfiguration.java | Added executionMetricsEnabled field to metrics configuration |
| evm/src/main/java/org/hyperledger/besu/evm/tracing/ExecutionMetricsTracer.java | New EVM-level tracer for collecting operation counts (SLOAD/SSTORE/CALL/CREATE) |
| ethereum/core/src/main/java/org/hyperledger/besu/ethereum/mainnet/ExecutionStats.java | Core metrics collection class implementing StateMetricsCollector interface |
| ethereum/core/src/main/java/org/hyperledger/besu/ethereum/mainnet/SlowBlockTracer.java | Block-aware tracer that logs slow blocks in standardized JSON format |
| ethereum/core/src/main/java/org/hyperledger/besu/ethereum/mainnet/parallelization/ParallelizedConcurrentTransactionProcessor.java | Extended to support metrics collection during parallel transaction execution |
| ethereum/core/src/main/java/org/hyperledger/besu/ethereum/mainnet/AbstractBlockProcessor.java | Integrated SlowBlockTracer into block processing pipeline |
| ethereum/core/src/main/java/org/hyperledger/besu/ethereum/ProtocolContext.java | Added slowBlockThresholdMs configuration |
| app/src/main/java/org/hyperledger/besu/cli/options/MetricsOptions.java | Added CLI options for execution metrics and slow block threshold |
| acceptance-tests/tests/src/acceptanceTest/java/org/hyperledger/besu/tests/acceptance/SlowBlockMetricsValidationTest.java | Comprehensive acceptance test validating all 38 JSON metric fields |
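One way to picture the metrics aggregation for parallel transaction execution listed in the table: each background task writes to its own stats instance, and the instances are merged into block-level stats on the calling thread. The sketch below uses hypothetical class and field names (WorkerStats, ParallelStatsSketch) and a plain thread pool; it is only an assumption about the general pattern, not the code in ParallelizedConcurrentTransactionProcessor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a per-worker stats object; field names are illustrative.
final class WorkerStats {
  long sloadCount;
  long sstoreCount;

  // Adds another worker's counts into this instance.
  void merge(final WorkerStats other) {
    sloadCount += other.sloadCount;
    sstoreCount += other.sstoreCount;
  }
}

final class ParallelStatsSketch {
  private ParallelStatsSketch() {}

  // Runs one task per "transaction"; each task writes only to its own WorkerStats.
  static WorkerStats run(final int txCount) {
    final ExecutorService pool = Executors.newFixedThreadPool(4);
    try {
      final List<Future<WorkerStats>> futures = new ArrayList<>();
      for (int i = 0; i < txCount; i++) {
        futures.add(
            pool.submit(
                () -> {
                  final WorkerStats stats = new WorkerStats(); // instance-based, no ThreadLocal
                  stats.sloadCount += 2; // pretend the transaction executed two SLOADs
                  stats.sstoreCount += 1; // and one SSTORE
                  return stats;
                }));
      }
      final WorkerStats blockStats = new WorkerStats();
      for (final Future<WorkerStats> future : futures) {
        blockStats.merge(future.get()); // merged on the calling thread only
      }
      return blockStats;
    } catch (final InterruptedException | ExecutionException e) {
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
  }
}
```

Because no two threads ever touch the same WorkerStats, the counters can stay plain long fields; Future.get() provides the visibility guarantee before the merge.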
Comments suppressed due to low confidence (2)
ethereum/core/src/main/java/org/hyperledger/besu/ethereum/trie/pathbased/common/worldview/accumulator/PathBasedWorldStateUpdateAccumulator.java:1
- The change from parallelStream() to stream() removes parallelism from account updates. This could impact performance for blocks with many account modifications. Consider documenting why this change was necessary (likely for metrics collection ordering) or re-evaluate if sequential processing is required here.
ethereum/core/src/test/java/org/hyperledger/besu/ethereum/mainnet/ExecutionStatsIntegrationTest.java:1
- This TODO indicates incomplete implementation of code write metrics. Consider creating a tracking issue for this follow-up work and referencing it in the comment for better traceability.
// Persist before traceEndBlock so that state root calculation (trie cache lookups,
// state_hash_ms timing) occurs while ExecutionStatsHolder is still set on this thread.
The reordering of worldState.persist() before blockTracer.traceEndBlock() changes the execution flow significantly. While the comment explains the reasoning for metrics timing, this could have subtle effects on tracer behavior. Ensure all block tracers are compatible with receiving traceEndBlock after state persistence completes.
Since SlowBlockTracer is now standalone and only reads from ExecutionStatsHolder at traceEndBlock, this ordering is actually correct: it needs persist to have run so state_hash_ms timing is captured.
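The ordering argument can be shown with a toy model: a tracer that snapshots the stats at end-of-block only sees the state hash timing if persist has already run. The names below (StatsHolder, OrderingSketch) are hypothetical stand-ins and are not Besu's ExecutionStatsHolder or world state API.

```java
// Hypothetical stats holder; -1 means "not recorded yet".
final class StatsHolder {
  long stateHashMs = -1;
}

final class OrderingSketch {
  private OrderingSketch() {}

  // Stand-in for worldState.persist(): records how long root computation took.
  static void persist(final StatsHolder stats) {
    final long start = System.nanoTime();
    // ... state root calculation would happen here ...
    stats.stateHashMs = (System.nanoTime() - start) / 1_000_000;
  }

  // Stand-in for traceEndBlock: snapshots whatever has been recorded so far.
  static long traceEndBlock(final StatsHolder stats) {
    return stats.stateHashMs;
  }
}
```

Calling traceEndBlock before persist returns the unset marker, which is the situation the reordering avoids.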
Example output for Hoodi block https://hoodi.etherscan.io/block/2259313 compared with debug_traceBlockByNumber
…sage Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>
+1 on OperationTracer[]. Collections.unmodifiableList introduces additional overhead on top of the existing overhead of ArrayList.
Maybe have a fast path for 1 and 2 tracers to avoid megamorphic calls, so that only monomorphic or bimorphic calls remain.
I have a slight preference for the delegation pattern, since we already use it elsewhere in the codebase and it is easy to mix and match with other tracers.
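The fast-path idea suggested above for one and two tracers could look roughly like the following. The Tracer interface here is a hypothetical single-method stand-in (Besu's OperationTracer has more methods), and Tracers.of is an illustrative factory, not the PR's CompositeOperationTracer. Whether the JIT actually treats the resulting call sites as monomorphic or bimorphic depends on the runtime profile, so this would need benchmarking.

```java
// Hypothetical minimal tracer interface, used only for this sketch.
interface Tracer {
  void onOperation(String opcode);
}

final class Tracers {
  private Tracers() {}

  static final Tracer NO_OP = opcode -> {};

  // Picks the cheapest composition for the number of children.
  static Tracer of(final Tracer... children) {
    switch (children.length) {
      case 0:
        return NO_OP;
      case 1:
        return children[0]; // no wrapper at all
      case 2:
        return new Pair(children[0], children[1]);
      default:
        return new Many(children.clone()); // defensive copy of the array
    }
  }

  // Two fixed fields instead of a loop over a collection.
  private static final class Pair implements Tracer {
    private final Tracer first;
    private final Tracer second;

    Pair(final Tracer first, final Tracer second) {
      this.first = first;
      this.second = second;
    }

    @Override
    public void onOperation(final String opcode) {
      first.onOperation(opcode);
      second.onOperation(opcode);
    }
  }

  // General case: plain array iteration, no List or unmodifiable wrapper.
  private static final class Many implements Tracer {
    private final Tracer[] children;

    Many(final Tracer[] children) {
      this.children = children;
    }

    @Override
    public void onOperation(final String opcode) {
      for (final Tracer child : children) {
        child.onOperation(opcode);
      }
    }
  }
}
```

This keeps the delegation style mentioned above: callers still hold a single Tracer and do not need to know which composition was chosen.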
Still reviewing, but a couple of comments before I forget:
Resolved conflicts between the execution metrics / slow block tracing feature (PR besu-eth#9834) and the following upstream changes:
- BesuPluginServiceRegistrar extracted from BesuCommand (service registration now centralised); registerRuntimeServices() extended with slowBlockThresholdMs so BlockSimulatorServiceImpl receives it.
- TracerAggregator removed from evm module; replaced with a new CompositeOperationTracer in ethereum/core that delegates every OperationTracer method to a fixed list of child tracers and adds a hasTracer() utility used by the parallel-tracer tests.
- PreprocessingFunction.run() extended with both BlockProcessingContext (our addition) and Optional<BlockHeader> maybeParentHeader (upstream addition); all implementations and call sites updated.
- PathBasedWorldState.persist() API changed to use computeRoot() with a state-root supplier; our timing-metrics hooks retained.
- BlockSimulator gas-limit check added alongside our EVMExecutionMetrics tracer composition (using CompositeOperationTracer.of).
- parallelStream() → stream() in PathBasedWorldStateUpdateAccumulator to eliminate a data race on the accountWrites/storageWrites int fields.
- BalConcurrentTransactionProcessor wired with the same workerStats ExecutionStats pattern as ParallelizedConcurrentTransactionProcessor.
- Test suites updated: maybeParentHeader propagated to all runAsyncBlock call sites; blockHeader mocks extended with getStateRoot()/getBlockHash() stubs required after the getWorldState() API change; testApplyResult… test fixed to simulate traceBeforeRewardTransaction so that the miningBeneficiaryReward is non-zero and setPostBalance is exercised.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>
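For context on the data race mentioned for the accountWrites/storageWrites int fields: incrementing a plain int from a parallel stream is not atomic, so updates can be lost. The sketch below contrasts the sequential traversal chosen in this commit with an alternative that keeps parallelism by using a thread-safe counter. WriteCounterSketch and its members are hypothetical names, not the accumulator's actual code.

```java
import java.util.List;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical counter holder illustrating two ways to avoid the race.
final class WriteCounterSketch {
  private int accountWrites; // plain field: unsafe for concurrent increments
  private final LongAdder safeWrites = new LongAdder();

  // Sequential traversal: the plain int is only touched by one thread.
  int countSequential(final List<Integer> accounts) {
    accountWrites = 0;
    accounts.stream().forEach(account -> accountWrites++);
    return accountWrites;
  }

  // Parallel traversal stays correct if the counter itself is thread-safe.
  long countParallel(final List<Integer> accounts) {
    safeWrites.reset();
    accounts.parallelStream().forEach(account -> safeWrites.increment());
    return safeWrites.sum();
  }
}
```

Switching to stream() is the smaller change; a LongAdder or similar would be the option to evaluate if the lost parallelism turns out to matter for blocks with many account modifications.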
…teCallSimulationResult Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>
Verified that mainnet output matches, e.g. for block https://etherscan.io/block/24975451



PR description
Builds on #9660 from @CPerezz
Summary
Implement standardized JSON format for slow block logging to enable cross-client performance analysis and protocol research.
This change is part of the Cross-Client Execution Metrics initiative proposed by Gary Rong and CPerezz.
Motivation
Standardized execution metrics are critical for:
Real-world example: The EIP-7907 analysis used execution metrics to measure code read latency, per-call overhead scaling, and block execution breakdown. Without standardized metrics across clients, such analysis cannot be validated cross-client.
References
JSON Format
{
  "level": "warn",
  "msg": "Slow block",
  "block": { "number": ..., "hash": ..., "gas_used": ..., "tx_count": ... },
  "timing": { "execution_ms": ..., "total_ms": ... },
  "throughput": { "mgas_per_sec": ... },
  "state_reads": { "accounts": ..., "storage_slots": ..., "code": ..., "code_bytes": ... },
  "state_writes": { "accounts": ..., "storage_slots": ... },
  "cache": {
    "account": { "hits": ..., "misses": ..., "hit_rate": ... },
    "storage": { ... },
    "code": { ... }
  },
  "evm": { "sload": ..., "sstore": ..., "calls": ..., "creates": ... }
}
Thanks for sending a pull request! Have you done the following?
doc-change-required label to this PR if updates are required.
Locally, you can run these tests to catch failures early:
./gradlew spotlessApply
./gradlew build
./gradlew acceptanceTest
./gradlew integrationTest
./gradlew ethereum:referenceTests:referenceTests