Conversation
3edfd77 to
57b9e54
Compare
mattsse
left a comment
There was a problem hiding this comment.
this is quite a lot of stuff -.-
but I guess all of these make sense, I'd like to avoid having this leak as part of the block type
also have some suggestions for how to avoid adding this to the persistence task as well.
| } | ||
| } | ||
|
|
||
| /// A handle to [`InstrumentedStateProvider`] that can be used to fetch recorded metrics. |
There was a problem hiding this comment.
i find handle here a bit confusing, isnt this now just a container for metrics?
| fn from_str(s: &str) -> Result<Self, Self::Err> { | ||
| if s.is_empty() { | ||
| return Ok(Self(Default::default())) | ||
| return Ok(Self(Default::default())); |
| let Self { builder, task_executor } = self; | ||
|
|
||
| let engine_tree_config = builder.config.engine.tree_config(); | ||
|
|
| reth-evm = { workspace = true, features = ["test-utils", "metrics"] } | ||
| reth-revm.workspace = true | ||
| reth-execution-types.workspace = true | ||
| secp256k1.workspace = true | ||
| alloy-genesis.workspace = true | ||
| revm-bytecode.workspace = true | ||
| revm-database.workspace = true | ||
| revm-primitives.workspace = true | ||
| tracing-subscriber = { workspace = true, features = ["env-filter"] } |
There was a problem hiding this comment.
why do we need all of this here now?
There was a problem hiding this comment.
To enable testing. I needed all of the dev depencencies in order to be able to run the tests.
| /// Trie context needed for [`BasicEngineValidator::spawn_deferred_trie_task`] | ||
| #[derive(Debug)] | ||
| struct TrieCtx<P> { | ||
| hashed_state: Arc<HashedPostState>, | ||
| trie_output: Arc<TrieUpdates>, | ||
| overlay_factory: OverlayStateProviderFactory<P>, | ||
| } | ||
|
|
||
| impl<P> TrieCtx<P> { | ||
| fn new( | ||
| hashed_state: HashedPostState, | ||
| trie_output: TrieUpdates, | ||
| overlay_factory: OverlayStateProviderFactory<P>, | ||
| ) -> Self { | ||
| Self { | ||
| hashed_state: Arc::new(hashed_state), | ||
| trie_output: Arc::new(trie_output), | ||
| overlay_factory, | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
im losing it with amp, every single type it puts new types at the wrong place, this does not belong above the main type
There was a problem hiding this comment.
This wasn't in my original PR. @shekhirin I assume is your addition?
| // Emit unified slow block logs after commit completes | ||
| for stats in timing_stats { | ||
| let total_duration = stats.execution_duration + | ||
| stats.state_read_duration + | ||
| stats.state_hash_duration + | ||
| commit_duration; | ||
|
|
||
| // Check if total time exceeds threshold | ||
| if let Some(threshold) = self.slow_block_threshold && | ||
| total_duration > threshold | ||
| { | ||
| self.log_slow_block(stats, commit_duration, total_duration); | ||
| } | ||
| } |
There was a problem hiding this comment.
can we extract this to a helper fn perhaps?
| self.metrics.save_blocks_duration_seconds.record(total_commit_duration); | ||
|
|
||
| // Emit unified slow block logs after commit completes | ||
| for stats in timing_stats { |
There was a problem hiding this comment.
this loop is a noop if no slow_block_threshold is set right?
Implements execution metrics following the cross-client specification: https://github.com/ethereum/execution-specs/blob/main/docs/execution-metrics-spec.md - Add slow block logging method to ExecutorMetrics - Output structured JSON log entries for cross-client analysis - Include timing, throughput, and state access counts - Threshold configurable via SLOW_BLOCK_THRESHOLD_MS constant
Implement standardized JSON format for slow block logging to enable cross-client performance analysis and protocol research. This change is part of the Cross-Client Execution Metrics initiative proposed by Gary Rong: https://hackmd.io/dg7rizTyTXuCf2LSa2LsyQ The standardized metrics enabled data-driven analysis like the EIP-7907 research: https://ethresear.ch/t/data-driven-analysis-on-eip-7907/23850 JSON format includes: - block: number, hash, gas_used, tx_count - timing: execution_ms, total_ms - throughput: mgas_per_sec - state_reads: accounts, storage_slots, code, code_bytes - state_writes: accounts, storage_slots - cache: account/storage/code hits, misses, hit_rate
Convert timing parameters from u64 to f64 to preserve sub-millisecond
precision in slow block JSON output.
Changes:
- log_slow_block() timing params: u64 -> f64
- is_slow_block(): u64 -> f64 with threshold cast
- Tracing format: add {:.3} for 3 decimal places
- Update tests for f64 values
Remove CacheStats from CachedStateMetrics and pass it as an optional Arc<CacheStats> to CachedStateProvider only when slow_block_threshold is configured. This avoids unnecessary atomic increments on every cache hit/miss when slow block logging is disabled. Replace InstrumentedStateProviderHandle/InstrumentedStateProviderHandleInner with a flat StateProviderStats struct shared via Arc. The provider now holds metrics and stats directly, removing a layer of indirection. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
a95a657 to
82dd69a
Compare
…r for cache_stats Use with_cache_stats() builder method instead of adding a parameter to the constructor. Also fix pre-existing const fn breakage on nightly for ExecutionCache::new and SavedCache::new. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
067a938 to
41ab9d1
Compare
Eliminates PersistenceRx enum and SaveBlocksResult by using a single PersistenceResult type for all persistence operations, simplifying wait_for_event from duplicated per-variant select blocks to a single unified select_biased block. Fixes total_duration double-counting (state_read_duration is a subset of execution_duration). Removes premature purge_timing_stats call in on_new_persisted_block that was clearing stats before the threshold check could run. Fixes divide-by-zero in mgas/s calculation. Corrects misleading doc comments on Duration fields in ExecutionTimingStats. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ock logging Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…_complete Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…rics-standardization
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When CacheStats is present (slow block threshold configured), record only to CacheStats during execution and flush to prometheus in a single batch at the end, avoiding double recording on every cache access. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This reverts commit 2bbc78b.
25a187e to
25c62b0
Compare
Resolved conflicts in config.rs and engine.rs where sparse_trie_max_storage_tries was replaced by sparse_trie_max_hot_slots + sparse_trie_max_hot_accounts on main. Kept both main's LFU pruning renames and the PR's slow_block_threshold additions. Amp-Thread-ID: https://ampcode.com/threads/T-019cc34f-c511-71b7-b6bd-ad8600e06a51 Co-authored-by: Amp <amp@ampcode.com>
Ref: https://hackmd.io/dg7rizTyTXuCf2LSa2LsyQ, https://ethresear.ch/t/data-driven-analysis-on-eip-7907/23850
Reference implementations in other clients: NethermindEth/nethermind#10288, ethereum/go-ethereum#33655
Adds configurable slow block logging via
--engine.slow-block-thresholdthat emits a structuredwarn!log with detailed timing, state-operation counts, and cache hit-rate metrics for blocks whose total processing time exceeds the threshold.When the threshold is configured, block validation instruments the state provider to collect fetch counts and latencies, and tracks cache hits/misses. After persistence commits, blocks exceeding the threshold emit a
ConsensusEngineEvent::SlowBlockcarrying anExecutionTimingStatssnapshot. The node events handler formats this into a single structured log line with execution time, state root time, commit time, mgas/s, account/storage/code fetch counts, and cache hit rates.When no threshold is set, no stats are collected and no additional allocations occur.
Key types introduced:
ExecutionTimingStats(reth-chain-state): 24-field struct capturing block number/hash, execution/state-root/commit durations, state operation counts, and cache metricsSlowBlockInfo(reth-engine-primitives): wrapper carrying stats + commit duration + total durationCacheStats(engine-tree): atomic counters for cache hit/miss trackingStateProviderStats(engine-tree): atomic counters for state provider fetch counts and latenciesValidationOutcome(engine-tree): result type alias for block validation with optional timing statsAlso refactors
PersistenceResultto carry commit duration alongside the last persisted block, replacing the separateSaveBlocksResulttype.