feat: send state metrics to xatu by weiihann · Pull Request #3 · samcm/go-ethereum

weiihann · 2026-01-16T05:36:32Z

No description provided.

The `plucky` and `oracular` releases have reached end of life. That's why Launchpad isn't building them anymore: https://launchpad.net/~ethereum/+archive/ubuntu/ethereum/+packages.

…thereum#33898)

We didn't upgrade to 1.25, so this jumps over one version. I want to upgrade all builds to Go 1.26 soon, but let's start with the Docker build to get a sense of any possible issues.

…33900) The endianness was wrong, which means that the code chunks were stored in the wrong location in the tree.

Fix the flaky test found in https://ci.appveyor.com/project/ethereum/go-ethereum/builds/53601688/job/af5ccvufpm9usq39:

1. Increase the timeout from 3+1s to 15s, and use a timer instead of sleep (in the CI environment, syncing the 1024 blocks may need more time).
2. Add `synced.Load()` to ensure the full async chain has finished.

Signed-off-by: Delweng <delweng@gmail.com>

With this, we are dropping support for protocol version eth/68. The only supported version is eth/69 now. The p2p receipt encoding logic can be simplified a lot, and processing of receipts during sync gets a little faster because we now transform the network encoding into the database encoding directly, without decoding the receipts first. --------- Co-authored-by: Felix Lange <fjl@twurst.com>

I noticed that some autonomous agents have a tendency to commit binaries if asked to create a PR.

Fixes priceheap comparison in some edge cases. --------- Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>

This PR introduces a threshold (relative to current market base fees), below which we suppress the diffusion of low fee transactions. Once base fees go down, and if the transactions were not evicted in the meantime, we release these transactions. The PR also updates the bucketing logic to be more sensitive, removing the extra logarithm. Blobpool description is also updated to reflect the new behavior. EIP-7918 changed the maximum blob fee decrease that can happen in a slot. The PR also updates fee jump calculation to reflect this. --------- Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>

…ethereum#33908) The payload rebuild loop resets the timer with the full Recommit duration after generateWork returns, making the actual interval generateWork_elapsed + Recommit instead of Recommit alone. Since fillTransactions uses Recommit (2s) as its timeout ceiling, the effective rebuild interval can reach ~4s under heavy blob workloads, which leaves only 1–2 rebuilds in a 6s half-slot window instead of the intended 3. Fix by subtracting elapsed time from the timer reset.

### Before this fix

```
t=0s  timer fires, generateWork starts
t=2s  fillTransactions times out, timer.Reset(2s)
t=4s  second rebuild starts
t=6s  CL calls getPayload, gets the t=2s result (1 effective rebuild)
```

### After

```
t=0s  timer fires, generateWork starts
t=2s  fillTransactions times out, timer.Reset(2s - 2s = 0)
t=2s  second rebuild starts immediately
t=4s  timer.Reset(0), third rebuild starts
t=6s  CL calls getPayload, gets the t=4s result (3 effective rebuilds)
```

We got a report that after v1.17.0 a geth-teku node starts to time out on engine_getBlobsV2 after around 3h of operation. The culprit seems to be our optional http2 service which Teku attempts first. The exact cause of the timeout is still unclear. This PR is more of a workaround than proper fix until we figure out the underlying issue. But I don't expect http2 to particularly benefit engine API throughput and latency. Hence it should be fine to disable it for now.

…reum#33946) ethereum#33916 + cmd/keeper go mod tidy --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

In `buildPayload()`, the background goroutine uses a `select` to wait on the recommit timer, the stop channel, and the end timer. When both `timer.C` and `payload.stop` are ready simultaneously, Go's `select` picks a case non-deterministically. This means the loop can enter the `timer.C` case and perform an unnecessary `generateWork` call even after the payload has been resolved. Add a non-blocking check of `payload.stop` at the top of the `timer.C` case to exit immediately when the payload has already been delivered.
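A minimal sketch of the non-blocking stop check, assuming the stop signal is a channel that is closed once the payload is resolved. `stopped` and `onTimer` are hypothetical names that model the body of the `timer.C` case; they are not the actual miner code.

```go
package main

import "fmt"

// stopped reports whether stop has been closed, without blocking.
func stopped(stop <-chan struct{}) bool {
	select {
	case <-stop:
		return true
	default:
		return false
	}
}

// onTimer models the body of the timer case: it skips the rebuild when the
// payload has already been resolved and reports whether a rebuild ran.
func onTimer(stop <-chan struct{}, rebuild func()) bool {
	if stopped(stop) {
		return false
	}
	rebuild()
	return true
}

func main() {
	stop := make(chan struct{})
	builds := 0

	onTimer(stop, func() { builds++ }) // payload still pending: rebuild runs
	close(stop)                        // payload delivered
	onTimer(stop, func() { builds++ }) // skipped, even though the timer fired

	fmt.Println("rebuilds:", builds)
}
```

The `default` branch is what makes the check non-blocking: when the stop channel is not ready, control falls through straight away and the rebuild proceeds.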

Return the Amsterdam instruction set from `LookupInstructionSet` when `IsAmsterdam` is true, so Amsterdam rules no longer fall through to the Osaka jump table. --------- Co-authored-by: rjl493456442 <garyrong0905@gmail.com>

Fixes ethereum#33572

…reum#33869) For bal-devnet-3 we need to update the EIP-8024 implementation to the latest spec changes: ethereum/EIPs#11306 > Note: I deleted tests not specified in the EIP bc maintaining them through EIP changes is too error prone.

Pebble maintains a batch pool to recycle batch objects. Unfortunately, a batch object must be explicitly returned to the pool via the `batch.Close` function. This PR extends the batch interface with a close function and invokes `batch.Close` in some critical code paths. Memory allocation must be measured before merging this change. It also remains an open question whether `batch.Close` should be applied as widely as possible, in every invocation.

…eum#33593) Implements https://eips.ethereum.org/EIPS/eip-7778 --------- Co-authored-by: Gary Rong <garyrong0905@gmail.com>

Mainnet was already overriding --cache to 4096. This PR just makes this the default.

…thereum#33927) The BatchSpanProcessor queue size was incorrectly set to DefaultMaxExportBatchSize (512) instead of DefaultMaxQueueSize (2048). I noticed the issue on bloatnet when analyzing the block building traces. During a particular run, the miner was including 1000 transactions in a single block. When telemetry is enabled, the miner creates a span for each transaction added to the block. With the queue capped at 512, spans were silently dropped when production outpaced the span export, resulting in incomplete traces with orphaned spans. While this doesn't eliminate the possibility of drops under extreme load, using the correct default restores the 4x buffer between queue capacity and export batch size that the SDK was designed around.

This fixes a theoretical overflow condition if an account has an impossibly high nonce.

Removes the appveyor.yml since we moved to github runners. --------- Co-authored-by: Sina Mahmoodi <itz.s1na@gmail.com> Co-authored-by: Felix Lange <fjl@twurst.com>

…st (ethereum#34883)

Passing `--dev=false` currently still enters the dev-mode startup path because a couple of branches check whether the flag was set, not its boolean value. This switches those branches to use `ctx.Bool`, so explicit false does not start dev mode or emit a dev genesis, while `--dev` keeps its existing behavior.
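The difference between "the flag was set" and "the flag is true" can be shown with the standard library `flag` package. This is an analogy only: geth uses a different CLI library, and `parse` is a hypothetical helper.

```go
package main

import (
	"flag"
	"fmt"
)

// parse returns whether -dev appeared on the command line (isSet) and the
// boolean value it ended up with (value).
func parse(args []string) (isSet, value bool) {
	fs := flag.NewFlagSet("demo", flag.ContinueOnError)
	dev := fs.Bool("dev", false, "enable dev mode")
	if err := fs.Parse(args); err != nil {
		return false, false
	}
	// Visit only walks flags that were explicitly set.
	fs.Visit(func(f *flag.Flag) {
		if f.Name == "dev" {
			isSet = true
		}
	})
	return isSet, *dev
}

func main() {
	for _, args := range [][]string{{}, {"--dev"}, {"--dev=false"}} {
		isSet, value := parse(args)
		fmt.Printf("%v: set=%v value=%v\n", args, isSet, value)
	}
}
```

A branch keyed on `isSet` treats `--dev=false` like `--dev`; a branch keyed on `value` does not.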

Changes core.Message to use Uint256 which is faster --------- Co-authored-by: Gary Rong <garyrong0905@gmail.com>

…ereum#34827) Every tracer that implements Stop/GetResult held a `reason error` field that is written by Stop (called from the trace-timeout watchdog goroutine in api.go) and read by GetResult (called by the RPC handler main goroutine). These accesses were unsynchronized.

In the --create path, execFunc returns gasLeft as the second return value, but the rest of the code treats this value as "gas used" (printed as such, and compared in timedExec). This makes gas reporting incorrect and can cause benchmark consistency checks to fail.
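The relationship between the two quantities is simple arithmetic, sketched here with a hypothetical helper. `gasUsed` is not a function from the codebase, and refunds are ignored.

```go
package main

import "fmt"

// gasUsed converts the gas remaining after execution into gas consumed.
// Hypothetical helper; it clamps to zero if gasLeft exceeds the budget.
func gasUsed(initialGas, gasLeft uint64) uint64 {
	if gasLeft > initialGas {
		return 0
	}
	return initialGas - gasLeft
}

func main() {
	initialGas, gasLeft := uint64(100_000), uint64(79_000)
	fmt.Println("gas left:", gasLeft)
	fmt.Println("gas used:", gasUsed(initialGas, gasLeft))
}
```

Printing or comparing `gasLeft` where `gasUsed` is expected yields numbers that move in the opposite direction, which is why the consistency checks can fail.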

This is a refactoring PR that wraps all pre/post-execution system calls in exported functions, eliminating duplicated system calls across the codebase. A few things are unchanged but worth highlighting:

- ChainMaker is left unchanged, since a significant rewrite would be required
- BeaconRoot in the header should be non-nil if Cancun is enabled

--------- Co-authored-by: jwasinger <j-wasinger@hotmail.com>

…m#34939) Fixes the regression caught by https://hive.ethpandaops.io/#/test/generic/1778481210-e59b7465e1d04f7ed1b0200838584b16?testnumber=137. engine.AssembleBlock explicitly expects withdrawals to be non-nil for pre-Shanghai blocks as opposed to FinaliseAndAssemble which stripped off the withdrawal.

In b2843a1, metrics check len(res) == len(hashes) but res is pre-allocated with make(), so length is always equal. Partial hit metric never fires. Count non-nil elements instead. --------- Co-authored-by: Bosul Mun <bsbs8645@snu.ac.kr>
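A small sketch of why the length check can never detect a partial hit, and what counting non-nil entries looks like. `countHits` and the element types are illustrative assumptions.

```go
package main

import "fmt"

// countHits returns the number of entries that were actually filled in.
func countHits(res [][]byte) int {
	n := 0
	for _, r := range res {
		if r != nil {
			n++
		}
	}
	return n
}

func main() {
	hashes := []string{"h1", "h2", "h3"}

	// res is pre-allocated, so its length matches hashes before any lookup.
	res := make([][]byte, len(hashes))
	res[0] = []byte{0x01} // only one of three lookups hits

	fmt.Println("len(res) == len(hashes):", len(res) == len(hashes))
	fmt.Println("hits:", countHits(res), "of", len(hashes))
}
```

Comparing `countHits(res)` against `len(hashes)` distinguishes full hits, partial hits, and misses.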

This PR introduces a separate transaction pool type for sparse blobpool. In sparse blobpool, the PooledTransactions message delivers transactions without blobs, and partial or full cells are downloaded via the Cells message. Blobpool no longer stores transactions with complete sidecars; instead it stores transactions without blobs, along with the corresponding cells. Because of this, a dedicated type distinct from types.Transaction is required. This PR introduces a type called `BlobTxForPool` and stores each sidecar field independently, in order to bypass the assumption that a sidecar always exists as a complete unit. Reintroducing the conversion queue was considered, but was ultimately omitted because type conversion should be sufficiently fast. With sparse blobpool, blob -> cell computation would take about 13ms per blob. Not sure whether this is fast enough, but otherwise we can add the conversion queue later on the sparse blobpool branch.

1. The test should use `!reflect.DeepEqual`.
2. Use `big.NewInt(0).SetBits([]big.Word{})` as a workaround for `DeepEqual` when the `big.Int` is zero, because unpack returns a `[]big.Word{}`.

Passing `--v2=false` currently still selects the v2 binding generator because the command checks whether the flag was set. This switches generation to use the boolean flag value, so explicit false continues to generate legacy bindings while `--v2` keeps selecting v2.

This PR introduces OnGasChangeV2 tracing hook, as the pre-requisite for landing EIP-8037. --------- Co-authored-by: Sina M <1591639+s1na@users.noreply.github.com>

This PR extends the journal to track the pre-transaction values of mutated balances, nonces, and code. At the end of the transaction, these values are used to filter out no-op changes, such as balance transitions from a -> b -> a. These changes are excluded from the block-level access list. Additionally, there is a dedicated `bal.ConstructionBlockAccessList` object for gathering the state reads and writes within the current transaction. These state writes will be keyed by the block accessList index. --------- Co-authored-by: jwasinger <j-wasinger@hotmail.com>

## Summary

The `--rpc.telemetry.sample-ratio` flag declares `Value: 1.0` and `geth --help` advertises `(default: 1)`. In practice, however, omitting the flag produces a sample ratio of `0`, causing `sdktrace.TraceIDRatioBased(0)` to drop 100% of spans. Users who enable `--rpc.telemetry` see the `OpenTelemetry trace export enabled` log line and a clean startup, but no traces ever leave the process.

The root cause is the interaction between two pieces of code:

1. `cmd/utils/flags.go:setOpenTelemetry` (added in ethereum#34062) only copies the flag value when `ctx.IsSet(...)` returns true:

   ```go
   if ctx.IsSet(RPCTelemetrySampleRatioFlag.Name) {
       tcfg.SampleRatio = ctx.Float64(RPCTelemetrySampleRatioFlag.Name)
   }
   ```

   That is the right pattern for "don't clobber a config-file value with the CLI default," but it implies that something else must initialise the field when neither source sets it.

2. `node/defaults.go:DefaultConfig` never initialises `OpenTelemetry.SampleRatio`, leaving it at the float64 zero value.

The result for the common CLI-only user (no TOML config) is `SampleRatio = 0` → every span is silently dropped, despite the documented default of 1.

## Change

Seed `OpenTelemetry: OpenTelemetryConfig{SampleRatio: 1.0}` in `node.DefaultConfig` so the documented default matches runtime behavior and the `ctx.IsSet` guard in `setOpenTelemetry` continues to do what it was designed to do.
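The interaction can be reproduced with simplified stand-in types. `Config`, `defaultConfig`, and `apply` below are illustrative assumptions that mirror the shape of the real structs and the `IsSet` guard; they are not the actual node package code.

```go
package main

import "fmt"

// Simplified stand-ins for the node configuration types.
type OpenTelemetryConfig struct{ SampleRatio float64 }
type Config struct{ OpenTelemetry OpenTelemetryConfig }

// defaultConfig seeds the documented default, as the change proposes.
func defaultConfig() Config {
	return Config{OpenTelemetry: OpenTelemetryConfig{SampleRatio: 1.0}}
}

// apply mimics the IsSet guard: the CLI value is copied only when the flag
// was passed explicitly, so the base config must carry the default.
func apply(cfg Config, isSet bool, flagValue float64) Config {
	if isSet {
		cfg.OpenTelemetry.SampleRatio = flagValue
	}
	return cfg
}

func main() {
	// Before: zero-valued base config, flag omitted.
	fmt.Println(apply(Config{}, false, 1.0).OpenTelemetry.SampleRatio)
	// After: seeded base config, flag omitted.
	fmt.Println(apply(defaultConfig(), false, 1.0).OpenTelemetry.SampleRatio)
	// An explicit flag still overrides the seeded default.
	fmt.Println(apply(defaultConfig(), true, 0.25).OpenTelemetry.SampleRatio)
}
```

Seeding the default in the base config keeps the guard's "do not clobber the config file" behavior intact while removing the silent zero.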

Adds a new CLI flag --state.size-tracking-depth to control how many recent block state sizes are tracked in memory. The default is set to 10000 blocks (previously hardcoded at 128). This allows users to tune memory usage vs historical depth based on their monitoring needs. Setting to 0 uses the default value. Memory impact per block tracked: ~292 bytes (map entry + heap entry)
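As a rough back-of-the-envelope check of the memory trade-off, the sketch below multiplies the tracking depth by the ~292 bytes per block quoted above. `trackedBytes` is a hypothetical helper and the figure is an estimate, not a measurement.

```go
package main

import "fmt"

const (
	defaultDepth  = 10000 // new default from the description
	bytesPerBlock = 292   // approximate per-block cost quoted above
)

// trackedBytes estimates memory used for a given tracking depth.
// A depth of 0 falls back to the default, as the flag does.
func trackedBytes(depth int) int {
	if depth == 0 {
		depth = defaultDepth
	}
	return depth * bytesPerBlock
}

func main() {
	fmt.Println("previous depth 128:", trackedBytes(128), "bytes")
	fmt.Println("default depth:", trackedBytes(0), "bytes")
}
```

Under these assumptions the default depth costs roughly 2.9 MB, compared with about 37 KB for the previously hardcoded 128 blocks.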

…relation This change updates the state tracking mechanism to include the block hash alongside the block number and state root when calculating and publishing size statistics. This allows for more precise correlation between state size changes and specific blocks, improving observability and debugging capabilities for state growth.

count and bytes feat: send state metrics to xatu feat: change module and add network name

The state size tracer previously emitted a single signed delta per category (account, storage, contract code, account trie nodes, storage trie nodes). That representation lost information because a block-level update appeared the same as no activity once writes and deletes cancelled. The tracer now emits per-block writes and deletes counts and bytes separately, nested under top-level "writes" and "deletes" keys in the "State metrics" JSON log. An update is accounted as BOTH a write of the new value AND a delete of the prev value, so consumers can recover the net delta as (writes - deletes). The four symmetric categories use this three-arm switch (create / update / delete). Contract code remains write-only with hash dedup (reliable ref-counting would be needed for deletions); its delete counters stay 0. The mpt-depth side of the log is untouched. Consumer side: xatu's sentry-logs Vector pipeline normalises the new shape and ClickHouse derives the original delta fields via MATERIALIZED columns.
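The three-arm accounting can be sketched as follows. The types and the `record` method are illustrative assumptions: the real tracer's size calculation likely includes more than the value length, and absence is modeled here as a nil slice.

```go
package main

import "fmt"

// counters holds an item count and a byte total.
type counters struct{ Count, Bytes int64 }

func (c *counters) add(v []byte) {
	c.Count++
	c.Bytes += int64(len(v))
}

// categoryStats tracks writes and deletes separately for one category.
type categoryStats struct{ Writes, Deletes counters }

// record accounts one state change. prev == nil means the entry did not
// exist before; next == nil means it no longer exists afterwards.
func (s *categoryStats) record(prev, next []byte) {
	switch {
	case prev == nil && next != nil: // create
		s.Writes.add(next)
	case prev != nil && next != nil: // update: write new AND delete prev
		s.Writes.add(next)
		s.Deletes.add(prev)
	case prev != nil && next == nil: // delete
		s.Deletes.add(prev)
	}
}

// netBytes recovers the signed delta that was previously emitted.
func (s categoryStats) netBytes() int64 { return s.Writes.Bytes - s.Deletes.Bytes }

func main() {
	var s categoryStats
	// A same-size update nets to zero but is still visible as activity.
	s.record([]byte("aaaa"), []byte("bbbb"))
	fmt.Printf("%+v net=%d\n", s, s.netBytes())
}
```

A same-size update leaves the net delta at zero while both counters are non-zero, which is exactly the activity the single signed delta could not express.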

weiihann · 2026-05-14T05:15:58Z

Closing in favour of a fresh PR built on a rebased base. The state-size infra in bump-state-size-tracker conflicted heavily with current ethereum/go-ethereum master (414 commits behind, pre-dates PR ethereum#33490 which introduced the OnStateUpdate hook our tracer depends on). The remaining useful commits from bump-state-size-tracker (configurable depth + block-hash tracking) have been integrated into a new branch on weiihann/go-ethereum and a successor PR will follow.

weiihann changed the base branch from master to bump-state-size-tracker January 16, 2026 05:36

weiihann force-pushed the feat/xatu-state-metrics-integrate branch from 469728a to 137c8c3 Compare January 16, 2026 05:42

weiihann force-pushed the feat/xatu-state-metrics-integrate branch from 137c8c3 to 6447d7a Compare February 19, 2026 06:45

s1na and others added 27 commits February 26, 2026 13:55

build: update ubuntu distros list (ethereum#33864)

8a43456

The `plucky` and `oracular` releases have reached end of life. That's why Launchpad isn't building them anymore: https://launchpad.net/~ethereum/+archive/ubuntu/ethereum/+packages.

trie: error out for unexpected key-value pairs preceding the range (e…

be92f54

…thereum#33898)

go.mod: update ckzg (ethereum#33901)

1b1133d

Dockerfile: upgrade to Go 1.26 (ethereum#33899)

7793e00

We didn't upgrade to 1.25, so this jumps over one version. I want to upgrade all builds to Go 1.26 soon, but let's start with the Docker build to get a sense of any possible issues.

trie/bintrie: fix endianness in code chunk key computation (ethereum#…

95c6b05

…33900) The endianness was wrong, which means that the code chunks were stored in the wrong location in the tree.

AGENTS.md: add instruction not to commit binaries (ethereum#33921)

825436f

I noticed that some autonomous agents have a tendency to commit binaries if asked to create a PR.

.github: set @gballet as codeowner for keeper (ethereum#33920)

5695fbc

core/vm: enable 8024 instructions in Amsterdam (ethereum#33928)

2726c9e

core/types: fix transaction pool price-heap comparison (ethereum#33923)

1eead2e

Fixes priceheap comparison in some edge cases. --------- Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>

p2p/tracker: fix crash in clean when tracker is stopped (ethereum#33940)

9962e2c

version: release go-ethereum v1.17.1 stable

16783c1

version: begin v1.17.2 release cycle

db7d3a4

miner: enable trie prefetcher in block builder (ethereum#33945)

773f71b

core/vm: use amsterdam jump table in lookup (ethereum#33947)

fe3a74e

Return the Amsterdam instruction set from `LookupInstructionSet` when `IsAmsterdam` is true, so Amsterdam rules no longer fall through to the Osaka jump table. --------- Co-authored-by: rjl493456442 <garyrong0905@gmail.com>

cmd, core, eth, tests: prevent state flushing in RPC (ethereum#33931)

6d99759

Fixes ethereum#33572

core: implement eip-7778: block gas accounting without refunds (ether…

6d0dd08

…eum#33593) Implements https://eips.ethereum.org/EIPS/eip-7778 --------- Co-authored-by: Gary Rong <garyrong0905@gmail.com>

cmd/geth: set default cache to 4096 (ethereum#33836)

28dad94

Mainnet was already overriding --cache to 4096. This PR just makes this the default.

cuiweixie and others added 18 commits May 10, 2026 13:03

core/txpool: use cmp.Compare instead of subtraction (ethereum#34918)

7facf9c

This fixes a theoretical overflow condition if an account has an impossibly high nonce.

internal/download: close dst on io.Copy error (ethereum#34910)

f63c265

appveyor.yml: remove appveyor configuration (ethereum#34720)

18becee

Removes the appveyor.yml since we moved to github runners. --------- Co-authored-by: Sina Mahmoodi <itz.s1na@gmail.com> Co-authored-by: Felix Lange <fjl@twurst.com>

triedb/pathdb: fix layer 5 key range in storage iterator traversal te…

934a009

…st (ethereum#34883)

core: use uint256 in core.Message (ethereum#34934)

e1047b9

Changes core.Message to use Uint256 which is faster --------- Co-authored-by: Gary Rong <garyrong0905@gmail.com>

version: release go-ethereum v1.17.3 stable (ethereum#34937)

117e067

version: start release 1.17.4 cycle (ethereum#34938)

8b39453

core: write head hash to db after snap sync is complete (ethereum#34912)

d446676

internal/ethapi: add balHash to block results (ethereum#34652)

91f8e7c

accounts/abi: fix unittest code (ethereum#34740)

6af374e

1. The test should use `!reflect.DeepEqual`.
2. Use `big.NewInt(0).SetBits([]big.Word{})` as a workaround for `DeepEqual` when the `big.Int` is zero, because unpack returns a `[]big.Word{}`.

weiihann force-pushed the feat/xatu-state-metrics-integrate branch from 6447d7a to 5e97196 Compare May 13, 2026 07:59

rjl493456442 and others added 8 commits May 13, 2026 10:53

core: introduce GasChangeHook v2 (ethereum#34946)

0494cdc

This PR introduces OnGasChangeV2 tracing hook, as the pre-requisite for landing EIP-8037. --------- Co-authored-by: Sina M <1591639+s1na@users.noreply.github.com>

add state size tracer

a8f815a

count and bytes feat: send state metrics to xatu feat: change module and add network name

update

b8c86cd

weiihann force-pushed the feat/xatu-state-metrics-integrate branch from 5e97196 to f5efc52 Compare May 14, 2026 05:02

weiihann closed this May 14, 2026

weiihann wants to merge 419 commits into samcm:bump-state-size-tracker from weiihann:feat/xatu-state-metrics-integrate


20 participants
