Skip to content

Comments

op-supernode: multiplex metrics by chain_id label and serve at a single endpoint#18750

Merged
geoknee merged 14 commits intodevelopfrom
gk/supernode-metrics
Jan 16, 2026
Merged

op-supernode: multiplex metrics by chain_id label and serve at a single endpoint#18750
geoknee merged 14 commits intodevelopfrom
gk/supernode-metrics

Conversation

@geoknee
Copy link
Contributor

@geoknee geoknee commented Jan 12, 2026

Closes #18746

Tested manually, using op-supernode wired to op-sync-tester itself talking to opm / unichain RPCs.

Snapshot of raw prometheus output:

Screenshot 2026-01-12 at 20 11 08 Screenshot 2026-01-12 at 20 13 06

which I also piped to promtool check metrics, which found only some preexisting issues with op_node metrics.

➜  prometheus-3.9.1.darwin-arm64 curl -s http://localhost:7300/metrics \
| tee out |  docker run --rm -i mbenabda/promtool:v3.9.0-rc.0-alpine promtool check metrics
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
op_node_supernode_channel_input_bytes counter metrics should have "_total" suffix
op_node_supernode_events_dequeued counter metrics should have "_total" suffix
op_node_supernode_events_emitted counter metrics should have "_total" suffix
op_node_supernode_events_process_time counter metrics should have "_total" suffix
op_node_supernode_events_processed counter metrics should have "_total" suffix
op_node_supernode_l2_source_cache_add counter metrics should have "_total" suffix
op_node_supernode_l2_source_cache_get counter metrics should have "_total" suffix
op_node_supernode_p2p_accepts counter metrics should have "_total" suffix
op_node_supernode_p2p_bandwidth_bytes_total non-counter metrics should not have "_total" suffix
op_node_supernode_p2p_dials counter metrics should have "_total" suffix
op_node_supernode_p2p_ip_unbans counter metrics should have "_total" suffix
op_node_supernode_p2p_payloads_quarantine_total non-counter metrics should not have "_total" suffix
op_node_supernode_p2p_peer_count non-histogram and non-summary metrics should not have "_count" suffix
op_node_supernode_p2p_peer_unbans counter metrics should have "_total" suffix
op_node_supernode_p2p_stream_count non-histogram and non-summary metrics should not have "_count" suffix

@geoknee geoknee added the A-op-supernode Area: op-supernode label Jan 12, 2026
@geoknee geoknee requested a review from axelKingsley January 12, 2026 20:20
@geoknee geoknee marked this pull request as ready for review January 13, 2026 11:44
@geoknee geoknee requested review from a team as code owners January 13, 2026 11:44
Change NewMetrics to accept prometheus.Labels and, when provided, wrap
the registry with those labels. Update the factory.With signature to
accept a prometheus.Registerer. Plumb the new NewMetrics signature
through call sites, add VIRTUAL_NODE_CHAIN_ID_LABEL in op-supernode and
pass the chain ID label for virtual node metrics
Add op-supernode/supernode/metrics with a Metrics type and StartServer
helper. Replace per-chain metrics router/service with a single metrics
server managed by Supernode. Remove setMetricsHandler from
ChainContainer and update tests. Add initMetricsServer to handle metrics
startup and shutdown, and minor TODO comment in op-node initialization.
Rename setMetricsHandler to addMetricsHandler and adjust its signature
to accept only an http.Handler. Update NewChainContainer and call sites
accordingly. MetricsRouter now keeps a slice of handlers and invokes
each for metrics requests, removing per-chain path normalization.
MetricsRouter now accepts prometheus.Gatherer instances and builds a
single promhttp.Handler that aggregates all registered gatherers.
ChainContainer and supernode updated to pass registries (AddRegistry)
instead of per-chain HTTP handlers.
Expand comment to explain that MetricsRouter serves multiple Prometheus
Gatherers from a single HTTP server and warn that Gatherers must not
collide (e.g. use unique names or a distinct global label)
Rename AddRegistry to AddMetricsRegistry and make it accept a chain key
so metrics registries are stored in a map and then aggregated into a
Gatherers slice. Update chain container and supernode to pass
chainID.String() when registering per-chain metrics.
@geoknee geoknee force-pushed the gk/supernode-metrics branch from 5929f0d to 04c40c1 Compare January 13, 2026 13:20
Update metrics fan-in comment to mention the /metrics path. Add a unit
test to verify multiple registries are merged and served. Update
supernode to call the new method name.
@geoknee geoknee added this pull request to the merge queue Jan 15, 2026
@geoknee geoknee removed this pull request from the merge queue due to a manual request Jan 15, 2026
@jelias2 jelias2 added this pull request to the merge queue Jan 15, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to no response for status checks Jan 15, 2026
@geoknee geoknee enabled auto-merge January 15, 2026 20:32
@geoknee geoknee disabled auto-merge January 16, 2026 10:01
@geoknee geoknee enabled auto-merge January 16, 2026 10:01
@geoknee geoknee added this pull request to the merge queue Jan 16, 2026
@sebastianst sebastianst removed this pull request from the merge queue due to a manual request Jan 16, 2026
@geoknee geoknee added this pull request to the merge queue Jan 16, 2026
Merged via the queue into develop with commit 2ad8d90 Jan 16, 2026
91 checks passed
@geoknee geoknee deleted the gk/supernode-metrics branch January 16, 2026 10:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-op-supernode Area: op-supernode

Projects

None yet

Development

Successfully merging this pull request may close these issues.

op-supernode: multiplex virtual node metrics with labels, not url path

3 participants