chore: update to boxo without goprocess#10567
Conversation
0089a37 to
d63ac5e
Compare
d63ac5e to
37e9bee
Compare
|
Triage note: ok, just make sure CI is green |
b2184d8 to
966d77a
Compare
There was a problem hiding this comment.
I think I've tracked down why prometheus metrics tests were failing.
When manually starting ipfs daemon, it produces errors:
2024-11-18T21:07:43.009+0100 ERROR metrics-prometheus go-metrics-prometheus@v0.0.3/binding.go:61 Registering prometheus collector, name: <no-scope>_pending_tasks, error: descriptor Desc{fqName: "<no-scope>_pending_tasks", help: "Total number of pending tasks", constLabels: {}, variableLabels: {}} is invalid: "<no-scope>_pending_tasks" is not a valid metric name
2024-11-18T21:07:43.013+0100 ERROR metrics-prometheus go-metrics-prometheus@v0.0.3/binding.go:61 Registering prometheus collector, name: <no-scope>_active_tasks, error: descriptor Desc{fqName: "<no-scope>_active_tasks", help: "Total number of active tasks", constLabels: {}, variableLabels: {}} is invalid: "<no-scope>_active_tasks" is not a valid metric name
2024-11-18T21:07:43.013+0100 ERROR metrics-prometheus go-metrics-prometheus@v0.0.3/binding.go:61 Registering prometheus collector, name: <no-scope>_pending_block_tasks, error: descriptor Desc{fqName: "<no-scope>_pending_block_tasks", help: "Total number of pending blockstore tasks", constLabels: {}, variableLabels: {}} is invalid: "<no-scope>_pending_block_tasks" is not a valid metric name
2024-11-18T21:07:43.013+0100 ERROR metrics-prometheus go-metrics-prometheus@v0.0.3/binding.go:61 Registering prometheus collector, name: <no-scope>_active_block_tasks, error: descriptor Desc{fqName: "<no-scope>_active_block_tasks", help: "Total number of active blockstore tasks", constLabels: {}, variableLabels: {}} is invalid: "<no-scope>_active_block_tasks" is not a valid metric name
Likely due to how https://github.com/ipfs/boxo/blob/625aadd0c7a669f2df698c687a4cd22c2f78ffd1/bitswap/metrics/metrics.go#L33 works (called before context has all info?).
We may need to refactor the way bitswap metrics are registered, avoid magic, use prometheus.DefaultRegisterer by default and then allow override via WithPrometheusRegistry(prometheus.NewRegistry()) if needed (eg. in parallel tests) – some prior art in ipfs/boxo#722.
Unsure if related, second problem is shutdown crashing.
Trying to kill ipfs deamon with ctrl+c produces panic, there seem to be goprocess still somewhere:
Run 'ipfs id' to inspect announced and discovered multiaddrs of this node.
panic: Process cannot add children after being closed
goroutine 1 [running]:
github.com/jbenet/goprocess.(*process).AddChild(0xc000c38d80, {0x3663ea8, 0xc001d1ab40})
github.com/jbenet/goprocess@v0.1.4/impl-mutex.go:99 +0x24d
github.com/ipfs/kubo/cmd/ipfs/kubo.daemonFunc(0xc0003f0000, {0x6?, 0xc0000f0050?}, {0x2e15620, 0xc0004b40a0})
github.com/ipfs/kubo/cmd/ipfs/kubo/daemon.go:515 +0x1aa8
github.com/ipfs/go-ipfs-cmds.(*executor).Execute(0x489a100?, 0xc0003f0000, {0x7ff5b2869578, 0xc0000c7c80}, {0x2e15620, 0xc0004b40a0})
github.com/ipfs/go-ipfs-cmds@v0.14.0/executor.go:88 +0x126
github.com/ipfs/kubo/cmd/ipfs/kubo.tracingWrappedExecutor.Execute({{0x36337a0?, 0xc00028a078?}}, 0xc0003f0000, {0x7ff5b2869578, 0xc0000c7c80}, {0x2e15620, 0xc0004b40a0})
github.com/ipfs/kubo/cmd/ipfs/kubo/start.go:375 +0x3eb
github.com/ipfs/go-ipfs-cmds/cli.Run({0x364f258?, 0xc0004434f0?}, 0x49faf20, {0xc000160000, 0x2, 0x2}, 0x6e0?, 0xc00012e028, 0xc00012e030, 0x33b6370, ...)
github.com/ipfs/go-ipfs-cmds@v0.14.0/cli/run.go:137 +0x99c
github.com/ipfs/kubo/cmd/ipfs/kubo.Start(0x33b6370)
github.com/ipfs/kubo/cmd/ipfs/kubo/start.go:213 +0x516
main.main()
github.com/ipfs/kubo/cmd/ipfs/main.go:10 +0x1a
9e4395a to
896ec7b
Compare
|
I updated this PR to use a "fixed" PR version of boxo to fix the metrics registration issue. Another small fix in the daemon code to remove the panic on early ctrl-c before daemon ready. The change in boxo was to pass the context into the server decision engine. I would prefer not relying on long-lived contexts to do things like metrics registration. |
lidel
left a comment
There was a problem hiding this comment.
Thanks, both issues are fixed, and helia-interop was fixed upstream.
Switched to commit from boxo main. I'm going to merge this so we have more time to catch issues (if any).
Uh oh!
There was an error while loading. Please reload this page.