fix(p2p): add per-peer timeout to performRequest #376

Merged
walldiss merged 1 commit into celestiaorg:main from walldiss:fix/per-peer-timeout
Mar 6, 2026

Conversation

@walldiss (Member) commented Mar 5, 2026

Summary

  • performRequest() iterates trusted peers sequentially but previously inherited the full parent context deadline for each ex.request() call. A single slow or unresponsive peer could consume the entire remaining deadline, causing GetByHeight/Get to time out even when other trusted peers are healthy.
  • Wraps each ex.request() call with context.WithTimeout(ctx, ex.Params.RequestTimeout) (default 8s), matching how session.doRequest() already applies per-peer timeouts for range requests.
  • Adds TestExchange_PerformRequest_PeerTimeout to verify a slow first peer is skipped and the request succeeds via the second peer.

Context

Mocha light nodes fail to start within the 20s StartupTimeout when a bootstrap peer is slow (e.g., returns "not found" after 14s). Head() returns quickly, but the subsequent GetByHeight() call blocks on the slow peer for the full remaining context deadline since performRequest had no per-peer timeout.

With this fix, a slow peer is abandoned after RequestTimeout (8s) and the next trusted peer is tried immediately.

performRequest iterates trusted peers sequentially but previously
inherited the full parent context deadline for each request. A single
slow or unresponsive peer could consume the entire remaining deadline,
causing GetByHeight (and Get) to time out even when other trusted peers
are healthy.

Wrap each ex.request() call with context.WithTimeout using the existing
Params.RequestTimeout (default 8s), matching how session.doRequest()
already applies per-peer timeouts for range requests. This ensures a
slow peer is skipped after RequestTimeout and the next peer is tried
immediately.
@walldiss walldiss force-pushed the fix/per-peer-timeout branch from 017efa6 to cce003f Compare March 5, 2026 16:25
@walldiss walldiss enabled auto-merge (squash) March 6, 2026 15:24
@walldiss walldiss merged commit d8a3e80 into celestiaorg:main Mar 6, 2026
1 of 2 checks passed
2 participants