Skip to content

transport: Limit dial concurrency and bound total dialing time#538

Merged
lexnv merged 12 commits into
masterfrom
lexnv/impr-dialing
Feb 26, 2026
Merged

transport: Limit dial concurrency and bound total dialing time#538
lexnv merged 12 commits into
masterfrom
lexnv/impr-dialing

Conversation

@lexnv
Copy link
Copy Markdown
Collaborator

@lexnv lexnv commented Feb 19, 2026

This PR ensures that the TCP and WebSocket transport layers use all addresses from the address store.

  • To avoid excessive dialing, only 8 addresses are used concurrently.
  • The total timeout for the given dialing process is capped at 2 *connection_open_timeout.
  • This ensures we get to dial other addresses (as opposed to giving up entirely if the first 8 address return timeout after consuming the entire connection_open_timeout), and at the same time, this cannot stall indefinitely for minutes when dialing a peer with 50+ addresses.

Before this PR, the transport manager would give up after the first 8 addresses. This behavior is insufficient in unhealthy test networks where peers / validators have up to 64 addresses.

Part of:

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Comment thread src/transport/tcp/mod.rs
.map_err(|error| (open_address, error.into()))
}
}))
.buffer_unordered(max_parallel_dials);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm do we not limit the max adresses already here?

Comment thread src/transport/tcp/mod.rs Outdated
Err(error) => {
// Deadline for the overall dial attempt, including all retries. This is to prevent
// retry attempts from indefinitely delaying the dial result.
let deadline = tokio::time::sleep(2 * connection_open_timeout);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we make this a constant?

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Copy link
Copy Markdown
Collaborator

@dmitry-markin dmitry-markin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool! We have parity with libp2p in dialing approach now.

@lexnv lexnv merged commit 5aff19f into master Feb 26, 2026
8 checks passed
@lexnv lexnv deleted the lexnv/impr-dialing branch February 26, 2026 10:04
dmitry-markin added a commit that referenced this pull request Feb 27, 2026
## [0.13.1] - 2026-02-27

This release includes multiple fixes of transports and protocols, fixing
connection stability issues with other librariies (specifically,
[smoldot](https://github.com/smol-dot/smoldot/)) and increasing success
rates of dialing & opening substreams, especially in extreme cases when
remote nodes have a lot of private addresses published to the DHT.

### Fixed

- ping: Conform to the spec & exclude from connection keep-alive
([#416](#416))
- transport: Make accept async to close the gap on service races
([#525](#525))
- transport: Limit dial concurrency and bound total dialing time
([#538](#538))
- webrtc: Support `FIN`/`FIN_ACK` handshake for substream shutdown
([#513](#513))
- transport: Expose failed addresses to the transport manager
([#529](#529))

### Changed

- manager: Prioritize public addresses for dialing
([#530](#530))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants