transports: Improves the robustness and success rate of connection dialing #495

Merged
lexnv merged 9 commits into master from lexnv/elegant-dialing
Dec 9, 2025
@lexnv lexnv commented Dec 4, 2025

This PR improves the robustness and success rate of connection dialing by folding peer negotiation (multistream-select) into the "open" (socket-to-socket connectivity) stage of the TCP and WebSocket transports.
This avoids prematurely cancelling dials that could still succeed.

litep2p establishes a connection in two sequential stages:

  • Step 1: Socket-to-socket connection (raw TCP connection)
  • Step 2: Peer negotiation (multistream-select protocol)

Previously, the transport manager would cancel all pending dials immediately after the first successful completion of Step 1.

Timeline:

  • T0: The Transport Manager dials the peer on TCP Address A and WebSocket Address B.
  • T1: Address A establishes a raw socket connection (Step 1). The Transport Manager immediately cancels the dial to Address B.
  • T2: Address A proceeds to peer negotiation (Step 2).
  • T3: Negotiation on Address A fails.

Because the manager cancels all other attempts at T1, the dial is reported as failed even though only one address was ever fully tried. Address B never gets the chance to complete.
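The difference between the two cancellation policies can be sketched with a small model. This is only an illustration, not code from litep2p: `DialOutcome` and both functions are hypothetical names, and the slice is assumed to list addresses in the order in which their sockets would connect.

```rust
/// Hypothetical per-address result; not a litep2p type.
#[derive(Clone, Copy, Debug, PartialEq)]
enum DialOutcome {
    /// Step 1 failed: the socket never connected.
    SocketFailed,
    /// Step 1 succeeded, Step 2 (multistream-select) failed.
    NegotiationFailed,
    /// Both steps succeeded.
    Negotiated,
}

/// Old policy: the first address to finish Step 1 wins and every other
/// dial is cancelled before Step 2 has run.
fn dial_cancel_after_socket(outcomes: &[DialOutcome]) -> Option<usize> {
    // First address whose socket connected.
    let winner = outcomes
        .iter()
        .position(|o| *o != DialOutcome::SocketFailed)?;
    // All other dials are gone by now, so the result depends on the winner alone.
    (outcomes[winner] == DialOutcome::Negotiated).then_some(winner)
}

/// New policy: an address only counts once Step 1 and Step 2 both succeeded,
/// so a failed negotiation does not cancel the remaining dials.
fn dial_cancel_after_negotiation(outcomes: &[DialOutcome]) -> Option<usize> {
    outcomes.iter().position(|o| *o == DialOutcome::Negotiated)
}
```

For the timeline above, with Address A as `NegotiationFailed` and Address B assumed to be `Negotiated`, the first function returns `None` and the second returns `Some(1)`.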

To improve the robustness of the dialing process, Step 2 is merged with Step 1.

Implementation Details

Any installed protocol (TCP / WebSocket) must comply with the following interface:

pub(crate) trait Transport: Stream + Unpin + Send {
    /// Dial `address` and negotiate connection.
    fn dial(&mut self, connection_id: ConnectionId, address: Multiaddr) -> crate::Result<()>;

    /// Accept negotiated connection.
    fn accept(&mut self, connection_id: ConnectionId) -> crate::Result<()>;

    /// Accept pending connection.
    fn accept_pending(&mut self, connection_id: ConnectionId) -> crate::Result<()>;

    /// Reject pending connection.
    fn reject_pending(&mut self, connection_id: ConnectionId) -> crate::Result<()>;

    /// Reject negotiated connection.
    fn reject(&mut self, connection_id: ConnectionId) -> crate::Result<()>;

    /// Attempt to open connection to remote peer over one or more addresses.
    fn open(&mut self, connection_id: ConnectionId, addresses: Vec<Multiaddr>)
        -> crate::Result<()>;

    /// Negotiate opened connection.
    fn negotiate(&mut self, connection_id: ConnectionId) -> crate::Result<()>;

    /// Cancel opening connections.
    ///
    /// This is a no-op for connections that have already succeeded/canceled.
    fn cancel(&mut self, connection_id: ConnectionId);
}

The TCP and WebSocket transports now behave as follows:

  • fn open integrates Step 1 and Step 2 (socket connection followed by peer negotiation)
  • TCP Address A is reported to the manager only after the negotiation phase
  • fn negotiate forwards the already negotiated connection to the manager immediately (effectively a no-op, since negotiation was already handled in open)
  • WebSocket Address B is correctly dropped since a negotiated connection already exists on the TCP socket
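The behaviour listed above can be sketched as a small synchronous model. This is a hedged illustration and not the actual `TcpTransport` code: `SketchTransport`, `Attempt`, `Event`, and the `usize` stand-in for `ConnectionId` are all assumptions made for the example, and the real transports drive these steps as asynchronous futures.

```rust
use std::collections::{HashMap, VecDeque};

/// Stand-in for litep2p's `ConnectionId` (assumption: a plain integer).
type ConnectionId = usize;

/// Hypothetical dial attempt with a scripted result for each step.
struct Attempt {
    address: &'static str,
    socket_ok: bool,      // Step 1 result
    negotiation_ok: bool, // Step 2 result
}

/// Hypothetical events for the manager; not litep2p's real event type.
#[derive(Debug, PartialEq)]
enum Event {
    Opened(ConnectionId, &'static str),
    OpenFailed(ConnectionId),
    Established(ConnectionId, &'static str),
}

#[derive(Default)]
struct SketchTransport {
    /// Connections that completed Step 1 and Step 2 inside `open`.
    opened: HashMap<ConnectionId, &'static str>,
    /// Events waiting to be picked up by the manager.
    events: VecDeque<Event>,
}

impl SketchTransport {
    /// Runs both steps per address and reports the first address that passes both.
    fn open(&mut self, id: ConnectionId, attempts: &[Attempt]) {
        match attempts.iter().find(|a| a.socket_ok && a.negotiation_ok) {
            Some(winner) => {
                // The remaining attempts are dropped only at this point.
                self.opened.insert(id, winner.address);
                self.events.push_back(Event::Opened(id, winner.address));
            }
            None => self.events.push_back(Event::OpenFailed(id)),
        }
    }

    /// Negotiation already ran in `open`, so this only forwards the stored connection.
    fn negotiate(&mut self, id: ConnectionId) -> Result<(), String> {
        let address = self
            .opened
            .remove(&id)
            .ok_or_else(|| format!("connection {id} was not opened"))?;
        self.events.push_back(Event::Established(id, address));
        Ok(())
    }
}
```

In this model, an address whose socket connects but whose negotiation fails never produces an `Opened` event, so the manager has no reason to cancel the other dials early.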

Closes: #232

lexnv added 6 commits December 3, 2025 19:06
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
@dmitry-markin dmitry-markin left a comment

Nicely done here! 🚀

Comment thread src/transport/tcp/mod.rs
@@ -496,52 +513,12 @@ impl Transport for TcpTransport {
}

fn negotiate(&mut self, connection_id: ConnectionId) -> crate::Result<()> {
Maybe we can delete the negotiate fn from the trait completely, highlighting the fact that negotiation should happen as part of open()?

And likely rename self.opened_raw -> self.opened.

lexnv added 3 commits December 9, 2025 12:47
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
@lexnv lexnv merged commit e9f5555 into master Dec 9, 2025
8 checks passed
@lexnv lexnv deleted the lexnv/elegant-dialing branch December 9, 2025 13:12
lexnv added a commit that referenced this pull request Dec 16, 2025
## [0.12.3] - 2025-12-16

This release improves the robustness of the multistream-select
negotiation over WebRTC transport and fixes inbound bandwidth metering
on substreams. It also enhances the dialing success rate by improving
the transport dialing logic. Additionally, it re-exports CID's multihash
to facilitate the construction of CID V1.

### Changed

- transports: Improves the robustness and success rate of connection
dialing ([#495](#495))
- types: Re-export cid's multihash to construct CID V1
([#491](#491))

### Fixed

- fix: multistream-select negotiation on outbound substream over webrtc
([#465](#465))
- substream: Fix inbound bandwidth metering
([#499](#499))

---------

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
github-merge-queue Bot pushed a commit to paritytech/polkadot-sdk that referenced this pull request Dec 17, 2025
paritytech-release-backport-bot Bot pushed a commit to paritytech/polkadot-sdk that referenced this pull request Dec 17, 2025
(cherry picked from commit a12ec9c)
Development

Successfully merging this pull request may close these issues.

transport/manager: Refill dial addresses after a dial failure to make dialing more robust