Conversation
When NB_FORCE_RELAY is enabled, skip WorkerICE creation entirely, suppress ICE credentials in offer/answer messages, disable the periodic ICE candidate monitor, and fix isConnectedOnAllWay to only check relay status so the guard stops sending unnecessary offers.
…tials Track whether the remote peer includes ICE credentials in its offers/answers. When remote stops sending ICE credentials, skip ICE listener dispatch, suppress ICE credentials in responses, and exclude ICE from the guard connectivity check. When remote resumes sending ICE credentials, re-enable all ICE behavior.
… transition Fix nil pointer dereference in signalOfferAnswer when SessionID is nil (relay-only offers). Close stale ICE agent immediately when remote peer stops sending ICE credentials to avoid traffic black-hole during the ICE disconnect timeout.
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughSR watcher and connection startup now respect a "force relayed" flag: ICE monitor and ICE worker creation can be disabled, Handshaker tracks remote ICE support and omits ICE credentials when appropriate, and Guard uses a tri-state Changes
Sequence Diagram(s)sequenceDiagram
participant Engine
participant Conn
participant SRWatcher
participant Handshaker
participant RemotePeer
participant Guard
rect rgba(100,150,200,0.5)
Note over Engine,RemotePeer: Force-relayed path (ICE disabled)
Engine->>Conn: Open(forceRelay=true)
Conn->>Conn: Skip ICE worker creation
Conn->>Handshaker: Register relay-only listeners
Engine->>SRWatcher: Start(disableICEMonitor=true)
SRWatcher->>SRWatcher: Skip ICE monitor goroutine
Handshaker->>RemotePeer: Send Offer (no ICE creds)
RemotePeer->>Handshaker: Send Answer (no ICE creds)
Handshaker->>Handshaker: remoteICESupported = false
Guard->>Guard: Eval ConnStatus (relay-based)
end
rect rgba(150,200,100,0.5)
Note over Engine,RemotePeer: Normal path (ICE enabled)
Engine->>Conn: Open(forceRelay=false)
Conn->>Conn: Create ICE worker
Conn->>Handshaker: Register ICE listener
Engine->>SRWatcher: Start(disableICEMonitor=false)
SRWatcher->>SRWatcher: Start ICE monitor goroutine
Handshaker->>RemotePeer: Send Offer (with ICE creds)
RemotePeer->>Handshaker: Send Answer (with ICE creds)
Handshaker->>Handshaker: remoteICESupported = true
Guard->>Guard: Eval ConnStatus (ICE+relay -> Connected/PartiallyConnected)
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
client/internal/peer/conn.go (1)
735-749:⚠️ Potential issue | 🟠 MajorRequire at least one transport when remote ICE is disabled.
Once
RemoteICESupported()is false, this path skips the ICE gate. If relay is also unsupported for the peer, the function now falls through totrue, so the guard stops re-offering even though there is no working transport.💡 Suggested fix
func (conn *Conn) isConnectedOnAllWay() (connected bool) { @@ - // For non-forced platforms: check ICE connection status only if remote peer supports ICE - if conn.handshaker.RemoteICESupported() { + relaySupported := conn.workerRelay.IsRelayConnectionSupportedWithPeer() + + // For non-forced platforms: check ICE connection status only if remote peer supports ICE. + // If the remote disabled ICE, relay has to be available instead. + if conn.handshaker.RemoteICESupported() { if conn.statusICE.Get() == worker.StatusDisconnected && !conn.workerICE.InProgress() { return false } + } else if !relaySupported { + return false } // If relay is supported with peer, it must also be connected - if conn.workerRelay.IsRelayConnectionSupportedWithPeer() { + if relaySupported { if conn.statusRelay.Get() == worker.StatusDisconnected { return false } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/internal/peer/conn.go` around lines 735 - 749, The current transport-readiness check incorrectly returns true when RemoteICESupported() is false and the peer also doesn't support relay; update the logic in the readiness function that contains conn.handshaker.RemoteICESupported(), conn.statusICE, and conn.workerRelay.IsRelayConnectionSupportedWithPeer() so that when RemoteICESupported() is false you still require at least one working transport: if relay is not supported with the peer (workerRelay.IsRelayConnectionSupportedWithPeer() == false) then return false; otherwise continue to validate the relay status as currently done (i.e., require statusRelay != StatusDisconnected).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@client/internal/peer/conn.go`:
- Around line 735-749: The current transport-readiness check incorrectly returns
true when RemoteICESupported() is false and the peer also doesn't support relay;
update the logic in the readiness function that contains
conn.handshaker.RemoteICESupported(), conn.statusICE, and
conn.workerRelay.IsRelayConnectionSupportedWithPeer() so that when
RemoteICESupported() is false you still require at least one working transport:
if relay is not supported with the peer
(workerRelay.IsRelayConnectionSupportedWithPeer() == false) then return false;
otherwise continue to validate the relay status as currently done (i.e., require
statusRelay != StatusDisconnected).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 97653c97-31db-42e6-973f-df416c625882
📒 Files selected for processing (6)
client/internal/engine.goclient/internal/peer/conn.goclient/internal/peer/env.goclient/internal/peer/guard/sr_watcher.goclient/internal/peer/handshaker.goclient/internal/peer/signaler.go
Ensure the relay connection is supported with the peer when ICE is disabled to prevent connectivity issues.
…ry (#5828) * [client] Add tri-state connection status to guard for smarter ICE retry Refactor isConnectedOnAllWay to return a ConnStatus enum (Connected, Disconnected, PartiallyConnected) instead of a boolean. When relay is up but ICE is not (PartiallyConnected), limit ICE offers to 3 retries with exponential backoff then fall back to hourly attempts, reducing unnecessary signaling traffic. Fully disconnected peers continue to retry aggressively. External events (relay/ICE disconnect, signal/relay reconnect) reset retry state to give ICE a fresh chance.
Split iceRetryState.attempt into shouldRetry (pure predicate) and enterHourlyMode (explicit state transition) so the caller in reconnectLoopWithRetry reads top-to-bottom. Restore the original trace-log behavior in isConnectedOnAllWay so it only logs on full disconnection, not on the new PartiallyConnected state.
Split isConnectedOnAllWay into a thin method that snapshots state and a pure evalConnStatus helper that takes a connStatusInputs struct, so the tri-state decision logic can be exercised without constructing full Worker or Handshaker objects. Add table-driven tests covering force-relay, ICE-unavailable and fully-available code paths, plus unit tests for iceRetryState budget/hourly transitions and reset.
There was a problem hiding this comment.
🧹 Nitpick comments (2)
client/internal/peer/conn.go (2)
620-629: Minor: defensive nil-check forworkerICEinonWGDisconnected.In force-relay mode
conn.workerICEisnil. TheICEP2P/ICETurnbranch is today unreachable in that mode (priority can only beNoneorRelaywhen there is no ICE worker), so this isn't an active bug — but the rest of the file (lines 253, 298) now guardsworkerICEexplicitly, so it would be consistent (and future-proof against a refactor that introduces an ICE priority without a worker) to guard here too.🛡️ Defensive guard
case conntype.ICEP2P, conntype.ICETurn: - conn.workerICE.Close() + if conn.workerICE != nil { + conn.workerICE.Close() + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/internal/peer/conn.go` around lines 620 - 629, The ICE branch in the switch on conn.currentConnPriority calls conn.workerICE.Close() without a nil-check; make it consistent with other uses by guarding before calling Close (e.g., if conn.workerICE != nil { conn.workerICE.Close() } else { conn.Log.Debugf(...) }) so that when priority is ICEP2P or ICETurn we only call Close on an existing workerICE; keep the Relay branch behavior (workerRelay.Close() and conn.handleRelayDisconnectedLocked()) unchanged.
738-746: Nit: renameiceStatusConnecting— it covers Connected too.
iceStatusConnectingis populated asconn.statusICE.Get() != worker.StatusDisconnected, so it is true in both connecting and connected states. The name reads as "only connecting", which is misleading for readers ofevalConnStatus(where it's OR'd withiceInProgressto formiceUp). ConsidericeStatusActiveoriceNotDisconnectedto match the semantics. No functional change.♻️ Rename suggestion
- return evalConnStatus(connStatusInputs{ - forceRelay: IsForceRelayed(), - peerUsesRelay: conn.workerRelay.IsRelayConnectionSupportedWithPeer(), - relayConnected: conn.statusRelay.Get() == worker.StatusConnected, - remoteSupportsICE: conn.handshaker.RemoteICESupported(), - iceWorkerCreated: iceWorkerCreated, - iceStatusConnecting: conn.statusICE.Get() != worker.StatusDisconnected, - iceInProgress: iceInProgress, - }) + return evalConnStatus(connStatusInputs{ + forceRelay: IsForceRelayed(), + peerUsesRelay: conn.workerRelay.IsRelayConnectionSupportedWithPeer(), + relayConnected: conn.statusRelay.Get() == worker.StatusConnected, + remoteSupportsICE: conn.handshaker.RemoteICESupported(), + iceWorkerCreated: iceWorkerCreated, + iceStatusActive: conn.statusICE.Get() != worker.StatusDisconnected, + iceInProgress: iceInProgress, + })(Also update
connStatusInputs,evalConnStatus, and the test file accordingly.)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/internal/peer/conn.go` around lines 738 - 746, Rename the misleading field name iceStatusConnecting in connStatusInputs and its usage sites (including the call site in conn.go and the evalConnStatus function) to something that reflects it is true for both connecting and connected states, e.g., iceStatusActive or iceNotDisconnected; update the struct definition of connStatusInputs, the parameter name in evalConnStatus, all references in conn.go (the call that constructs connStatusInputs), and any tests that assert on this field so they use the new name—no behavior changes, only a consistent rename across connStatusInputs, evalConnStatus, and tests.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@client/internal/peer/conn.go`:
- Around line 620-629: The ICE branch in the switch on conn.currentConnPriority
calls conn.workerICE.Close() without a nil-check; make it consistent with other
uses by guarding before calling Close (e.g., if conn.workerICE != nil {
conn.workerICE.Close() } else { conn.Log.Debugf(...) }) so that when priority is
ICEP2P or ICETurn we only call Close on an existing workerICE; keep the Relay
branch behavior (workerRelay.Close() and conn.handleRelayDisconnectedLocked())
unchanged.
- Around line 738-746: Rename the misleading field name iceStatusConnecting in
connStatusInputs and its usage sites (including the call site in conn.go and the
evalConnStatus function) to something that reflects it is true for both
connecting and connected states, e.g., iceStatusActive or iceNotDisconnected;
update the struct definition of connStatusInputs, the parameter name in
evalConnStatus, all references in conn.go (the call that constructs
connStatusInputs), and any tests that assert on this field so they use the new
name—no behavior changes, only a consistent rename across connStatusInputs,
evalConnStatus, and tests.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 5a4bb3d8-1696-49dd-8e63-ce610260d2ba
📒 Files selected for processing (6)
client/internal/peer/conn.goclient/internal/peer/conn_status.goclient/internal/peer/conn_status_eval_test.goclient/internal/peer/guard/guard.goclient/internal/peer/guard/ice_retry_state.goclient/internal/peer/guard/ice_retry_state_test.go
✅ Files skipped from review due to trivial changes (1)
- client/internal/peer/conn_status.go
mlsmaycon
left a comment
There was a problem hiding this comment.
pending title and description update
Done |
|



Describe your changes
Suppress ICE signaling and periodic offers in force-relay mode
Optimize ICE retry strategy
In case established Relayed connection will stop ICE signaling after some retry.
When relay is up but ICE is not (
PartiallyConnected):When fully disconnected:
Improved resiliency:
Issue ticket number and link
Stack
Checklist
Documentation
Select exactly one:
Docs PR URL (required if "docs added" is checked)
Paste the PR link from https://github.com/netbirdio/docs here:
https://github.com/netbirdio/docs/pull/__
Summary by CodeRabbit
Refactor
Bug Fixes
Tests