Fix UI stuck on “Disconnected” during network-change engine restart#167
Fix UI stuck on “Disconnected” during network-change engine restart#167
Conversation
When EngineRestarter stopped and restarted the Go engine after a network type change, the UI only saw the engine's onDisconnected callback and had no visibility into the reconnect attempt. If the restart stalled (e.g. on a stale management RPC), the UI stayed on Disconnected for the full stall window, making it look like the client never reconnected. Emit onConnecting() from EngineRestarter at stop and at re-launch to keep the UI in the Connecting state throughout the restart, and emit onDisconnected() on error or the 30s safety timeout so a truly failed restart doesn't leave the UI stuck on Connecting.
Pin the process's outgoing sockets to the current default Android Network via ConnectivityManager.bindProcessToNetwork so fresh dials after a WiFi/cellular switch do not stall on TCP SYN retransmits through the departing interface. Skip the initial onAvailable burst fired right after registering the NetworkCallback. That burst reflects current state, not a transition, and was triggering a spurious EngineRestarter restart that cancelled the in-flight login on cold start.
📝 WalkthroughWalkthroughNetwork restart handling is enhanced through debounced scheduling and callback suppression mechanisms in Changes
Sequence Diagram(s)sequenceDiagram
participant NetworkChangeDetector
participant ConcreteNetworkAvailabilityListener
participant EngineRestarter
participant EngineRunner
participant VPNService
NetworkChangeDetector->>ConcreteNetworkAvailabilityListener: onDefaultNetworkTypeChanged(newType)
Note over ConcreteNetworkAvailabilityListener: Check lastDefaultType,<br/>validate shouldNotify
ConcreteNetworkAvailabilityListener->>EngineRestarter: onNetworkTypeChanged()
EngineRestarter->>EngineRestarter: Mark restartScheduled=true
EngineRestarter->>EngineRunner: restartEngine()
EngineRunner->>EngineRunner: Wrap ConnectionListener<br/>in FilteringConnectionListener
EngineRunner->>EngineRunner: Snapshot & suppress<br/>ServiceStateListeners
EngineRunner->>EngineRunner: Emit synthetic connecting/disconnected
Note over EngineRunner: Suppress callbacks during<br/>restart window
EngineRunner->>EngineRunner: onStarted() or onError()
EngineRunner->>EngineRunner: Restore original listener<br/>Unsuppress service listeners
EngineRestarter->>EngineRestarter: restartScheduled=false
sequenceDiagram
participant VPNService
participant EngineRestarter
participant EngineRunner
VPNService->>EngineRunner: setConnectionListener()
EngineRunner->>EngineRunner: Unwrap existing wrapper,<br/>install new ObservingConnectionListener
EngineRunner->>EngineRunner: Register connectedObservers
EngineRestarter->>EngineRunner: addOnConnectedObserver(runnable)
Note over EngineRunner: Observer triggered on<br/>onConnected() callback
EngineRunner->>EngineRestarter: Notify observer
EngineRestarter->>EngineRestarter: cancelPendingRestart()
EngineRestarter->>EngineRunner: removeOnConnectedObserver()
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Review rate limit: 0/1 reviews remaining, refill in 60 minutes.Comment |
Replace the time-based grace window with an isEngineRunning predicate. The initial onAvailable burst that Android fires right after registerNetworkCallback cannot trigger an EngineRestarter run because the engine is not up yet at that point. Tests updated accordingly; adds coverage for the engine-not-running path.
Use IMPORTANCE_LOW and explicitly clear sound/vibration on the channel so the persistent VPN notification does not play a sound or vibrate on creation or each connection state update.
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
tool/src/main/java/io/netbird/client/tool/EngineRestarter.java (1)
52-57:⚠️ Potential issue | 🟠 MajorRemove the restart listener when the timeout fires.
After the 30s timeout,
isRestartInProgressis reset andonDisconnected()is emitted, but the anonymousServiceStateListenerremains registered. If the engine stops later, that stale listener can still callrunWithoutAuth()and restart after the timeout path declared failure.🔧 Proposed fix
timeoutCallback = () -> { if (isRestartInProgress) { Log.e(LOGTAG, "engine restart timeout - forcing flag reset"); isRestartInProgress = false; + if (currentListener != null) { + engineRunner.removeServiceStateListener(currentListener); + currentListener = null; + } notifyDisconnected(); } }; @@ public void onStarted() { Log.d(LOGTAG, "engine restarted successfully"); isRestartInProgress = false; // Reset flag on success handler.removeCallbacks(timeoutCallback); // Cancel timeout engineRunner.removeServiceStateListener(this); + currentListener = null; } @@ public void onError(String msg) { Log.e(LOGTAG, "restart failed: " + msg); isRestartInProgress = false; // Resetting flag on error as well handler.removeCallbacks(timeoutCallback); // Cancel timeout engineRunner.removeServiceStateListener(this); + currentListener = null; notifyDisconnected(); }Also applies to: 65-90, 142-155
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tool/src/main/java/io/netbird/client/tool/EngineRestarter.java` around lines 52 - 57, The timeoutCallback currently resets isRestartInProgress and calls notifyDisconnected() but leaves the anonymous ServiceStateListener registered; update timeoutCallback to also unregister/remove that listener (the same instance registered earlier) when the timeout fires so the stale ServiceStateListener can't later call runWithoutAuth() and trigger another restart; apply the same fix to the other restart-timeout handlers in the file (the blocks around the ServiceStateListener registration in the 65-90 and 142-155 sections) ensuring you keep a reference to the ServiceStateListener instance so you can call the appropriate remove/unregister method when cancelling on success or timeout.tool/src/main/java/io/netbird/client/tool/ForegroundNotification.java (1)
25-31:⚠️ Potential issue | 🟠 MajorUse a new channel ID for silent/low-importance notification settings to affect existing installs.
When
createNotificationChannel()is called with an existing channel ID, Android ignores updates to sound, vibration, and lights properties—these can only be set on initial channel creation. Importance can only be lowered if the user hasn't modified channel settings. Existing users will retain their original audible behavior unless you use a different channel ID.Suggested fix
- String channelId = service.getPackageName(); + String channelId = service.getPackageName() + ".foreground.silent";🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tool/src/main/java/io/netbird/client/tool/ForegroundNotification.java` around lines 25 - 31, ForegroundNotification currently re-uses service.getPackageName() as the NotificationChannel ID which prevents changing sound/vibration for existing installs; change to a new, distinct channel ID (e.g., a constant like FOREGROUND_CHANNEL_ID_SILENT or service.getPackageName() + ".fg_silent") used when creating the NotificationChannel in the same creation block (the NotificationChannel constructor and channel.setSound/enableVibration calls) and update any places that build/post the foreground Notification to use that new channel ID so the silent/low-importance settings apply to all users.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@netbird`:
- Line 1: The submodule pointer references commit
5b09078da2ac14550741ce8731e7cf4b4a62a728 which is not reachable; verify whether
that commit exists in the official netbirdio/netbird.git or a private fork, then
update the submodule to a valid commit: check the branch containing the intended
change, fetch the correct commit hash (or switch the submodule URL if it should
point to a different repo), and update the submodule reference (e.g., via git
submodule update --init --remote or by committing the corrected SHA in the
superproject and updating .gitmodules if the URL must change) so cloning with
--recurse-submodules succeeds.
In
`@tool/src/main/java/io/netbird/client/tool/networks/NetworkChangeDetector.java`:
- Around line 63-85: The onAvailable/onLost handlers in
initDefaultNetworkCallback must be guarded by an atomic/current bound-network
state and a callback-active flag so late onLost or onAvailable after unregister
don't undo a newer binding; add a field (e.g., defaultNetworkCallbackActive) set
to true before registering the defaultNetworkCallback and set to false when
unregisterNetworkCallback begins, track the currentlyBoundDefaultNetwork (or
similar) when bindProcessToNetwork(network) succeeds, and in onLost only clear
the binding if the lost network equals the currentlyBoundDefaultNetwork and
defaultNetworkCallbackActive is true; likewise ignore onAvailable if
defaultNetworkCallbackActive is false to avoid rebinding after shutdown.
---
Outside diff comments:
In `@tool/src/main/java/io/netbird/client/tool/EngineRestarter.java`:
- Around line 52-57: The timeoutCallback currently resets isRestartInProgress
and calls notifyDisconnected() but leaves the anonymous ServiceStateListener
registered; update timeoutCallback to also unregister/remove that listener (the
same instance registered earlier) when the timeout fires so the stale
ServiceStateListener can't later call runWithoutAuth() and trigger another
restart; apply the same fix to the other restart-timeout handlers in the file
(the blocks around the ServiceStateListener registration in the 65-90 and
142-155 sections) ensuring you keep a reference to the ServiceStateListener
instance so you can call the appropriate remove/unregister method when
cancelling on success or timeout.
In `@tool/src/main/java/io/netbird/client/tool/ForegroundNotification.java`:
- Around line 25-31: ForegroundNotification currently re-uses
service.getPackageName() as the NotificationChannel ID which prevents changing
sound/vibration for existing installs; change to a new, distinct channel ID
(e.g., a constant like FOREGROUND_CHANNEL_ID_SILENT or service.getPackageName()
+ ".fg_silent") used when creating the NotificationChannel in the same creation
block (the NotificationChannel constructor and channel.setSound/enableVibration
calls) and update any places that build/post the foreground Notification to use
that new channel ID so the silent/low-importance settings apply to all users.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: b5d47af5-9d72-4650-b86d-dbd761803e47
📒 Files selected for processing (8)
netbirdtool/src/main/java/io/netbird/client/tool/EngineRestarter.javatool/src/main/java/io/netbird/client/tool/EngineRunner.javatool/src/main/java/io/netbird/client/tool/ForegroundNotification.javatool/src/main/java/io/netbird/client/tool/VPNService.javatool/src/main/java/io/netbird/client/tool/networks/ConcreteNetworkAvailabilityListener.javatool/src/main/java/io/netbird/client/tool/networks/NetworkChangeDetector.javatool/src/test/java/io/netbird/client/tool/ConcreteNetworkAvailabilityListenerUnitTest.java
Track the currently bound default network and an active flag so late onLost callbacks cannot clear a newer binding and post-unregister onAvailable callbacks cannot rebind after shutdown.
…fication # Conflicts: # netbird
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
tool/src/main/java/io/netbird/client/tool/networks/NetworkChangeDetector.java (1)
110-116:registerNetworkCallback()is not idempotent and can leak callback state.Two concerns here:
- If a caller invokes
registerNetworkCallback()twice without an interveningunregisterNetworkCallback(),registerDefaultNetworkCallback(defaultNetworkCallback)will throwIllegalArgumentException("NetworkCallback was already registered").defaultNetworkCallbackActiveis alreadytrueby then, and the firstnetworkCallbackis also re-registered — same failure mode applies toregisterNetworkCallback(..., networkCallback)on L113.- If
registerDefaultNetworkCallbackon L115 throws for any reason (e.g., transient service error),defaultNetworkCallbackActiveis left astruewhile no callback is actually subscribed. SubsequentunregisterNetworkCallback()will try to unregister an unregistered callback (caught) but the state flag is otherwise harmless. Still, wrapping L115 in try/catch that reverts the flag on failure would keep state consistent.Not blocking, but worth a small defensive tweak.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tool/src/main/java/io/netbird/client/tool/networks/NetworkChangeDetector.java` around lines 110 - 116, Make registerNetworkCallback() idempotent and keep the boolean state consistent: before calling connectivityManager.registerNetworkCallback(...) or registerDefaultNetworkCallback(...), check whether networkCallback/defaultNetworkCallback are already registered (use defaultNetworkCallbackActive and a similar flag for networkCallback) and return early if so; set the corresponding active flag only after the register call succeeds; wrap registerDefaultNetworkCallback(defaultNetworkCallback) in a try/catch that reverts defaultNetworkCallbackActive on failure and gracefully handles IllegalArgumentException/RuntimeException to avoid leaking state; ensure unregisterNetworkCallback() relies on these flags to decide whether to call connectivityManager.unregisterNetworkCallback(...).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@tool/src/main/java/io/netbird/client/tool/networks/NetworkChangeDetector.java`:
- Around line 68-108: The onAvailable/onLost callbacks in
initDefaultNetworkCallback have a TOCTOU race with unregisterNetworkCallback
because defaultNetworkCallbackActive is checked outside the bindProcessToNetwork
calls; make the gate check and the bind (and the currentlyBoundDefaultNetwork
updates) atomic by synchronizing them on a dedicated lock object (e.g., a
private final Object networkCallbackLock), i.e., wrap the checks of
defaultNetworkCallbackActive plus the subsequent
connectivityManager.bindProcessToNetwork(...) and
currentlyBoundDefaultNetwork.set/compareAndSet(...) in a
synchronized(networkCallbackLock) block, and also wrap the
unregisterNetworkCallback logic that sets defaultNetworkCallbackActive to false,
unregisters the callback, calls bindProcessToNetwork(null), and clears
currentlyBoundDefaultNetwork inside the same synchronized(networkCallbackLock)
to eliminate the race.
---
Nitpick comments:
In
`@tool/src/main/java/io/netbird/client/tool/networks/NetworkChangeDetector.java`:
- Around line 110-116: Make registerNetworkCallback() idempotent and keep the
boolean state consistent: before calling
connectivityManager.registerNetworkCallback(...) or
registerDefaultNetworkCallback(...), check whether
networkCallback/defaultNetworkCallback are already registered (use
defaultNetworkCallbackActive and a similar flag for networkCallback) and return
early if so; set the corresponding active flag only after the register call
succeeds; wrap registerDefaultNetworkCallback(defaultNetworkCallback) in a
try/catch that reverts defaultNetworkCallbackActive on failure and gracefully
handles IllegalArgumentException/RuntimeException to avoid leaking state; ensure
unregisterNetworkCallback() relies on these flags to decide whether to call
connectivityManager.unregisterNetworkCallback(...).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: f8b16d92-e50e-4c8b-95fb-329fb80c36fb
📒 Files selected for processing (1)
tool/src/main/java/io/netbird/client/tool/networks/NetworkChangeDetector.java
Add a dedicated lock and wrap the default network callback's onAvailable and onLost bodies, plus the unregister teardown, in synchronized blocks to close the TOCTOU race where a stale callback could re-bind the process after unregisterNetworkCallback had cleared the binding.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@tool/src/main/java/io/netbird/client/tool/networks/NetworkChangeDetector.java`:
- Around line 115-121: The registerNetworkCallback() method sets
defaultNetworkCallbackActive and calls
connectivityManager.registerDefaultNetworkCallback(...) outside the
networkCallbackLock, creating a race with unregisterNetworkCallback(); move the
defaultNetworkCallbackActive.set(true) and the call to
connectivityManager.registerDefaultNetworkCallback(defaultNetworkCallback)
inside the same synchronized(networkCallbackLock) block used for registering the
primary networkCallback so the flag and default callback registration are
performed atomically, mirroring unregisterNetworkCallback()'s symmetry and
preventing a leaked defaultNetworkCallback when concurrent register/unregister
occur.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: c6d65f2e-b0d1-4d0c-b801-df75e81ddbf2
📒 Files selected for processing (1)
tool/src/main/java/io/netbird/client/tool/networks/NetworkChangeDetector.java
If registerDefaultNetworkCallback ever delivers our own TUN as the default network, binding the process to it would create a routing loop. Log a warning to surface that case if it happens on any device.
Mirror the unregister teardown's locking by wrapping the active flag and registerDefaultNetworkCallback in the same synchronized block. Closes the asymmetry between register and unregister so concurrent calls cannot leak the default callback or leave the active flag inconsistent.
Detach the ConnectionListener before stopping the engine so the old engine's Disconnecting/Disconnected teardown events do not reach the UI and cause a brief visible Disconnected flash before the restart kicks in. The listener is re-attached after the new engine starts; the Go notifier delivers the current state on attach so the UI converges without our help. While the engine is detached, the EngineRestarter drives the UI itself via notifyConnecting on stop and notifyDisconnected on timeout/error.
Replace the per-network onAvailable/onLost pairing with a default-network type observation. Android sometimes skips onLost on seamless WiFi handovers, leaving the previous mechanism unable to detect the transition. The default-network callback delivers the authoritative current transport, so any change of type triggers an engine restart.
Two related changes to avoid disrupting a working connection during a network handover: - Filter Disconnecting/Disconnected events from the old engine teardown via a wrapper around ConnectionListener, and suppress per-listener the ServiceStateListener.onStopped/onStarted notifications so the UI does not flash through Disconnected during the restart window. - Subscribe to OnConnected events from the engine. If the Go core reconnects autonomously while the 2s restart debounce is still pending, cancel the restart instead of tearing down the working connection.
Picks up the fix that prevents transient JOB stream errors from being reported as a management disconnect, which would otherwise stick the UI on Connecting after the JOB stream silently reconnects.
On some devices the default network callback delivers our own TUN as the default within a VpnService process. Binding the process to that risks a routing loop. The Android default-network signal is replayed seconds later with the underlying physical network, so skipping the bind on a VPN result waits for that follow-up signal instead.
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tool/src/main/java/io/netbird/client/tool/EngineRestarter.java (1)
39-68:⚠️ Potential issue | 🟡 MinorMinor: tiny race between
onEngineReconnectedcancellation and the runnable executing.
onEngineReconnecteduseshandler.removeCallbacks(restartRunnable)to abort, but ifrestartRunnablehas already begun executing,removeCallbackswon't stop it and we'll proceed to tear down a freshly-reconnected engine. Both flow on the main looper so the window is small, but a single extra check after theisRestartInProgressguard would close it cleanly:private void restartEngine() { - restartScheduled = false; + if (!restartScheduled) { + // onEngineReconnected (or cleanup) cancelled this run before it executed. + return; + } + restartScheduled = false;Couple this with an explicit
restartScheduled = trueset insideonNetworkTypeChanged(already present) and the runnable becomes idempotent against late cancellation.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tool/src/main/java/io/netbird/client/tool/EngineRestarter.java` around lines 39 - 68, There’s a tiny race where handler.removeCallbacks(restartRunnable) in onEngineReconnected won’t stop the runnable if it already started; make the runnable idempotent by checking the cancellation flag again at its start (e.g., read restartScheduled and/or isRestartInProgress) so a late-started runnable bails out, and ensure onNetworkTypeChanged sets restartScheduled = true as noted; update restartRunnable and restartEngine to re-check restartScheduled (and respect isRestartInProgress) immediately before doing any teardown so late cancellations are honored.
🧹 Nitpick comments (2)
tool/src/main/java/io/netbird/client/tool/EngineRestarter.java (1)
75-80:FilteringConnectionListeneraccumulates across successive restarts.On
onStarted/onError/timeout we only flip the filter to passthrough mode (allowAfterFirstConnectingOrConnected()/allowAll()) but never reinstall the originalsavedListeneron the runner. The next restart then readsengineRunner.getConnectionListener()— which is the previousFilteringConnectionListener— and wraps it again, so each WiFi↔cellular handover adds another wrapper layer to the chain.Functionally each wrapper becomes transparent after release, so this isn't a correctness bug, but over many handovers it's an unbounded growth of nested delegates and
try/catchhops on every connection event. The cleanest fix is to restoresavedListeneron the success path (after the firstonConnecting/onConnectedarrives, where the filter window naturally ends) and on the error/timeout paths.♻️ One option: restore the original on terminal events
public void onStarted() { ... engineRunner.removeServiceStateListener(this); - if (filteringListener != null) { - filteringListener.allowAfterFirstConnectingOrConnected(); - } + if (filteringListener != null) { + filteringListener.allowAfterFirstConnectingOrConnected(); + // Once filtering naturally releases on the first Connecting/Connected, + // swap the wrapper out so it doesn't accumulate across restarts. + filteringListener.setOnRelease(() -> engineRunner.setConnectionListener(savedListener)); + } unsuppressAll(suppressedHolder.get()); } @@ public void onError(String msg) { ... if (filteringListener != null) { - filteringListener.allowAll(); + filteringListener.allowAll(); + engineRunner.setConnectionListener(savedListener); } ... }
FilteringConnectionListenerwould gain a one-shotsetOnRelease(Runnable)invoked from the spot that flipsdropDisconnects = false. Alternatively, on entry torestartEngineyou can unwrap any pre-existingFilteringConnectionListenerbefore re-wrapping.Also applies to: 114-117, 132-136
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tool/src/main/java/io/netbird/client/tool/EngineRestarter.java` around lines 75 - 80, The FilteringConnectionListener wrappers accumulate because restartEngine wraps whatever is returned by engineRunner.getConnectionListener() without restoring the original; fix by ensuring the original savedListener is reinstalled on terminal paths (onStarted/onError/timeout) or by unwrapping any existing FilteringConnectionListener before creating a new one: in EngineRestarter.restartEngine, detect if engineRunner.getConnectionListener() is an instance of FilteringConnectionListener and retrieve its delegate (the original savedListener) before creating a new FilteringConnectionListener, and/or add a one-shot release callback on FilteringConnectionListener that invokes engineRunner.setConnectionListener(savedListener) when the filter window ends (allowAfterFirstConnectingOrConnected()/allowAll()) so setConnectionListener(savedListener) is called on success, error, or timeout to prevent nested wrappers.tool/src/main/java/io/netbird/client/tool/networks/ConcreteNetworkAvailabilityListener.java (1)
9-9:availableNetworkTypesis unused and should be removed.This field is maintained in
onNetworkAvailable(line 29) andonNetworkLost(line 34) but never read anywhere in this class or referenced elsewhere in the codebase. Notification logic is driven bylastDefaultTypeinonDefaultNetworkTypeChanged, making this tracking redundant state from the pre-refactor logic.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tool/src/main/java/io/netbird/client/tool/networks/ConcreteNetworkAvailabilityListener.java` at line 9, Remove the redundant field availableNetworkTypes from ConcreteNetworkAvailabilityListener and any updates to it in onNetworkAvailable and onNetworkLost; keep notification logic driven solely by lastDefaultType as handled in onDefaultNetworkTypeChanged and ensure no other methods reference availableNetworkTypes before deletion.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@tool/src/main/java/io/netbird/client/tool/networks/NetworkChangeDetector.java`:
- Around line 80-94: The code currently proceeds to bindProcessToNetwork when
getNetworkCapabilities(network) returns null, which bypasses the VPN
routing-loop guard; change the early capability check in the
NetworkChangeDetector logic to treat a null NetworkCapabilities as
unknown/unsafe and skip binding (i.e., return when caps == null ||
!caps.hasCapability(NetworkCapabilities.NET_CAPABILITY_NOT_VPN)), mirroring the
behavior of checkNetworkCapabilities and ensuring bindProcessToNetwork(network)
is only attempted when capabilities are present and explicitly indicate non-VPN;
update any logging (LOGTAG) to reflect skipping due to unknown capabilities and
leave currentlyBoundDefaultNetwork.set only inside the successful bind branch.
---
Outside diff comments:
In `@tool/src/main/java/io/netbird/client/tool/EngineRestarter.java`:
- Around line 39-68: There’s a tiny race where
handler.removeCallbacks(restartRunnable) in onEngineReconnected won’t stop the
runnable if it already started; make the runnable idempotent by checking the
cancellation flag again at its start (e.g., read restartScheduled and/or
isRestartInProgress) so a late-started runnable bails out, and ensure
onNetworkTypeChanged sets restartScheduled = true as noted; update
restartRunnable and restartEngine to re-check restartScheduled (and respect
isRestartInProgress) immediately before doing any teardown so late cancellations
are honored.
---
Nitpick comments:
In `@tool/src/main/java/io/netbird/client/tool/EngineRestarter.java`:
- Around line 75-80: The FilteringConnectionListener wrappers accumulate because
restartEngine wraps whatever is returned by engineRunner.getConnectionListener()
without restoring the original; fix by ensuring the original savedListener is
reinstalled on terminal paths (onStarted/onError/timeout) or by unwrapping any
existing FilteringConnectionListener before creating a new one: in
EngineRestarter.restartEngine, detect if engineRunner.getConnectionListener() is
an instance of FilteringConnectionListener and retrieve its delegate (the
original savedListener) before creating a new FilteringConnectionListener,
and/or add a one-shot release callback on FilteringConnectionListener that
invokes engineRunner.setConnectionListener(savedListener) when the filter window
ends (allowAfterFirstConnectingOrConnected()/allowAll()) so
setConnectionListener(savedListener) is called on success, error, or timeout to
prevent nested wrappers.
In
`@tool/src/main/java/io/netbird/client/tool/networks/ConcreteNetworkAvailabilityListener.java`:
- Line 9: Remove the redundant field availableNetworkTypes from
ConcreteNetworkAvailabilityListener and any updates to it in onNetworkAvailable
and onNetworkLost; keep notification logic driven solely by lastDefaultType as
handled in onDefaultNetworkTypeChanged and ensure no other methods reference
availableNetworkTypes before deletion.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 8918a2d0-38fa-470d-91f2-7420fe32b624
📒 Files selected for processing (7)
netbirdtool/src/main/java/io/netbird/client/tool/EngineRestarter.javatool/src/main/java/io/netbird/client/tool/EngineRunner.javatool/src/main/java/io/netbird/client/tool/networks/ConcreteNetworkAvailabilityListener.javatool/src/main/java/io/netbird/client/tool/networks/NetworkAvailabilityListener.javatool/src/main/java/io/netbird/client/tool/networks/NetworkChangeDetector.javatool/src/test/java/io/netbird/client/tool/ConcreteNetworkAvailabilityListenerUnitTest.java
🚧 Files skipped from review as they are similar to previous changes (2)
- tool/src/test/java/io/netbird/client/tool/ConcreteNetworkAvailabilityListenerUnitTest.java
- tool/src/main/java/io/netbird/client/tool/EngineRunner.java
Treat a null NetworkCapabilities as unsafe and skip the bind. The previous null-tolerant check would have bypassed the VPN routing-loop guard if Android happened to return null in a race between the default-network callback firing and getNetworkCapabilities.
A debounced restart scheduled in response to a network change can fire after the user has manually started or stopped the engine, killing the user's action mid-flight (auth context canceled, restart fails, UI stays Disconnected). Cancel any pending restart before the user-facing entry points run: binder runEngine/stopEngine, broadcast stop, always-on start, and VPN permission revoke. The EngineRestarter's own internal stop+restart remains unaffected.
Reverts the bindProcessToNetwork side of f0df3f5. Pinning the process to the current default network helps when the kernel routing table lags the actual network change, but hurts when Android lingers a departing network as default for tens of seconds: every fresh socket gets stuck on a dying interface. The default-network callback now only feeds the type-change signal used for engine restart decisions; the kernel decides which interface new sockets actually use.
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tool/src/main/java/io/netbird/client/tool/networks/NetworkChangeDetector.java (1)
116-129:⚠️ Potential issue | 🔴 CriticalClear the active flag before unregistering callbacks.
unregisterNetworkCallback(networkCallback)at line 118 runs whiledefaultNetworkCallbackActiveis stilltrue, creating a window where a pendingdefaultNetworkCallback.onAvailable()can acquire the lock, see the flag as true (line 74), and notify the listener (line 100) even though shutdown has started. Set the flag tofalsefirst undernetworkCallbackLock, then unregister both callbacks.🔧 Suggested fix
public void unregisterNetworkCallback() { + synchronized (networkCallbackLock) { + defaultNetworkCallbackActive.set(false); + } try { connectivityManager.unregisterNetworkCallback(networkCallback); } catch (Exception e) { Log.e(LOGTAG, "failed to unregister network callback", e); } synchronized (networkCallbackLock) { - defaultNetworkCallbackActive.set(false); try { connectivityManager.unregisterNetworkCallback(defaultNetworkCallback); } catch (Exception e) { Log.e(LOGTAG, "failed to unregister default network callback", e); } } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tool/src/main/java/io/netbird/client/tool/networks/NetworkChangeDetector.java` around lines 116 - 129, Clear the race by setting defaultNetworkCallbackActive to false under networkCallbackLock before any unregister calls: acquire networkCallbackLock, set defaultNetworkCallbackActive.set(false), then call connectivityManager.unregisterNetworkCallback(defaultNetworkCallback) and connectivityManager.unregisterNetworkCallback(networkCallback) (each guarded with try/catch and logging) so that defaultNetworkCallback.onAvailable() cannot observe the flag as true during shutdown; update the logic around networkCallback, defaultNetworkCallback, defaultNetworkCallbackActive, networkCallbackLock, and unregisterNetworkCallback accordingly.
♻️ Duplicate comments (1)
tool/src/main/java/io/netbird/client/tool/networks/NetworkChangeDetector.java (1)
106-113:⚠️ Potential issue | 🟠 MajorRollback the active flag if default-callback registration fails.
The
registerDefaultNetworkCallback()call on Android can throw aSecurityException(especially on Android 11 due to a platform bug) orTooManyRequestsException. If it throws, thedefaultNetworkCallbackActiveflag remainstrueeven though the callback was never registered, creating a state inconsistency that leaves teardown in an incorrect state.Suggested fix
synchronized (networkCallbackLock) { defaultNetworkCallbackActive.set(true); - connectivityManager.registerDefaultNetworkCallback(defaultNetworkCallback); + try { + connectivityManager.registerDefaultNetworkCallback(defaultNetworkCallback); + } catch (RuntimeException e) { + defaultNetworkCallbackActive.set(false); + throw e; + } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tool/src/main/java/io/netbird/client/tool/networks/NetworkChangeDetector.java` around lines 106 - 113, In registerNetworkCallback(), avoid leaving defaultNetworkCallbackActive set to true if connectivityManager.registerDefaultNetworkCallback(defaultNetworkCallback) throws (e.g., SecurityException or TooManyRequestsException): perform the registerDefaultNetworkCallback call inside the synchronized (networkCallbackLock) block and only set defaultNetworkCallbackActive.set(true) after the call succeeds, or catch those specific exceptions inside the synchronized block, rollback defaultNetworkCallbackActive.set(false) and rethrow or handle/log as appropriate; ensure you reference the existing symbols (registerNetworkCallback, defaultNetworkCallbackActive, networkCallbackLock, connectivityManager.registerDefaultNetworkCallback, defaultNetworkCallback) so the flag state stays consistent on failure.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tool/src/main/java/io/netbird/client/tool/EngineRestarter.java`:
- Around line 86-95: The timeout branch currently leaves currentListener
registered so a late onStopped() can call engineRunner.runWithoutAuth(); update
the timeoutCallback (and the similar branches around the other noted blocks) to
unregister the listener before resetting state: explicitly set currentListener =
null (or call the existing unregister method if present) right after handling
filteringListener/unsuppressAll and before notifyDisconnected(savedListener),
and do the same in the other restart-failure/abort paths so that onStopped()
cannot run against a stale listener and trigger engineRunner.runWithoutAuth().
- Around line 75-80: Saved listener wrappers are being stacked on each restart
because EngineRestarter calls getConnectionListener() (EngineRestarter) which
returns an already-wrapped listener and then wraps it again in
FilteringConnectionListener before calling EngineRunner.setConnectionListener(),
causing nested wrappers; fix by unwrapping any existing wrapper before
reinstalling: when reading savedListener from
engineRunner.getConnectionListener(), detect if savedListener is an instance of
FilteringConnectionListener (or the EngineRunner anonymous wrapper) and extract
the underlying delegate (add a small accessor like
FilteringConnectionListener.getDelegate() if needed) and pass that raw
user-provided listener into engineRunner.setConnectionListener(); apply the same
unwrapping logic to the other similar sites mentioned (the blocks around lines
114–116 and 146–148) or alternatively change
EngineRunner.setConnectionListener() to ignore/wrap only raw user listeners to
prevent double-wrapping.
---
Outside diff comments:
In
`@tool/src/main/java/io/netbird/client/tool/networks/NetworkChangeDetector.java`:
- Around line 116-129: Clear the race by setting defaultNetworkCallbackActive to
false under networkCallbackLock before any unregister calls: acquire
networkCallbackLock, set defaultNetworkCallbackActive.set(false), then call
connectivityManager.unregisterNetworkCallback(defaultNetworkCallback) and
connectivityManager.unregisterNetworkCallback(networkCallback) (each guarded
with try/catch and logging) so that defaultNetworkCallback.onAvailable() cannot
observe the flag as true during shutdown; update the logic around
networkCallback, defaultNetworkCallback, defaultNetworkCallbackActive,
networkCallbackLock, and unregisterNetworkCallback accordingly.
---
Duplicate comments:
In
`@tool/src/main/java/io/netbird/client/tool/networks/NetworkChangeDetector.java`:
- Around line 106-113: In registerNetworkCallback(), avoid leaving
defaultNetworkCallbackActive set to true if
connectivityManager.registerDefaultNetworkCallback(defaultNetworkCallback)
throws (e.g., SecurityException or TooManyRequestsException): perform the
registerDefaultNetworkCallback call inside the synchronized
(networkCallbackLock) block and only set defaultNetworkCallbackActive.set(true)
after the call succeeds, or catch those specific exceptions inside the
synchronized block, rollback defaultNetworkCallbackActive.set(false) and rethrow
or handle/log as appropriate; ensure you reference the existing symbols
(registerNetworkCallback, defaultNetworkCallbackActive, networkCallbackLock,
connectivityManager.registerDefaultNetworkCallback, defaultNetworkCallback) so
the flag state stays consistent on failure.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 709606a8-dd3f-437a-8071-fc93c36b7105
📒 Files selected for processing (3)
tool/src/main/java/io/netbird/client/tool/EngineRestarter.javatool/src/main/java/io/netbird/client/tool/VPNService.javatool/src/main/java/io/netbird/client/tool/networks/NetworkChangeDetector.java
EngineRunner.setConnectionListener stacked a fresh ObservingConnectionListener around whatever it received, so EngineRestarter snapshotting the current listener and re-installing a FilteringConnectionListener around it grew the chain by one level on every restart cycle. Unwrap any prior ObservingConnectionListener inside setConnectionListener and any prior FilteringConnectionListener when EngineRestarter snapshots, so the chain stays at most two layers deep across repeated restarts. Also unregister the restart's ServiceStateListener (and clear currentListener) on the 30s timeout path so a late onStopped cannot fire runWithoutAuth against a stale listener and silently restart the engine after the timeout already gave up. Mirror the cleanup in onStarted and onError for consistency.
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tool/src/main/java/io/netbird/client/tool/EngineRestarter.java (1)
113-127: 🛠️ Refactor suggestion | 🟠 MajorFilteringConnectionListener is not restored to original on successful restart, causing wrapper accumulation.
After
onStarted(), the code callsfilteringListener.allowAfterFirstConnectingOrConnected()but does not restore the original listener viasetConnectionListener(savedListener). TheFilteringConnectionListenerremains installed (with filtering disabled).On the next restart cycle:
getConnectionListener()returnsObservingConnectionListener→FilteringConnectionListener→ originalunwrapFilter()doesn't drill throughObservingConnectionListener, so it returns theObservingConnectionListener- A new
FilteringConnectionListenerwraps thatsetConnectionListener()wraps it in anotherObservingConnectionListenerThis causes listener wrappers to accumulate with each restart. While functionally correct (events still propagate), it wastes memory and causes
connectedObserversto run multiple times.💡 Option 1: Restore original listener on success
public void onStarted() { Log.d(LOGTAG, "engine restarted successfully"); isRestartInProgress = false; handler.removeCallbacks(timeoutCallback); engineRunner.removeServiceStateListener(this); currentListener = null; - if (filteringListener != null) { - filteringListener.allowAfterFirstConnectingOrConnected(); - } + // Restore the original listener now that restart completed + if (savedListener != null) { + engineRunner.setConnectionListener(savedListener); + } unsuppressAll(suppressedHolder.get()); }💡 Option 2: Make unwrapFilter handle both wrapper types
Since
ObservingConnectionListenerisprivateinEngineRunner, this would require exposing it or adding a package-level unwrap helper inEngineRunnerthatEngineRestartercan call.Also applies to: 185-191
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tool/src/main/java/io/netbird/client/tool/EngineRestarter.java` around lines 113 - 127, The FilteringConnectionListener wrapper is never removed after a successful restart, causing wrapper accumulation; in EngineRestarter.onStarted() (and the corresponding success path around lines ~185-191) after calling filteringListener.allowAfterFirstConnectingOrConnected() and unsuppressAll(...), restore the original listener by calling engineRunner.setConnectionListener(savedListener) (or setConnectionListener(currentListenerSaved) if you store it) and clear filteringListener/currentListener as done elsewhere; ensure you reference and use the saved original listener variable used when installing the filter (and update any other success/cleanup paths to do the same) so unwrapFilter() is no longer necessary to traverse ObservingConnectionListener wrappers.
🧹 Nitpick comments (2)
tool/src/main/java/io/netbird/client/tool/VPNService.java (1)
88-90: Consistent cancellation of pending restarts across lifecycle events - minor style inconsistency.All the
cancelPendingRestart()call sites correctly ensure that any pending network-change-driven restart is cancelled when explicit engine lifecycle actions occur (stop broadcast, always-on start, revoke, user-driven start/stop). This prevents spurious restarts from interfering with explicit user/system actions.Minor observation: The null check pattern
if (engineRestarter != null)is used in the broadcast receiver (line 88) andonRevoke(line 174), but not inonStartCommand(line 115) or the binder methods (lines 203, 208). SinceengineRestarteris initialized inonCreate()before any of these methods can be invoked, the null check isn't strictly necessary anywhere. For consistency, either add null checks everywhere or remove them where present.Also applies to: 115-115, 174-176, 203-203, 208-208
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tool/src/main/java/io/netbird/client/tool/VPNService.java` around lines 88 - 90, Remove the redundant null-checks around engineRestarter so the call sites are consistent: call engineRestarter.cancelPendingRestart() directly (without if (engineRestarter != null) guards) in the broadcast receiver and onRevoke to match the existing direct calls in onStartCommand and the binder methods in VPNService; locate usages by searching for engineRestarter and cancelPendingRestart within the VPNService class and update the broadcast receiver and onRevoke sites accordingly.tool/src/main/java/io/netbird/client/tool/EngineRunner.java (1)
129-146: Wrapper stacking occurs when EngineRestarter installs its FilteringConnectionListener.The problem:
getConnectionListener()returns the wrapped listener (ObservingConnectionListener), andEngineRestarter.unwrapFilter()only removesFilteringConnectionListenerwrappers. WhenEngineRestartersnapshots and re-wraps:
savedListener = unwrapFilter(getConnectionListener())returnsObservingConnectionListener → userListener(unwrapFilter stops at non-FilteringConnectionListener types)filteringListener = new FilteringConnectionListener(savedListener)createsFilteringConnectionListener → ObservingConnectionListener → userListenersetConnectionListener(filteringListener)callsunwrap()which returns it unchanged (not an ObservingConnectionListener), then wraps it again- Final chain:
ObservingConnectionListener → FilteringConnectionListener → ObservingConnectionListener → userListenerThis causes
connectedObserversto fire twice on eachonConnectedevent. The impact is limited becauseEngineRestarter.onEngineReconnected()is idempotent (due to therestartScheduledflag), making this wasteful rather than breaking.Consider tracking the raw user-provided listener separately so
EngineRestartercan safely wrap and unwrap without stacking layers.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tool/src/main/java/io/netbird/client/tool/EngineRunner.java` around lines 129 - 146, The current setConnectionListener/unwrap design allows ObservingConnectionListener wrappers to be duplicated when EngineRestarter rewraps a saved listener; modify EngineRunner to track the raw user-provided listener separately (e.g., add a private field rawConnectionListener) so setConnectionListener(rawOrWrapped) will: 1) extract the true user listener (using or replacing unwrap if needed) and store it in rawConnectionListener, 2) create exactly one ObservingConnectionListener around that raw listener and assign that to connectionListener and goClient.setConnectionListener; also add a getter like getRawConnectionListener() so EngineRestarter can snapshot and wrap the raw listener with FilteringConnectionListener without causing stacking of ObservingConnectionListener wrappers (keep references to ObservingConnectionListener, FilteringConnectionListener, setConnectionListener, unwrap, getConnectionListener and EngineRestarter.unwrapFilter in mind).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@tool/src/main/java/io/netbird/client/tool/EngineRestarter.java`:
- Around line 113-127: The FilteringConnectionListener wrapper is never removed
after a successful restart, causing wrapper accumulation; in
EngineRestarter.onStarted() (and the corresponding success path around lines
~185-191) after calling filteringListener.allowAfterFirstConnectingOrConnected()
and unsuppressAll(...), restore the original listener by calling
engineRunner.setConnectionListener(savedListener) (or
setConnectionListener(currentListenerSaved) if you store it) and clear
filteringListener/currentListener as done elsewhere; ensure you reference and
use the saved original listener variable used when installing the filter (and
update any other success/cleanup paths to do the same) so unwrapFilter() is no
longer necessary to traverse ObservingConnectionListener wrappers.
---
Nitpick comments:
In `@tool/src/main/java/io/netbird/client/tool/EngineRunner.java`:
- Around line 129-146: The current setConnectionListener/unwrap design allows
ObservingConnectionListener wrappers to be duplicated when EngineRestarter
rewraps a saved listener; modify EngineRunner to track the raw user-provided
listener separately (e.g., add a private field rawConnectionListener) so
setConnectionListener(rawOrWrapped) will: 1) extract the true user listener
(using or replacing unwrap if needed) and store it in rawConnectionListener, 2)
create exactly one ObservingConnectionListener around that raw listener and
assign that to connectionListener and goClient.setConnectionListener; also add a
getter like getRawConnectionListener() so EngineRestarter can snapshot and wrap
the raw listener with FilteringConnectionListener without causing stacking of
ObservingConnectionListener wrappers (keep references to
ObservingConnectionListener, FilteringConnectionListener, setConnectionListener,
unwrap, getConnectionListener and EngineRestarter.unwrapFilter in mind).
In `@tool/src/main/java/io/netbird/client/tool/VPNService.java`:
- Around line 88-90: Remove the redundant null-checks around engineRestarter so
the call sites are consistent: call engineRestarter.cancelPendingRestart()
directly (without if (engineRestarter != null) guards) in the broadcast receiver
and onRevoke to match the existing direct calls in onStartCommand and the binder
methods in VPNService; locate usages by searching for engineRestarter and
cancelPendingRestart within the VPNService class and update the broadcast
receiver and onRevoke sites accordingly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: dce469c4-7c40-4618-b94d-df1c3e7d9ab1
📒 Files selected for processing (4)
tool/src/main/java/io/netbird/client/tool/EngineRestarter.javatool/src/main/java/io/netbird/client/tool/EngineRunner.javatool/src/main/java/io/netbird/client/tool/VPNService.javatool/src/main/java/io/netbird/client/tool/networks/NetworkChangeDetector.java
|
@coderabbitai full review, close the deprecated comments |
|
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
tool/src/main/java/io/netbird/client/tool/EngineRunner.java (1)
129-146: Good unwrapping, butsetConnectionListenercan still double-wrap during restarts.The unwrap logic handles
ObservingConnectionListener, but whenEngineRestarterpasses aFilteringConnectionListenerwrapping anObservingConnectionListener, this method wraps it again with anotherObservingConnectionListener. This causesconnectedObserversto be notified twice peronConnectedduring restart.The double notification is benign for the current use case (cancellation is idempotent), but could be cleaned up by extending
unwrapto also handleFilteringConnectionListener.🔧 Optional fix to prevent double-wrapping
private static ConnectionListener unwrap(ConnectionListener listener) { ConnectionListener current = listener; - while (current instanceof ObservingConnectionListener) { - current = ((ObservingConnectionListener) current).delegate; + while (true) { + if (current instanceof ObservingConnectionListener) { + current = ((ObservingConnectionListener) current).delegate; + } else if (current instanceof EngineRestarter.FilteringConnectionListener) { + current = ((EngineRestarter.FilteringConnectionListener) current).delegate; + } else { + break; + } } return current; }This would require making
FilteringConnectionListener.delegateaccessible or adding a getter.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tool/src/main/java/io/netbird/client/tool/EngineRunner.java` around lines 129 - 146, The unwrap logic in EngineRunner.setConnectionListener only strips ObservingConnectionListener, so when EngineRestarter hands a FilteringConnectionListener wrapping an ObservingConnectionListener the code double-wraps and causes duplicate notifications; update the private static unwrap(ConnectionListener) to iteratively peel both ObservingConnectionListener and FilteringConnectionListener wrappers (use FilteringConnectionListener's delegate getter or make its delegate accessible if necessary) so the returned listener is the innermost raw delegate before re-wrapping with a single ObservingConnectionListener.tool/src/main/java/io/netbird/client/tool/networks/NetworkChangeDetector.java (1)
20-21: Consider using plain boolean since all access is already synchronized.
AtomicBooleanwithnetworkCallbackLocksynchronization is redundant. Since all reads/writes todefaultNetworkCallbackActiveoccur withinsynchronized(networkCallbackLock)blocks, a plainvolatile booleanor even a regularbooleanwould suffice and be clearer.🔧 Optional simplification
- private final AtomicBoolean defaultNetworkCallbackActive = new AtomicBoolean(false); + private boolean defaultNetworkCallbackActive = false; private final Object networkCallbackLock = new Object();Then replace
.get()with direct reads and.set(...)with direct assignments within the synchronized blocks.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tool/src/main/java/io/netbird/client/tool/networks/NetworkChangeDetector.java` around lines 20 - 21, The field defaultNetworkCallbackActive is an AtomicBoolean but all access is already guarded by synchronized(networkCallbackLock), so replace AtomicBoolean defaultNetworkCallbackActive with a plain boolean (or volatile boolean if you prefer visibility guarantees) and update all usages inside synchronized(networkCallbackLock) in NetworkChangeDetector to use direct reads/assignments instead of .get()/.set(...); ensure references to defaultNetworkCallbackActive and the existing networkCallbackLock remain unchanged and that synchronized blocks continue to surround read/write operations (e.g., where methods check or toggle the callback active state).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@tool/src/main/java/io/netbird/client/tool/EngineRunner.java`:
- Around line 129-146: The unwrap logic in EngineRunner.setConnectionListener
only strips ObservingConnectionListener, so when EngineRestarter hands a
FilteringConnectionListener wrapping an ObservingConnectionListener the code
double-wraps and causes duplicate notifications; update the private static
unwrap(ConnectionListener) to iteratively peel both ObservingConnectionListener
and FilteringConnectionListener wrappers (use FilteringConnectionListener's
delegate getter or make its delegate accessible if necessary) so the returned
listener is the innermost raw delegate before re-wrapping with a single
ObservingConnectionListener.
In
`@tool/src/main/java/io/netbird/client/tool/networks/NetworkChangeDetector.java`:
- Around line 20-21: The field defaultNetworkCallbackActive is an AtomicBoolean
but all access is already guarded by synchronized(networkCallbackLock), so
replace AtomicBoolean defaultNetworkCallbackActive with a plain boolean (or
volatile boolean if you prefer visibility guarantees) and update all usages
inside synchronized(networkCallbackLock) in NetworkChangeDetector to use direct
reads/assignments instead of .get()/.set(...); ensure references to
defaultNetworkCallbackActive and the existing networkCallbackLock remain
unchanged and that synchronized blocks continue to surround read/write
operations (e.g., where methods check or toggle the callback active state).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: fdcd2a3c-6143-4b11-8009-10f03eaffb4e
📒 Files selected for processing (9)
netbirdtool/src/main/java/io/netbird/client/tool/EngineRestarter.javatool/src/main/java/io/netbird/client/tool/EngineRunner.javatool/src/main/java/io/netbird/client/tool/ForegroundNotification.javatool/src/main/java/io/netbird/client/tool/VPNService.javatool/src/main/java/io/netbird/client/tool/networks/ConcreteNetworkAvailabilityListener.javatool/src/main/java/io/netbird/client/tool/networks/NetworkAvailabilityListener.javatool/src/main/java/io/netbird/client/tool/networks/NetworkChangeDetector.javatool/src/test/java/io/netbird/client/tool/ConcreteNetworkAvailabilityListenerUnitTest.java
- Restore the original ConnectionListener on successful restart so FilteringConnectionListener wrappers do not accumulate across cycles. Drop the now-unused allowAfterFirstConnectingOrConnected hook; only one engine ever runs at a time on Android, so there is no source of late events that would justify keeping the filter past onStarted. - NetworkChangeDetector: replace AtomicBoolean with a plain boolean now that all access is guarded by networkCallbackLock. - VPNService: drop the redundant engineRestarter null-checks in the stop-broadcast receiver and onRevoke; engineRestarter is initialized in onCreate before any of these paths are reachable. - Bump netbird submodule to current main HEAD (ed828b7a).
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tool/src/main/java/io/netbird/client/tool/EngineRestarter.java`:
- Around line 86-87: The cleanup() method can permanently leave external
ServiceStateListener instances suppressed because suppressedHolder is a local
variable inside restartEngine(); make suppressedHolder a class-level
AtomicReference<List<ServiceStateListener>> (or otherwise accessible state) so
cleanup() can see and restore/unsuppress any listeners removed during a restart
window; update restartEngine() to set the shared suppressedHolder when it
suppresses listeners and update cleanup() to check that shared suppressedHolder,
re-register or re-enable those listeners (or clear suppression) and then clear
the AtomicReference to avoid leaks; apply the same change where
suppressedHolder-like locals appear (lines referenced around restartEngine(),
and the other similar blocks noted).
- Around line 39-47: The scheduling/cancel flow races because restartScheduled
and handler.postDelayed/removeCallbacks are not atomic; wrap all accesses that
set restartScheduled and invoke handler.postDelayed(handler, restartRunnable) or
handler.removeCallbacks(restartRunnable) in a single lock to make them atomic.
Add a private final Object restartLock and surround the code in
onEngineReconnected, cancelPendingRestart and the restart scheduling code paths
(e.g., onNetworkTypeChanged where restartScheduled=true and postDelayed is
called, lines referenced 302–304 and 313–318) with synchronized(restartLock)
blocks so setting restartScheduled and adding/removing the runnable happen as
one atomic operation. Ensure every place that reads restartScheduled is also
synchronized on the same restartLock.
In
`@tool/src/main/java/io/netbird/client/tool/networks/NetworkChangeDetector.java`:
- Around line 108-111: The code sets defaultNetworkCallbackActive before calling
connectivityManager.registerDefaultNetworkCallback(defaultNetworkCallback),
which can throw and leave the flag true while registration failed; change the
synchronized block so you call
connectivityManager.registerDefaultNetworkCallback(defaultNetworkCallback) first
and only set defaultNetworkCallbackActive = true after that call returns
successfully (still inside the synchronized(networkCallbackLock) block), and
optionally catch and handle registration exceptions so the flag is not modified
on failure; reference networkCallbackLock, defaultNetworkCallbackActive,
connectivityManager.registerDefaultNetworkCallback(...), and
defaultNetworkCallback (onAvailable() reads the flag).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: d783e6ad-133d-424b-814b-e5849281cdbe
📒 Files selected for processing (3)
tool/src/main/java/io/netbird/client/tool/EngineRestarter.javatool/src/main/java/io/netbird/client/tool/VPNService.javatool/src/main/java/io/netbird/client/tool/networks/NetworkChangeDetector.java
🚧 Files skipped from review as they are similar to previous changes (1)
- tool/src/main/java/io/netbird/client/tool/VPNService.java
| private void onEngineReconnected() { | ||
| // The Go core reconnected on its own; the pending restart is no | ||
| // longer needed. Cancel the debounced restart so we do not tear | ||
| // down a working connection. | ||
| if (restartScheduled) { | ||
| Log.d(LOGTAG, "engine reconnected on its own, cancelling pending restart"); | ||
| restartScheduled = false; | ||
| handler.removeCallbacks(restartRunnable); | ||
| } |
There was a problem hiding this comment.
Make restart schedule/cancel operations atomic to avoid stray restarts.
onNetworkTypeChanged() (Line 302–304) can interleave with cancelPendingRestart() / onEngineReconnected() so cancellation runs between restartScheduled=true and postDelayed(...). That leaves a runnable queued even though cancellation already happened.
Suggested fix
class EngineRestarter implements NetworkToggleListener {
+ private final Object restartScheduleLock = new Object();
@@
private void onEngineReconnected() {
@@
- if (restartScheduled) {
- Log.d(LOGTAG, "engine reconnected on its own, cancelling pending restart");
- restartScheduled = false;
- handler.removeCallbacks(restartRunnable);
- }
+ synchronized (restartScheduleLock) {
+ if (restartScheduled) {
+ Log.d(LOGTAG, "engine reconnected on its own, cancelling pending restart");
+ }
+ handler.removeCallbacks(restartRunnable);
+ restartScheduled = false;
+ }
}
@@
public void onNetworkTypeChanged() {
@@
- restartScheduled = true;
- handler.removeCallbacks(restartRunnable);
- handler.postDelayed(restartRunnable, DEBOUNCE_DELAY_MS);
+ synchronized (restartScheduleLock) {
+ restartScheduled = true;
+ handler.removeCallbacks(restartRunnable);
+ handler.postDelayed(restartRunnable, DEBOUNCE_DELAY_MS);
+ }
}
@@
public void cancelPendingRestart() {
- if (restartScheduled) {
- Log.d(LOGTAG, "external action took over engine lifecycle; cancelling pending restart");
- handler.removeCallbacks(restartRunnable);
- restartScheduled = false;
- }
+ synchronized (restartScheduleLock) {
+ if (restartScheduled) {
+ Log.d(LOGTAG, "external action took over engine lifecycle; cancelling pending restart");
+ }
+ handler.removeCallbacks(restartRunnable);
+ restartScheduled = false;
+ }
}Also applies to: 302-304, 313-318
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tool/src/main/java/io/netbird/client/tool/EngineRestarter.java` around lines
39 - 47, The scheduling/cancel flow races because restartScheduled and
handler.postDelayed/removeCallbacks are not atomic; wrap all accesses that set
restartScheduled and invoke handler.postDelayed(handler, restartRunnable) or
handler.removeCallbacks(restartRunnable) in a single lock to make them atomic.
Add a private final Object restartLock and surround the code in
onEngineReconnected, cancelPendingRestart and the restart scheduling code paths
(e.g., onNetworkTypeChanged where restartScheduled=true and postDelayed is
called, lines referenced 302–304 and 313–318) with synchronized(restartLock)
blocks so setting restartScheduled and adding/removing the runnable happen as
one atomic operation. Ensure every place that reads restartScheduled is also
synchronized on the same restartLock.
| AtomicReference<List<ServiceStateListener>> suppressedHolder = new AtomicReference<>(); | ||
|
|
There was a problem hiding this comment.
cleanup() can leave external listeners permanently suppressed.
Suppressed listeners are tracked only via local suppressedHolder in restartEngine(). If cleanup() happens during a restart window, callbacks/listeners are removed but suppressed external listeners are never unsuppressed.
Suggested fix
class EngineRestarter implements NetworkToggleListener {
@@
private Runnable timeoutCallback;
private ServiceStateListener currentListener;
+ private final AtomicReference<List<ServiceStateListener>> suppressedListenersRef = new AtomicReference<>();
@@
- AtomicReference<List<ServiceStateListener>> suppressedHolder = new AtomicReference<>();
@@
- unsuppressAll(suppressedHolder.get());
+ unsuppressAll(suppressedListenersRef.getAndSet(null));
@@
- unsuppressAll(suppressedHolder.get());
+ unsuppressAll(suppressedListenersRef.getAndSet(null));
@@
- unsuppressAll(suppressedHolder.get());
+ unsuppressAll(suppressedListenersRef.getAndSet(null));
@@
- suppressedHolder.set(suppressed);
+ suppressedListenersRef.set(suppressed);
@@
public void cleanup() {
@@
+ unsuppressAll(suppressedListenersRef.getAndSet(null));
+
if (timeoutCallback != null) {
handler.removeCallbacks(timeoutCallback);
}Also applies to: 164-170, 325-340
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tool/src/main/java/io/netbird/client/tool/EngineRestarter.java` around lines
86 - 87, The cleanup() method can permanently leave external
ServiceStateListener instances suppressed because suppressedHolder is a local
variable inside restartEngine(); make suppressedHolder a class-level
AtomicReference<List<ServiceStateListener>> (or otherwise accessible state) so
cleanup() can see and restore/unsuppress any listeners removed during a restart
window; update restartEngine() to set the shared suppressedHolder when it
suppresses listeners and update cleanup() to check that shared suppressedHolder,
re-register or re-enable those listeners (or clear suppression) and then clear
the AtomicReference to avoid leaks; apply the same change where
suppressedHolder-like locals appear (lines referenced around restartEngine(),
and the other similar blocks noted).
| synchronized (networkCallbackLock) { | ||
| defaultNetworkCallbackActive = true; | ||
| connectivityManager.registerDefaultNetworkCallback(defaultNetworkCallback); | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
Android ConnectivityManager.registerDefaultNetworkCallback: what exceptions can it throw (including already-registered callback cases), and what is the recommended failure-handling pattern?
💡 Result:
ConnectivityManager.registerDefaultNetworkCallback throws a RuntimeException (specifically ConnectivityManager.TooManyRequestsException) if the app has exceeded the limit of outstanding network requests/callbacks (typically around 100 per app/UID, shared across registerNetworkCallback, requestNetwork, registerDefaultNetworkCallback, and related methods). No exceptions are documented or evidenced for registering the same callback instance multiple times while it is already registered. The official documentation states "A NetworkCallback should be registered at most once at any time," but does not specify an exception for violations; instead, it counts toward the total limit. Stack traces show TooManyRequestsException specifically from exceeding the total count, not per-callback duplication. Unregistering an already-unregistered callback throws IllegalArgumentException: "NetworkCallback was already unregistered". Recommended failure-handling pattern: Wrap the registration in a try-catch for RuntimeException (or TooManyRequestsException). Log the error, but do not crash—treat it as a non-fatal failure (e.g., network monitoring unavailable). Always properly unregister callbacks in lifecycle methods (e.g., onDestroy/onStop) using try-catch for IllegalArgumentException to handle double-unregister safely. Use a single shared callback instance per app where possible to minimize registration count. Check activeNetwork or NetworkCapabilities before/after to fallback for connectivity status.
Citations:
- 1: https://developer.android.com/reference/android/net/ConnectivityManager
- 2: [Android] ConnectivityManager$TooManyRequestsException caused by multiple registerDefaultNetworkCallback calls fluttercommunity/plus_plugins#3756
- 3: https://developer.android.com/reference/kotlin/android/net/ConnectivityManager.NetworkCallback
- 4: NetworkCallbackConnectivityReceiver.register android.net.ConnectivityManager$TooManyRequestsException react-native-netinfo/react-native-netinfo#719
- 5: https://brightinventions.pl/blog/android-connectivity-exception-too-many-requests/
- 6: https://www.javatips.net/api/android.net.connectivitymanager.toomanyrequestsexception
- 7: https://stackoverflow.com/questions/76540108/android-net-connectivitymanagertoomanyrequestsexception
- 8: https://developer.android.com/reference/android/net/ConnectivityManager.NetworkCallback
- 9: https://stackoverflow.com/questions/53935621/connectivity-manager-unregisternetworkcallback-already-unregistered
🏁 Script executed:
fd -t f "NetworkChangeDetector.java" --exec wc -l {} \;Repository: netbirdio/android-client
Length of output: 152
🏁 Script executed:
fd -t f "NetworkChangeDetector.java" -exec cat -n {} \; | head -150 | tail -50Repository: netbirdio/android-client
Length of output: 299
🏁 Script executed:
fd -t f "NetworkChangeDetector.java" -exec sed -n '100,120p' {} \;Repository: netbirdio/android-client
Length of output: 299
🏁 Script executed:
find . -name "NetworkChangeDetector.java" -type fRepository: netbirdio/android-client
Length of output: 148
🏁 Script executed:
cat -n ./tool/src/main/java/io/netbird/client/tool/networks/NetworkChangeDetector.java | sed -n '100,120p'Repository: netbirdio/android-client
Length of output: 1018
🏁 Script executed:
cat -n ./tool/src/main/java/io/netbird/client/tool/networks/NetworkChangeDetector.java | sed -n '1,50p'Repository: netbirdio/android-client
Length of output: 2518
🏁 Script executed:
cat -n ./tool/src/main/java/io/netbird/client/tool/networks/NetworkChangeDetector.java | sed -n '110,137p'Repository: netbirdio/android-client
Length of output: 1145
🏁 Script executed:
grep -n "defaultNetworkCallbackActive" ./tool/src/main/java/io/netbird/client/tool/networks/NetworkChangeDetector.javaRepository: netbirdio/android-client
Length of output: 296
🏁 Script executed:
cat -n ./tool/src/main/java/io/netbird/client/tool/networks/NetworkChangeDetector.java | sed -n '65,85p'Repository: netbirdio/android-client
Length of output: 1412
Move defaultNetworkCallbackActive = true after successful registration to ensure atomic state.
If registerDefaultNetworkCallback(defaultNetworkCallback) throws an exception (e.g., TooManyRequestsException), the flag remains true while registration fails, leaving the lifecycle state inconsistent. The callback's onAvailable() method checks this flag (line 72) under the same lock and will see stale state. This is especially problematic in restart/stop paths.
🔧 Suggested fix
synchronized (networkCallbackLock) {
- defaultNetworkCallbackActive = true;
- connectivityManager.registerDefaultNetworkCallback(defaultNetworkCallback);
+ try {
+ connectivityManager.registerDefaultNetworkCallback(defaultNetworkCallback);
+ defaultNetworkCallbackActive = true;
+ } catch (Exception e) {
+ defaultNetworkCallbackActive = false;
+ Log.e(LOGTAG, "failed to register default network callback", e);
+ }
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@tool/src/main/java/io/netbird/client/tool/networks/NetworkChangeDetector.java`
around lines 108 - 111, The code sets defaultNetworkCallbackActive before
calling
connectivityManager.registerDefaultNetworkCallback(defaultNetworkCallback),
which can throw and leave the flag true while registration failed; change the
synchronized block so you call
connectivityManager.registerDefaultNetworkCallback(defaultNetworkCallback) first
and only set defaultNetworkCallbackActive = true after that call returns
successfully (still inside the synchronized(networkCallbackLock) block), and
optionally catch and handle registration exceptions so the flag is not modified
on failure; reference networkCallbackLock, defaultNetworkCallbackActive,
connectivityManager.registerDefaultNetworkCallback(...), and
defaultNetworkCallback (onAvailable() reads the flag).
Summary
Fixes the Android app appearing "Disconnected" during WiFi ↔ cellular roams, even though the engine eventually recovered. Also avoids unnecessary engine restarts when the Go core reconnects on its own faster than the Android debounce can fire.
Problem
Diagnosed from user logs of a WiFi → cellular → WiFi roam:
1. UI stuck on "Disconnected" during restart
EngineRestarterstops/restarts the Go engine on network changes but never emits a Connecting state. The old engine's teardown emitsDisconnectedbefore the new engine starts, leaving the UI stuck.2. Spurious restart on cold start
Android's initial
onAvailableburst afterregisterNetworkCallbackwas treated as a transition, cancelling the first login.3. Missed network handover (seamless WiFi switch)
The previous detector relied on per-network
onAvailable/onLostpairing. On seamless handovers, Android may omitonLostfor the departing WiFi, leaving the WIFI flag set and preventing restart.4. Unnecessary restart when Go reconnects on its own
The Go core reconnects management/signal/relay within ~1s of a network change, but Android's 2s debounce still triggers an engine restart, disrupting a healthy connection.
5. Pending restart fired after explicit stop / revoke
A debounced restart scheduled just before the user (or the system) stopped the VPN would still fire after the engine was already torn down, causing a spurious re-start of the engine.
Changes
Engine restart UX
Suppress old engine state events during restart
(
EngineRestarter.java,EngineRunner.java)ConnectionListenerto droponDisconnecting/onDisconnectedduring restart.onConnecting/onConnectedfrom the new engine.EngineRunnersoServiceStateListener.onStopped/onStartedare not delivered to UI consumers during restart.ConnectionListener.onConnecting()during restart for proper UI feedback.onDisconnected()only on real failure or after the 30s timeout.ServiceStateListeneron the timeout path so a lateonStoppedcannot triggerrunWithoutAuthagainst a stale listener.ObservingConnectionListener(inEngineRunner.setConnectionListener) andFilteringConnectionListener(inEngineRestarter) so wrappers do not stack on repeated restart cycles.Skip restart when engine reconnects autonomously
(
EngineRestarter.java,EngineRunner.java)EngineRunner.setConnectionListenerwraps the listener and fans outonConnectedto observers.EngineRestarterregisters an observer; if the engine reconnects while the 2s debounce is still pending, the scheduled restart is cancelled.Cancel pending restart on explicit stop
(
VPNService.java,EngineRestarter.java)EngineRestarter.cancelPendingRestart()and call it on the stop-engine broadcast, ononRevoke, onrunEngine, onstopEngine, and on the always-on start path beforerunWithoutAuth.Network detection
Default-network-based type detection
(
NetworkAvailabilityListener.java,ConcreteNetworkAvailabilityListener.java,NetworkChangeDetector.java)onDefaultNetworkTypeChanged(int)sourced from the default-network callback.onAvailable/onLostpairing for transition detection; the default-network signal is authoritative.subscribe()is treated as the current state, not a transition — avoids the cold-start spurious restart.BooleanSuppliergate (engineRunner::isRunning) further suppresses notifications until the engine is actually running.defaultNetworkCallbackActiveflag so callbacks delivered afterunregisterNetworkCallbackare ignored.Notification
(
ForegroundNotification.java)IMPORTANCE_LOW, disable sound and vibration so the persistent VPN notification does not nag the user on every state change.Summary by CodeRabbit
Bug Fixes
Improvements