[client] Fix exit node menu not refreshing on Windows #5553
Conversation
TrayOpenedCh is not implemented in the systray library on Windows, so exit nodes were never refreshed after the initial connect. Combined with the management sync not having populated routes yet when the Connected status fires, this caused the exit node menu to remain empty permanently after disconnect/reconnect cycles. Add a background poller on Windows that refreshes exit nodes while connected, with fast initial polling to catch routes from management sync followed by a steady 10s interval. On macOS/Linux, TrayOpenedCh continues to handle refreshes on each tray open. Also fix a data race on connectClient assignment in the server's connect() method and add nil checks in CleanState/DeleteState to prevent panics when connectClient is nil.
📝 Walkthrough
Adds mutex-protected assignment for the server's connectClient, guards state cleanup against a nil connectClient, adds tests for concurrent connectClient access and edge cases, and refactors the exit-node refresh to use a cancelable, OS-aware poller and a boolean-returning update flow.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
client/ui/network.go (1)
368-419: ⚠️ Potential issue | 🟠 Major: Cancellation doesn't stop the in-flight refresh from mutating the tray.
`pollExitNodes()` checks `ctx.Done()` only between iterations. If disconnect cancels the poller while `updateExitNodes()` is already fetching routes, that call still recreates the menu and can re-enable stale exit nodes after `cancelExitNodeRetry()` ran. Pass the polling context through to `updateExitNodes()` and bail again before touching the menu.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/ui/network.go` around lines 368 - 419, Pass the polling context into updateExitNodes and abort menu mutation if the context is cancelled: change pollExitNodes to call s.updateExitNodes(ctx) and updateExitNodes to accept ctx context.Context; after long-running steps (after getSrvClient/getExitNodes return) check ctx.Done()/ctx.Err() and if cancelled return false without touching s.recreateExitNodeMenu or changing mExitNode/mExitNodeItems; keep the s.exitNodeMu.Lock/Unlock around only the menu mutation and ensure you bail before acquiring the lock when ctx is cancelled.
🧹 Nitpick comments (1)
client/server/server_connect_test.go (1)
28-88: These tests won't fail if `Server.connect()` regresses.
Both tests reimplement the mutex assignment inline instead of exercising `connect()`, and `TestConcurrentConnectClientAccess` only proves that 50 goroutines returned. An unlocked write in production would still leave this file green. Please drive the real publish path or extract it behind a small helper shared by prod and tests.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/server/server_connect_test.go` around lines 28 - 88, Both tests bypass Server.connect() and directly mutate s.connectClient, so regressions in connect() won't be caught; update tests to exercise the real publish path by calling s.connect(ctx) (or extract a small exported helper like Server.setConnectClientLocked(client) used by both connect() and tests) instead of assigning s.connectClient directly; change TestConnectSetsClientWithMutex and TestConcurrentConnectClientAccess to invoke connect() (or the new helper) with newDummyConnectClient/newTestServer so the mutex behavior in connect() is actually exercised and concurrent readers observe the real write path.
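The suggested extraction can be sketched like this. The types and the helper names (`setConnectClient`, `getConnectClient`) are hypothetical stand-ins for the real `Server` and `internal.ConnectClient`; the point is that tests and `connect()` share one locked publish path, so an unlocked write cannot slip through unexercised.

```go
package main

import (
	"fmt"
	"sync"
)

// Minimal stand-ins; the real types live in client/server and client/internal.
type ConnectClient struct{ id int }

type Server struct {
	mutex         sync.Mutex
	connectClient *ConnectClient
}

// setConnectClient is the extracted publish step shared by connect() and the
// tests, so a test exercises the same locked write path as production.
func (s *Server) setConnectClient(c *ConnectClient) {
	s.mutex.Lock()
	defer s.mutex.Unlock()
	s.connectClient = c
}

// getConnectClient is the matching locked read used by concurrent readers.
func (s *Server) getConnectClient() *ConnectClient {
	s.mutex.Lock()
	defer s.mutex.Unlock()
	return s.connectClient
}

func main() {
	s := &Server{}
	var wg sync.WaitGroup
	for i := 0; i < 50; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			s.setConnectClient(&ConnectClient{id: i})
			_ = s.getConnectClient() // race-free read under -race
		}(i)
	}
	wg.Wait()
	fmt.Println("final client non-nil:", s.getConnectClient() != nil)
}
```

Run under `go test -race` (or `go run -race`), an unlocked write in the helper would be flagged, which is exactly what the inline reimplementation in the current tests cannot catch.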
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@client/server/server_connect_test.go`:
- Around line 131-179: The test TestDownThenUp_StaleRunningChan codifies the
stale-channel bug by expecting waitForUp() to succeed while clientRunningChan is
already closed and connectClient is nil; either mark this reproducer as
skipped/TODO (use t.Skip or add a TODO comment) or change the assertions to
verify the desired reconnect behavior: after calling cleanupConnection() and
launching waitForUp(), assert that waitForUp() only returns success when
s.connectClient is non-nil (or returns an error if appropriate), and remove the
current assertion that treats success with nil connectClient as correct; update
references to waitForUp(), clientRunningChan, connectClient,
cleanupConnection(), and TestDownThenUp_StaleRunningChan accordingly.
In `@client/server/state.go`:
- Line 42: The guard reads s.connectClient unsafely and calls Status() multiple
times; fix by taking a local snapshot of s.connectClient under s.mutex (the same
mutex used in connect()), then release the mutex and call client.Status() only
once to decide the guard. Apply this change in the checks inside CleanState and
DeleteState (where the current if uses s.connectClient and Status()), ensuring
you reference the same s.mutex and s.connectClient field and replace the
duplicated Status() calls with a single status check on the snapped local
variable.
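The snapshot-under-mutex guard this prompt describes can be sketched as follows. The types and the guard name `canCleanState` are illustrative; the real check lives in `CleanState`/`DeleteState` in `client/server/state.go`. The client pointer is copied once under the same mutex `connect()` uses, the lock is released, and `Status()` is called exactly once on the local copy, so a nil client can no longer panic mid-check.

```go
package main

import (
	"fmt"
	"sync"
)

// Stand-ins for the server types used in the guard.
type connectClient struct{ state string }

func (c *connectClient) Status() string { return c.state }

type Server struct {
	mutex         sync.Mutex
	connectClient *connectClient
}

// canCleanState snapshots connectClient under the mutex, releases it, then
// decides the guard with a single Status() call on the local copy.
func (s *Server) canCleanState() bool {
	s.mutex.Lock()
	client := s.connectClient
	s.mutex.Unlock()
	if client == nil {
		return true // nothing running: cleanup is safe, no panic
	}
	return client.Status() != "Connected"
}

func main() {
	s := &Server{} // connectClient is nil
	fmt.Println("clean with nil client:", s.canCleanState())
	s.connectClient = &connectClient{state: "Connected"}
	fmt.Println("clean while connected:", s.canCleanState())
}
```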
📒 Files selected for processing (6)
- client/server/server.go
- client/server/server_connect_test.go
- client/server/state.go
- client/ui/client_ui.go
- client/ui/event_handler.go
- client/ui/network.go
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
client/ui/network.go (1)
391-415: ⚠️ Potential issue | 🟠 Major: Refresh failures still leave stale exit-node actions in the tray.
Both early `return false` paths exit before clearing or disabling the menu. With the new cancel-only disconnect flow, a failed refresh can leave old exit-node items visible and clickable even though the daemon state is gone or changed.
🧹 Proposed fix

```diff
+func (s *serviceClient) clearExitNodeMenu() {
+	s.exitNodeMu.Lock()
+	defer s.exitNodeMu.Unlock()
+	s.recreateExitNodeMenu(nil)
+	s.mExitNode.Disable()
+}
+
 func (s *serviceClient) updateExitNodes() bool {
 	conn, err := s.getSrvClient(defaultFailTimeout)
 	if err != nil {
 		log.Errorf("get client: %v", err)
+		s.clearExitNodeMenu()
 		return false
 	}
 	exitNodes, err := s.getExitNodes(conn)
 	if err != nil {
 		log.Errorf("get exit nodes: %v", err)
+		s.clearExitNodeMenu()
 		return false
 	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/ui/network.go` around lines 391 - 415, The early returns in updateExitNodes (when getSrvClient or getExitNodes fail) skip clearing or disabling the exit-node menu, leaving stale clickable items; acquire s.exitNodeMu before those error checks (or ensure cleanup runs on error) and call s.recreateExitNodeMenu(nil) then disable s.mExitNode and clear s.mExitNodeItems before returning false so the tray is always cleared when updateExitNodes fails; keep references to getSrvClient, getExitNodes, s.exitNodeMu, s.recreateExitNodeMenu, s.mExitNode and s.mExitNodeItems to locate and update the function.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@client/ui/network.go`:
- Around line 336-345: The Windows poller swap is not atomic because
cancelExitNodeRetry() unlocks exitNodeMu before startExitNodeRefresh stores the
new cancel func, allowing a disconnect to observe a nil cancel and let the new
goroutine survive; fix by performing the swap under exitNodeMu: change
startExitNodeRefresh to create the ctx/cancel, then lock exitNodeMu and
atomically replace exitNodeRetryCancel (calling the old cancel if non-nil)
before unlocking, and only after that launch the goroutine with the new ctx when
calling pollExitNodes; ensure cancelExitNodeRetry still uses exitNodeMu
consistently so there is no gap between clearing and setting
exitNodeRetryCancel.


