Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
60 changes: 48 additions & 12 deletions client/internal/dns/host_darwin.go
Original file line number Diff line number Diff line change
Expand Up @@ -23,8 +23,8 @@ import (
)

const (
netbirdDNSStateKeyFormat = "State:/Network/Service/NetBird-%s/DNS"
netbirdDNSStateKeyIndexedFormat = "State:/Network/Service/NetBird-%s-%d/DNS"
netbirdDNSStateKeyFormat = "State:/Network/Service/NetBird-%s-%s/DNS"
netbirdDNSStateKeyIndexedFormat = "State:/Network/Service/NetBird-%s-%s-%d/DNS"
globalIPv4State = "State:/Network/Global/IPv4"
primaryServiceStateKeyFormat = "State:/Network/Service/%s/DNS"
keySupplementalMatchDomains = "SupplementalMatchDomains"
Expand All @@ -51,14 +51,19 @@ const (
type systemConfigurator struct {
createdKeys map[string]struct{}
systemDNSSettings SystemDNSSettings
interfaceName string

mu sync.RWMutex
origNameservers []netip.Addr
}

func newHostManager() (*systemConfigurator, error) {
func newHostManager(interfaceName string) (*systemConfigurator, error) {
if interfaceName == "" {
return nil, fmt.Errorf("interfaceName must not be empty")
}
return &systemConfigurator{
createdKeys: make(map[string]struct{}),
createdKeys: make(map[string]struct{}),
interfaceName: interfaceName,
}, nil
}

Expand All @@ -67,6 +72,12 @@ func (s *systemConfigurator) supportCustomPort() bool {
}

func (s *systemConfigurator) applyDNSConfig(config HostDNSConfig, stateManager *statemanager.Manager) error {
if err := stateManager.UpdateState(&ShutdownState{
InterfaceName: s.interfaceName,
}); err != nil {
log.Errorf("failed to update shutdown state: %s", err)
}
Comment on lines +75 to +79
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot Mar 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Don't continue past a failed interface checkpoint.

This is the only state write that happens before we create any scoped NetBird-<iface>-* keys. If it fails and we keep going, an unclean restart can end up with no persisted InterfaceName, and Cleanup() then falls back to legacy non-scoped discovery only, leaving the new scoped keys behind.

🩹 Proposed fix
 	if err := stateManager.UpdateState(&ShutdownState{
 		InterfaceName: s.interfaceName,
 	}); err != nil {
-		log.Errorf("failed to update shutdown state: %s", err)
+		return fmt.Errorf("update dns shutdown state: %w", err)
 	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if err := stateManager.UpdateState(&ShutdownState{
InterfaceName: s.interfaceName,
}); err != nil {
log.Errorf("failed to update shutdown state: %s", err)
}
if err := stateManager.UpdateState(&ShutdownState{
InterfaceName: s.interfaceName,
}); err != nil {
return fmt.Errorf("update dns shutdown state: %w", err)
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@client/internal/dns/host_darwin.go` around lines 75 - 79, The call to
stateManager.UpdateState(&ShutdownState{InterfaceName: s.interfaceName}) must
not be ignored on failure; if UpdateState returns an error you must stop further
execution so the InterfaceName is persisted before creating scoped keys. Modify
the surrounding function (the one invoking UpdateState in host_darwin.go) to
propagate or handle the error immediately (e.g., return the error or abort
initialization) instead of logging and continuing; reference
stateManager.UpdateState, ShutdownState, s.interfaceName and ensure Cleanup will
see a persisted InterfaceName on restart.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a false positive — no code change needed. Here's why:

  1. UpdateState can only fail if the state was never registered — the sole error path in setState() is errStateNotRegistered, which is a programming bug, not a runtime condition. If it fails here, every subsequent UpdateState call in this function (lines 95, 117, 127 via s.updateState(stateManager)) will also fail, so returning early from this one call doesn't improve the situation.

  2. Codebase convention is log-and-continue — 9 of 10 UpdateState call sites across the codebase (routemanager, engine_ssh, dns/systemd_linux, dns/network_manager_unix, dns/resolvconf_unix, dns/host_windows, updatemanager, etc.) all use log.Errorf/log.Warnf and continue. Changing just this one call to return an error would be inconsistent.

  3. The updateState() helper at line 136 also logs-and-continues — even if we returned early here, the same non-fatal pattern is used for all subsequent state updates in this function. Making only the first call fatal while the rest are non-fatal is incoherent.

  4. Returning an error here would break DNS entirelyapplyDNSConfig is the function that configures the client's DNS. Returning early because state persistence failed means DNS won't be configured at all, which is strictly worse than the theoretical risk of orphaned keys on an unclean shutdown.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@siriobalmelli, you're right — I accept all four points. The errStateNotRegistered path is a programming invariant (not a runtime condition), and blocking DNS configuration because of a state-persistence failure would be strictly worse than the theoretical risk of orphaned keys. The log-and-continue pattern is consistent with the rest of the codebase. I'll retract this comment.


✏️ Learnings added
Learnt from: siriobalmelli
Repo: netbirdio/netbird PR: 5504
File: client/internal/dns/host_darwin.go:75-79
Timestamp: 2026-03-13T09:37:32.529Z
Learning: In netbirdio/netbird, `stateManager.UpdateState()` failures are handled with log.Errorf/log.Warnf and execution continues (never return an error). The sole error path is `errStateNotRegistered`, which is a programming bug not a runtime condition. All call sites across routemanager, engine_ssh, dns/systemd_linux, dns/network_manager_unix, dns/resolvconf_unix, dns/host_windows, and updatemanager follow this log-and-continue convention. Do not flag log-and-continue on UpdateState as a bug in this codebase.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


🧠 Learnings used
Learnt from: lixmal
Repo: netbirdio/netbird PR: 5202
File: client/internal/portforward/manager.go:219-222
Timestamp: 2026-01-30T14:43:13.106Z
Learning: In client/internal/portforward/manager.go, the state persistence pattern is designed for retry-on-failure: if DeletePortMapping fails in Stop(), the state is intentionally left intact (via early return) so that cleanupResidual() can attempt deletion again on the next startup. This is the intended behavior, not a bug.

Learnt from: pappz
Repo: netbirdio/netbird PR: 5448
File: client/server/server.go:142-146
Timestamp: 2026-02-25T17:44:13.439Z
Learning: In the netbirdio/netbird client/server/server.go file, the update manager is intentionally designed to operate only with GUI flows, not in headless/daemon mode. When the update manager is initialized in the Start() method, only CheckUpdateSuccess is called; the manager's Start() method is deliberately not invoked in the daemon lifecycle and is instead called only for GUI flows via startUpdateManagerForGUI().
</learning]

Learnt from: pappz
Repo: netbirdio/netbird PR: 5512
File: client/internal/metrics/push.go:167-181
Timestamp: 2026-03-13T09:34:29.533Z
Learning: In netbirdio/netbird PR `#5512`, the `NB_METRICS_INTERVAL` env var (`p.envInterval`) in `client/internal/metrics/push.go` does NOT bypass remote config version gating (`isVersionInRange`) or the `RefreshIfNeeded` nil check. The earlier documentation describing this override behavior was deprecated. Do not flag the `resolve()` function for failing to skip version/config checks when `envInterval > 0` — the current behavior is intentional.

Learnt from: bcmmbaga
Repo: netbirdio/netbird PR: 5441
File: management/server/user.go:745-749
Timestamp: 2026-02-24T19:32:13.189Z
Learning: In Go codebases like netbirdio/netbird, methods returning (T, error) should follow the convention: if an error is returned, propagate it and return early. When the caller handles the error immediately, explicit nil checks on the returned value are unnecessary; rely on the error to guard control flow and only inspect the value when err is nil. This guidance applies broadly to Store methods such as GetUserByUserID in Go code; prefer early return on error and avoid redundant nil checks after a successful error path.

Learnt from: lixmal
Repo: netbirdio/netbird PR: 5530
File: proxy/internal/udp/relay.go:276-278
Timestamp: 2026-03-07T13:59:16.506Z
Learning: For Go 1.25+ projects, use sync.WaitGroup.Go() to run goroutines because it handles Add(1) and defer Done() automatically. Do not flag wg.Go(...) as a compilation error in codebases that require Go 1.25 or newer. If supporting older Go versions, gate such usage with build tags or avoid wg.Go() to maintain compatibility; verify the module targets Go 1.25+ before adopting this pattern.


var (
searchDomains []string
matchDomains []string
Expand Down Expand Up @@ -123,7 +134,10 @@ func (s *systemConfigurator) applyDNSConfig(config HostDNSConfig, stateManager *
}

func (s *systemConfigurator) updateState(stateManager *statemanager.Manager) {
if err := stateManager.UpdateState(&ShutdownState{CreatedKeys: maps.Keys(s.createdKeys)}); err != nil {
if err := stateManager.UpdateState(&ShutdownState{
InterfaceName: s.interfaceName,
CreatedKeys: maps.Keys(s.createdKeys),
}); err != nil {
log.Errorf("failed to update shutdown state: %s", err)
}
}
Expand Down Expand Up @@ -167,6 +181,7 @@ func (s *systemConfigurator) getRemovableKeysWithDefaults() []string {

// discoverExistingKeys probes scutil for all NetBird DNS keys that may exist.
// This handles the case where createdKeys is empty (e.g., state file lost after unclean shutdown).
// It also discovers legacy-format keys from older versions for upgrade migration.
func (s *systemConfigurator) discoverExistingKeys() []string {
dnsKeys, err := getSystemDNSKeys()
if err != nil {
Expand All @@ -176,23 +191,44 @@ func (s *systemConfigurator) discoverExistingKeys() []string {

var keys []string

// Current format: interface-scoped named keys
for _, suffix := range []string{searchSuffix, matchSuffix, localSuffix} {
key := getKeyWithInput(netbirdDNSStateKeyFormat, suffix)
key := getKeyWithInput(netbirdDNSStateKeyFormat, s.interfaceName, suffix)
if strings.Contains(dnsKeys, key) {
keys = append(keys, key)
}
}

// Current format: interface-scoped indexed keys
for _, suffix := range []string{searchSuffix, matchSuffix} {
for i := 0; ; i++ {
key := fmt.Sprintf(netbirdDNSStateKeyIndexedFormat, suffix, i)
key := fmt.Sprintf(netbirdDNSStateKeyIndexedFormat, s.interfaceName, suffix, i)
if !strings.Contains(dnsKeys, key) {
break
}
keys = append(keys, key)
}
}

// Legacy format: non-interface-scoped keys from older versions
for _, suffix := range []string{searchSuffix, matchSuffix, localSuffix} {
legacyKey := fmt.Sprintf("State:/Network/Service/NetBird-%s/DNS", suffix)
if strings.Contains(dnsKeys, legacyKey) {
log.Infof("discovered legacy DNS key (no interface scope): %s", legacyKey)
keys = append(keys, legacyKey)
}
}
for _, suffix := range []string{searchSuffix, matchSuffix} {
for i := 0; ; i++ {
legacyKey := fmt.Sprintf("State:/Network/Service/NetBird-%s-%d/DNS", suffix, i)
if !strings.Contains(dnsKeys, legacyKey) {
break
}
log.Infof("discovered legacy indexed DNS key (no interface scope): %s", legacyKey)
keys = append(keys, legacyKey)
}
}

return keys
}

Expand Down Expand Up @@ -224,7 +260,7 @@ func (s *systemConfigurator) addLocalDNS() error {
return fmt.Errorf("recordSystemDNSSettings(): %w", err)
}
}
localKey := getKeyWithInput(netbirdDNSStateKeyFormat, localSuffix)
localKey := getKeyWithInput(netbirdDNSStateKeyFormat, s.interfaceName, localSuffix)
if !s.systemDNSSettings.ServerIP.IsValid() || len(s.systemDNSSettings.Domains) == 0 {
log.Info("Not enabling local DNS server")
return nil
Expand Down Expand Up @@ -258,7 +294,7 @@ func (s *systemConfigurator) getSystemDNSSettings() (SystemDNSSettings, error) {
if err != nil || primaryServiceKey == "" {
return SystemDNSSettings{}, fmt.Errorf("couldn't find the primary service key: %w", err)
}
dnsServiceKey := getKeyWithInput(primaryServiceStateKeyFormat, primaryServiceKey)
dnsServiceKey := fmt.Sprintf(primaryServiceStateKeyFormat, primaryServiceKey)
line := buildCommandLine("show", dnsServiceKey, "")
stdinCommands := wrapCommand(line)

Expand Down Expand Up @@ -385,7 +421,7 @@ func (s *systemConfigurator) addBatchedDomains(suffix string, domains []string,
batches := splitDomainsIntoBatches(domains)

for i, batch := range batches {
key := fmt.Sprintf(netbirdDNSStateKeyIndexedFormat, suffix, i)
key := fmt.Sprintf(netbirdDNSStateKeyIndexedFormat, s.interfaceName, suffix, i)
domainsStr := strings.Join(batch, " ")

if err := s.addDNSState(key, domainsStr, ip, port, enableSearch); err != nil {
Expand Down Expand Up @@ -469,8 +505,8 @@ func (s *systemConfigurator) restoreUncleanShutdownDNS() error {
return nil
}

func getKeyWithInput(format, key string) string {
return fmt.Sprintf(format, key)
func getKeyWithInput(format, iface, key string) string {
return fmt.Sprintf(format, iface, key)
}

func buildAddCommandLine(key, value string) string {
Expand Down
Loading