Phase 1: connection-mode enum + backwards-compat (issue #5989)#6047
Phase 1: connection-mode enum + backwards-compat (issue #5989)#6047MichaelUray wants to merge 14 commits intonetbirdio:mainfrom
Conversation
…nfig Additive change for issue netbirdio#5989 Phase 1. New fields use new tag numbers (11, 12, 13); existing fields (including LazyConnectionEnabled tag 6) are unchanged so old clients ignore the additions and old servers send UNSPECIFIED, which the new client maps back via the legacy boolean. Note: the regenerated pb.go files now report protoc v5.29.3 in their header (this branch was generated with locally-installed protoc 29.3 instead of upstream's v7.34.1). Functionally identical; header diff is the only delta beyond the actual schema additions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Defines Mode enum (relay-forced, p2p, p2p-lazy, p2p-dynamic plus the client-only sentinels Unspecified and FollowServer), ParseString for CLI/env input, ToProto/FromProto for wire translation, and the two backwards-compat helpers ResolveLegacyLazyBool / ToLazyConnectionEnabled that bridge the old Settings.LazyConnectionEnabled boolean. Phase 1 of issue netbirdio#5989. Pure addition -- no existing callers touched in this commit; the engine/conn_mgr migration follows in subsequent commits in the same PR. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…on warns NB_CONNECTION_MODE wins over the legacy pair (NB_FORCE_RELAY, NB_ENABLE_EXPERIMENTAL_LAZY_CONN); when the legacy pair is set together, NB_FORCE_RELAY wins (most-restrictive, mirrors the group-conflict rule from issue netbirdio#5990). Each legacy var emits a one-shot deprecation warning when it actually contributes to the resolved mode. NB_LAZY_CONN_INACTIVITY_THRESHOLD becomes an alias for the future relay_timeout setting and warns once. IsForceRelayed() is kept for callers that have not yet been migrated (conn.go, statusrecorder); they will be updated in the engine/conn refactor commits later in this PR. Phase 1 of issue netbirdio#5989. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three new CLI flags map onto the new connection-mode plumbing: - --connection-mode <relay-forced|p2p|p2p-lazy|p2p-dynamic|follow-server> - --relay-timeout <seconds> - --p2p-timeout <seconds> Plumbed through three sites in cmd/up.go (SetConfigRequest, ConfigInput, LoginRequest), persisted in profilemanager.Config, and added as new fields on the daemon.proto IPC messages. Empty / not-changed flags fall back to the server-pushed value (which itself falls back to the legacy lazy_connection_enabled boolean for old servers). Phase 1 of issue netbirdio#5989. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EngineConfig gains ConnectionMode, RelayTimeoutSeconds, P2pTimeoutSeconds. ConnMgr now stores the resolved Mode plus the raw inputs (env, config) so it can re-resolve when the server pushes a new PeerConfig. UpdatedRemoteFeatureFlag is renamed to UpdatedRemotePeerConfig and takes the full PeerConfig pointer; a thin shim with the old name delegates to it for callers that haven't been updated yet. connect.go copies the three new fields from profilemanager.Config into the EngineConfig builder, with a tolerant parser that logs and falls through to Unspecified on invalid input. Phase 1 of issue netbirdio#5989. peer/conn.go forwarding follows in C4. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ConnConfig gains a Mode field forwarded from the engine. Open() now checks Mode == ModeRelayForced instead of calling the global env-reader IsForceRelayed(). The local 'forceRelay' variable name is renamed to 'skipICE' to make the new branching intent explicit. The PeerStateUpdate block at the end of Open() also reads from conn.config.Mode now, so the StatusRecorder sees the per-peer mode rather than the global env var. A single remaining caller of IsForceRelayed() (srWatcher.Start in engine.go) is left for a follow-up; that path uses a process-wide flag not per-peer state, so it can be migrated in Phase 2 once srWatcher itself learns about ConnectionMode. Phase 1 of issue netbirdio#5989. Engine forwarding (C5) follows. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
createPeerConn now reads ConnMgr.Mode() and copies it into peer.ConnConfig, so the per-peer Open() loop in conn.go can take the ModeRelayForced skip-ICE branch without reading the global env var. This is the last wiring commit for the client side of Phase 1; the server-side mgmt changes (Settings + OpenAPI + handler + audit + NetworkMap-build) follow in Section D. Phase 1 of issue netbirdio#5989. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
All three fields are nullable to distinguish 'use built-in default' (NULL) from explicit values (incl. 0 = never tear down). Copy() now deep-clones the new pointer fields via two small helpers. GORM AutoMigrate creates the new columns at first start; existing accounts have NULL in all three columns and resolve via the legacy LazyConnectionEnabled boolean. Phase 1 of issue netbirdio#5989. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…tings Three new optional, nullable fields with descriptions of the NULL = built-in-default semantics and the Phase-1-vs-Phase-2 status of p2p-dynamic. Regenerated types.gen.go via the existing oapi-codegen tooling. The generated AccountSettingsConnectionMode enum has the canonical values relay-forced / p2p / p2p-lazy / p2p-dynamic, plus a Valid() helper for handler-side validation. Phase 1 of issue netbirdio#5989. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PUT /api/accounts/{id} now accepts connection_mode (validated against
the four-value enum via the generated AccountSettingsConnectionMode.
Valid()), p2p_timeout_seconds and relay_timeout_seconds. NULL in the
JSON body keeps the existing value untouched (= "no client-side
override on this round-trip"); explicit NULL-clear via API uses a
distinct PATCH-style call which is out-of-scope for Phase 1.
Response payload mirrors the input fields back as nullable so the
dashboard can distinguish "use default" from "explicit value".
Phase 1 of issue netbirdio#5989. Audit-event emission follows in D5.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
AccountConnectionModeChanged (121), AccountRelayTimeoutChanged (122), AccountP2pTimeoutChanged (123) -- emitted from account.go when settings change. Per-peer / per-group event codes are reserved for Phase 3 (issue netbirdio#5990). Phase 1 of issue netbirdio#5989. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
handleConnectionModeSettings is invoked from the same diff-detection block as handleLazyConnectionSettings; emits one StoreEvent per changed field (ConnectionMode, RelayTimeoutSeconds, P2pTimeoutSeconds) with old/new values in the meta payload. Four small ptr-equality / deref helpers are added for nullable string and uint32 fields. They are package-private and named after the existing convention used elsewhere in the package. Phase 1 of issue netbirdio#5989. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…elds Two changes in one commit because they're inseparable: 1. Move client/internal/peer/connectionmode/ to shared/connectionmode/. The package now needs to be importable from BOTH client/ and management/ (which is impossible while it lives under client/internal/ per Go's internal-package rule). All imports updated; tests pass on both sides. 2. Extend management/internals/shared/grpc/conversion.go::toPeerConfig to populate the three new PeerConfig fields (ConnectionMode, P2PTimeoutSeconds, RelayTimeoutSeconds) using the connectionmode helpers. The legacy LazyConnectionEnabled boolean is now derived from the resolved Mode via ToLazyConnectionEnabled() rather than copied verbatim from Settings -- this is the central backwards-compat contract: old clients see only the boolean, new clients prefer the explicit enum and ignore the bool. Resolution rules (Phase 1, account-wide only): - Settings.ConnectionMode != nil and parses -> wins - Otherwise -> ResolveLegacyLazyBool(LazyConnectionEnabled) - timeouts: Settings.RelayTimeoutSeconds / P2pTimeoutSeconds when non-NULL, else 0 (= server has no preference; client uses built-in default) Per-peer / per-group resolution comes in Phase 3 (netbirdio#5990). Phase 1 of issue netbirdio#5989. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Nine sub-cases cover the Phase-1 resolution matrix from spec section 3: - no settings -> default (P2P + lazy=false) - legacy bool only -> mapped via ResolveLegacyLazyBool - explicit ConnectionMode -> wins over the legacy bool - timeouts propagate - garbage ConnectionMode value -> tolerant fallback to legacy bool Particular attention to the structural compat gap: relay-forced cannot be expressed via the legacy boolean, so the wire field for old clients is sent as false. Documented in the spec, asserted here. Existing TestAccount_GetPeerNetworkMap remains green: existing test peers have ConnectionMode=NULL in Settings, falls through to the legacy ResolveLegacyLazyBool(false) -> ModeP2P -> wire bool false. Phase 1 of issue netbirdio#5989. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
📝 WalkthroughWalkthroughThis PR introduces comprehensive peer connection mode and timeout configuration across the NetBird client and server stack. It adds CLI flags for connection-mode and timeout parameters, defines a new Changes
Sequence DiagramsequenceDiagram
actor User
participant CLI as Client CLI
participant ConnMgr as Connection Manager
participant Engine as Engine
participant Peer as Peer Conn
participant Server as Management Server
User->>CLI: Set connection-mode flag<br/>(or env var/config file)
CLI->>ConnMgr: NewConnMgr(env, config)
ConnMgr->>ConnMgr: resolveConnectionMode()<br/>(env → config → default)
ConnMgr->>ConnMgr: Conditionally start<br/>lazy manager per mode
Engine->>Server: updateNetworkMap()
Server->>Engine: PeerConfig with<br/>ConnectionMode + timeouts
Engine->>ConnMgr: UpdatedRemotePeerConfig(ctx, pc)
ConnMgr->>ConnMgr: Re-resolve mode<br/>(env → config → server)
ConnMgr->>ConnMgr: Start/stop lazy manager<br/>only on state change
Engine->>Peer: Create connection<br/>with config.Mode
Peer->>Peer: Skip ICE/relay logic<br/>per Mode value
Peer->>Peer: isConnectedOnAllWay()<br/>computed from mode
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related issues
Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Review rate limit: 7/8 reviews remaining, refill in 7 minutes and 30 seconds.Comment |
|
There was a problem hiding this comment.
Actionable comments posted: 7
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@client/cmd/up.go`:
- Around line 437-439: The connection-mode flag value must be normalized before
persisting or sending: when the flag value (connectionMode variable read via
connectionModeFlag) equals "relay" (case-insensitive, trimmed) convert it to the
canonical "relay-forced" string and then assign that normalized value to
req.ConnectionMode and any profile write paths; leave other values unchanged and
ensure the same normalization is applied in the other occurrences that set
req.ConnectionMode or write the profile (the other blocks referenced around the
same flag usage).
In `@client/internal/peer/env_test.go`:
- Line 25: Update the test case name string in the table entry that currently
reads "connection_mode unparseable falls through to legacy" to use the correct
spelling "connection_mode unparsable falls through to legacy"; locate the tuple
in peer/env_test.go (the table entry with values "garbage", "true",
connectionmode.ModeRelayForced, etc.) and change the first element only to keep
codespell green.
In `@client/internal/peer/env.go`:
- Around line 75-77: Validate the parsed duration from time.ParseDuration(raw)
before assigning to timeoutSecs: ensure the duration is non-negative and that
d.Seconds() does not exceed the maximum uint32 value (so it won't wrap when
cast). Specifically, in the block handling envInactivityThreshold, check d >= 0
and d.Seconds() <= float64(math.MaxUint32) (or compare against
time.Duration(maxUint32Seconds)*time.Second) and only then set timeoutSecs =
uint32(d.Seconds()); otherwise log or handle the invalid value (e.g., ignore the
env value or clamp to max) while still calling
warnDeprecated(envInactivityThreshold, "the relay_timeout setting on the
management server").
In `@client/internal/profilemanager/config.go`:
- Around line 177-179: The timeout fields RelayTimeoutSeconds and
P2pTimeoutSeconds should be made nullable so the persisted profile can
distinguish "unset" from an explicit zero: change their types from uint32 to
*uint32 in the profile struct(s) (e.g., the struct containing ConnectionMode,
RelayTimeoutSeconds, P2pTimeoutSeconds) and keep json:",omitempty" so nil is
omitted while an explicit zero is preserved when set; update any code that
reads/writes these fields (decoders, defaults, and consumers) to handle nil vs
non-nil dereferencing and apply the same change for the corresponding
p2p_timeout_seconds occurrences referenced around lines 604-618.
In `@management/server/account.go`:
- Around line 372-375: UpdateAccountSettings currently doesn't mark
updateAccountPeers or bump the network serial when ConnectionMode,
RelayTimeoutSeconds, or P2pTimeoutSeconds change, so peers won't receive the new
values; inside UpdateAccountSettings (near the existing updateAccountPeers
gating above the calls to am.handleRoutingPeerDNSResolutionSettings /
handleLazyConnectionSettings / handleConnectionModeSettings /
handlePeerLoginExpirationSettings) detect differences on ConnectionMode,
RelayTimeoutSeconds, and P2pTimeoutSeconds between oldSettings and newSettings,
set updateAccountPeers = true (or equivalent), and ensure the account/network
serial is bumped so connected peers are notified of the change.
In `@management/server/http/handlers/accounts/accounts_handler.go`:
- Around line 229-236: The code casts signed API timeout fields
(req.Settings.P2pTimeoutSeconds, req.Settings.RelayTimeoutSeconds) straight to
uint32 which will silently wrap on negative or out-of-range values; add explicit
validation before casting: check each non-nil value is >= 0 and <=
math.MaxUint32 (or 4294967295), and if not return a 400 validation error (use
the same error/response helper used elsewhere in accounts_handler.go). Implement
the checks inline where returnSettings.P2pTimeoutSeconds and
returnSettings.RelayTimeoutSeconds are set (or extract to a small helper
validateTimeout(name string, v *int64) error) and only perform the uint32
conversion after the value passes validation.
In `@shared/management/http/api/types.gen.go`:
- Around line 1517-1521: The handler currently casts pointer timeout fields like
P2pTimeoutSeconds (and the other *int64 timeout fields) directly to uint32,
which allows negative values to wrap; update the account handler code that
performs the cast to first check for nil, then validate the pointed int64 is >=
0 and <= math.MaxUint32, returning a validation error (or reject the request) if
out of range; only then convert safely to uint32. Identify the conversion sites
in the accounts handler where P2pTimeoutSeconds (and the similar timeout fields)
are dereferenced/cast and apply this guard before performing the uint32 cast.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: a58c2ed6-4fbf-4f05-a219-4fc44e13ce33
⛔ Files ignored due to path filters (4)
client/proto/daemon.pb.gois excluded by!**/*.pb.goclient/proto/daemon_grpc.pb.gois excluded by!**/*.pb.goshared/management/proto/management.pb.gois excluded by!**/*.pb.goshared/management/proto/proxy_service.pb.gois excluded by!**/*.pb.go
📒 Files selected for processing (23)
client/cmd/root.goclient/cmd/up.goclient/internal/conn_mgr.goclient/internal/conn_mgr_test.goclient/internal/connect.goclient/internal/engine.goclient/internal/lazyconn/env.goclient/internal/peer/conn.goclient/internal/peer/env.goclient/internal/peer/env_test.goclient/internal/profilemanager/config.goclient/proto/daemon.protomanagement/internals/shared/grpc/conversion.gomanagement/internals/shared/grpc/conversion_test.gomanagement/server/account.gomanagement/server/activity/codes.gomanagement/server/http/handlers/accounts/accounts_handler.gomanagement/server/types/settings.goshared/connectionmode/mode.goshared/connectionmode/mode_test.goshared/management/http/api/openapi.ymlshared/management/http/api/types.gen.goshared/management/proto/management.proto
| if cmd.Flag(connectionModeFlag).Changed { | ||
| req.ConnectionMode = &connectionMode | ||
| } |
There was a problem hiding this comment.
Normalize the advertised relay value before persisting or sending it.
client/cmd/root.go:195-210 exposes --connection-mode as (p2p, relay), but shared/connectionmode/mode.go:47-71 only accepts relay-forced. These three paths currently forward the raw flag value unchanged, so netbird up --connection-mode relay follows the help text yet gets rejected later or written into the profile as an invalid value in foreground mode.
Also applies to: 564-566, 688-690
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@client/cmd/up.go` around lines 437 - 439, The connection-mode flag value must
be normalized before persisting or sending: when the flag value (connectionMode
variable read via connectionModeFlag) equals "relay" (case-insensitive, trimmed)
convert it to the canonical "relay-forced" string and then assign that
normalized value to req.ConnectionMode and any profile write paths; leave other
values unchanged and ensure the same normalization is applied in the other
occurrences that set req.ConnectionMode or write the profile (the other blocks
referenced around the same flag usage).
| {"lazy alone", "", "", "true", "", connectionmode.ModeP2PLazy, 0}, | ||
| {"force_relay AND lazy: force_relay wins", "", "true", "true", "", connectionmode.ModeRelayForced, 0}, | ||
| {"only inactivity threshold", "", "", "", "30m", connectionmode.ModeUnspecified, 1800}, | ||
| {"connection_mode unparseable falls through to legacy", "garbage", "true", "", "", connectionmode.ModeRelayForced, 0}, |
There was a problem hiding this comment.
Fix typo in test case name to keep codespell green.
"unparseable" should be "unparsable".
✏️ Suggested patch
- {"connection_mode unparseable falls through to legacy", "garbage", "true", "", "", connectionmode.ModeRelayForced, 0},
+ {"connection_mode unparsable falls through to legacy", "garbage", "true", "", "", connectionmode.ModeRelayForced, 0},📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| {"connection_mode unparseable falls through to legacy", "garbage", "true", "", "", connectionmode.ModeRelayForced, 0}, | |
| {"connection_mode unparsable falls through to legacy", "garbage", "true", "", "", connectionmode.ModeRelayForced, 0}, |
🧰 Tools
🪛 GitHub Check: codespell
[failure] 25-25:
unparseable ==> unparsable
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@client/internal/peer/env_test.go` at line 25, Update the test case name
string in the table entry that currently reads "connection_mode unparseable
falls through to legacy" to use the correct spelling "connection_mode unparsable
falls through to legacy"; locate the tuple in peer/env_test.go (the table entry
with values "garbage", "true", connectionmode.ModeRelayForced, etc.) and change
the first element only to keep codespell green.
| if d, err := time.ParseDuration(raw); err == nil { | ||
| timeoutSecs = uint32(d.Seconds()) | ||
| warnDeprecated(envInactivityThreshold, "the relay_timeout setting on the management server") |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
In Go, what are the conversion rules and edge-case behaviors when converting negative or out-of-range float/duration-derived values to uint32?
💡 Result:
In Go, converting float64 or time.Duration-derived values to uint32 follows the language specification for non-constant numeric conversions. For float64 to uint32: - The conversion always succeeds without panic. - The fractional part is truncated toward zero. - If the float64 value is negative or exceeds the uint32 range [0, 4294967295], the result is implementation-dependent (platform/architecture-specific). - On amd64/x86_64: negative float64 converts to a large uint32 value (e.g., uint32(-2.99) ≈ 18446744073709551614 for uint64, truncated for uint32). - On arm64: negative float64 converts to 0. - This behavior is documented in the Go spec and confirmed in compiler issues. For time.Duration to uint32 (Duration is int64 nanoseconds): - First converts int64 to uint32 per integer conversion rules: sign-extended to infinite precision, then truncated to 32 bits. - Positive Duration in uint32 range: exact value (ns % 2^32). - Large positive Duration: wraps around (int64 % 2^32 as uint32). - Negative Duration: sign-extended, so large negative becomes large uint32 (all 1s for truncation, e.g., uint32(int64(-1)) == 4294967295). - No panic; always succeeds. Edge cases with extreme values (near int64 limits) may wrap due to arithmetic. Avoid direct conversions for negative/out-of-range values; check range first or use safe alternatives like conditional clamping. Examples (platform-dependent for float): f := float64(-1.5); u := uint32(f) // amd64: large value near 4294967295; arm64: 0 d := time.Duration(-1000); u := uint32(d) // 4294967295 (uint32(-1000))
Citations:
- 1: https://github.com/golang/go/blob/99d4927f/doc/go_spec.html
- 2: spec: Float to int conversions underspecified golang/go#19405
- 3: https://boldlygo.tech/archive/2024-03-06-conversions-between-numeric-types/
- 4: https://go.dev/ref/spec
- 5: https://go.dev/issue/52943
- 6: https://go.dev/issue/43047
- 7: DIfferent behavior in Go 1.19/TinyGo when casting float with negative value to uint tinygo-org/tinygo#3390
- 8: https://pkg.go.dev/time@latest
- 9: Time.Duration overflowing while it accepts 64 bit integers golang/go#32501
- 10: Subtracting a small positive duration from a large negative duration results in a large positive duration golang/go#49740
🏁 Script executed:
fd -t f "env.go" client/internal/peer/Repository: netbirdio/netbird
Length of output: 155
🏁 Script executed:
# Find and read the env.go file with context
fd -t f "env.go" client/internal/peer/ --exec cat -n {} \;Repository: netbirdio/netbird
Length of output: 8815
🏁 Script executed:
# Search for uses of timeoutSecs to understand downstream validation
rg "timeoutSecs|envInactivityThreshold" client/internal/peer/ -B 2 -A 5Repository: netbirdio/netbird
Length of output: 1554
🏁 Script executed:
# Check if there's any validation of this timeout value elsewhere
rg "inactivity|InactivityThreshold" client/ -t go --max-count=20Repository: netbirdio/netbird
Length of output: 6003
🏁 Script executed:
# Find callers of ResolveModeFromEnv()
rg "ResolveModeFromEnv" client/ --type go -B 2 -A 5Repository: netbirdio/netbird
Length of output: 4525
🏁 Script executed:
# Check where the timeout value is ultimately used
rg "conn_mgr\.go|inactivityThresholdEnv" client/ --type go -B 3 -A 8Repository: netbirdio/netbird
Length of output: 2144
🏁 Script executed:
# Check the test expectations for parsing
cat -n client/internal/peer/env_test.go | head -80Repository: netbirdio/netbird
Length of output: 2672
🏁 Script executed:
# Check if there's any usage of the timeout that expects valid uint32 bounds
rg "relayTimeoutSecs|RelayTimeout" client/ --type go -B 2 -A 3Repository: netbirdio/netbird
Length of output: 12907
Guard duration bounds before converting to uint32.
The code accepts negative durations via time.ParseDuration() without validation. Converting a negative time.Duration to uint32 produces a wrapped integer with platform-dependent behavior. Additionally, oversized durations exceeding uint32 range will wrap around, producing invalid timeout values.
🛠️ Suggested patch
if raw := os.Getenv(envInactivityThreshold); raw != "" {
if d, err := time.ParseDuration(raw); err == nil {
- timeoutSecs = uint32(d.Seconds())
- warnDeprecated(envInactivityThreshold, "the relay_timeout setting on the management server")
+ secs := d / time.Second
+ const maxTimeoutSecs = int64(^uint32(0))
+ switch {
+ case secs < 0:
+ log.Warnf("ignoring %s=%q: duration must be non-negative", envInactivityThreshold, raw)
+ case secs > maxTimeoutSecs:
+ log.Warnf("ignoring %s=%q: duration exceeds max supported value", envInactivityThreshold, raw)
+ default:
+ timeoutSecs = uint32(secs)
+ warnDeprecated(envInactivityThreshold, "the relay_timeout setting on the management server")
+ }
} else {
log.Warnf("ignoring %s=%q: %v", envInactivityThreshold, raw, err)
}
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if d, err := time.ParseDuration(raw); err == nil { | |
| timeoutSecs = uint32(d.Seconds()) | |
| warnDeprecated(envInactivityThreshold, "the relay_timeout setting on the management server") | |
| if raw := os.Getenv(envInactivityThreshold); raw != "" { | |
| if d, err := time.ParseDuration(raw); err == nil { | |
| secs := d / time.Second | |
| const maxTimeoutSecs = int64(^uint32(0)) | |
| switch { | |
| case secs < 0: | |
| log.Warnf("ignoring %s=%q: duration must be non-negative", envInactivityThreshold, raw) | |
| case secs > maxTimeoutSecs: | |
| log.Warnf("ignoring %s=%q: duration exceeds max supported value", envInactivityThreshold, raw) | |
| default: | |
| timeoutSecs = uint32(secs) | |
| warnDeprecated(envInactivityThreshold, "the relay_timeout setting on the management server") | |
| } | |
| } else { | |
| log.Warnf("ignoring %s=%q: %v", envInactivityThreshold, raw, err) | |
| } | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@client/internal/peer/env.go` around lines 75 - 77, Validate the parsed
duration from time.ParseDuration(raw) before assigning to timeoutSecs: ensure
the duration is non-negative and that d.Seconds() does not exceed the maximum
uint32 value (so it won't wrap when cast). Specifically, in the block handling
envInactivityThreshold, check d >= 0 and d.Seconds() <= float64(math.MaxUint32)
(or compare against time.Duration(maxUint32Seconds)*time.Second) and only then
set timeoutSecs = uint32(d.Seconds()); otherwise log or handle the invalid value
(e.g., ignore the env value or clamp to max) while still calling
warnDeprecated(envInactivityThreshold, "the relay_timeout setting on the
management server").
| ConnectionMode string `json:",omitempty"` | ||
| RelayTimeoutSeconds uint32 `json:",omitempty"` | ||
| P2pTimeoutSeconds uint32 `json:",omitempty"` |
There was a problem hiding this comment.
Keep timeout overrides nullable in the persisted profile.
RelayTimeoutSeconds and P2pTimeoutSeconds are stored as plain uint32 values, so once the profile is written you lose the distinction between “unset” and “explicit 0”. That breaks the documented relay semantics (0 = never tear down) across restarts, because omitempty drops the zero and the next load treats it like “no override”. p2p_timeout_seconds has the same bug waiting for Phase 2.
Also applies to: 604-618
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@client/internal/profilemanager/config.go` around lines 177 - 179, The timeout
fields RelayTimeoutSeconds and P2pTimeoutSeconds should be made nullable so the
persisted profile can distinguish "unset" from an explicit zero: change their
types from uint32 to *uint32 in the profile struct(s) (e.g., the struct
containing ConnectionMode, RelayTimeoutSeconds, P2pTimeoutSeconds) and keep
json:",omitempty" so nil is omitted while an explicit zero is preserved when
set; update any code that reads/writes these fields (decoders, defaults, and
consumers) to handle nil vs non-nil dereferencing and apply the same change for
the corresponding p2p_timeout_seconds occurrences referenced around lines
604-618.
| am.handleRoutingPeerDNSResolutionSettings(ctx, oldSettings, newSettings, userID, accountID) | ||
| am.handleLazyConnectionSettings(ctx, oldSettings, newSettings, userID, accountID) | ||
| am.handleConnectionModeSettings(ctx, oldSettings, newSettings, userID, accountID) | ||
| am.handlePeerLoginExpirationSettings(ctx, oldSettings, newSettings, userID, accountID) |
There was a problem hiding this comment.
Trigger peer updates when these new settings change.
These fields are now sent to clients via management/internals/shared/grpc/conversion.go, but UpdateAccountSettings still doesn't set updateAccountPeers / bump the network serial when ConnectionMode, RelayTimeoutSeconds, or P2pTimeoutSeconds change. The new audit events added here will fire, but connected peers won't receive the new mode/timeouts until some unrelated settings change or they reconnect.
Suggested fix (apply near the existing `updateAccountPeers` gating above)
if oldSettings.RoutingPeerDNSResolutionEnabled != newSettings.RoutingPeerDNSResolutionEnabled ||
oldSettings.LazyConnectionEnabled != newSettings.LazyConnectionEnabled ||
+ !equalStringPtr(oldSettings.ConnectionMode, newSettings.ConnectionMode) ||
+ !equalUint32Ptr(oldSettings.RelayTimeoutSeconds, newSettings.RelayTimeoutSeconds) ||
+ !equalUint32Ptr(oldSettings.P2pTimeoutSeconds, newSettings.P2pTimeoutSeconds) ||
oldSettings.DNSDomain != newSettings.DNSDomain ||
oldSettings.AutoUpdateVersion != newSettings.AutoUpdateVersion ||
oldSettings.AutoUpdateAlways != newSettings.AutoUpdateAlways {
updateAccountPeers = true
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@management/server/account.go` around lines 372 - 375, UpdateAccountSettings
currently doesn't mark updateAccountPeers or bump the network serial when
ConnectionMode, RelayTimeoutSeconds, or P2pTimeoutSeconds change, so peers won't
receive the new values; inside UpdateAccountSettings (near the existing
updateAccountPeers gating above the calls to
am.handleRoutingPeerDNSResolutionSettings / handleLazyConnectionSettings /
handleConnectionModeSettings / handlePeerLoginExpirationSettings) detect
differences on ConnectionMode, RelayTimeoutSeconds, and P2pTimeoutSeconds
between oldSettings and newSettings, set updateAccountPeers = true (or
equivalent), and ensure the account/network serial is bumped so connected peers
are notified of the change.
| if req.Settings.P2pTimeoutSeconds != nil { | ||
| v := uint32(*req.Settings.P2pTimeoutSeconds) | ||
| returnSettings.P2pTimeoutSeconds = &v | ||
| } | ||
| if req.Settings.RelayTimeoutSeconds != nil { | ||
| v := uint32(*req.Settings.RelayTimeoutSeconds) | ||
| returnSettings.RelayTimeoutSeconds = &v | ||
| } |
There was a problem hiding this comment.
Validate timeout ranges before casting to uint32.
p2p_timeout_seconds and relay_timeout_seconds come from signed API fields here. Negative values or anything above uint32 max will silently wrap on the cast (-1 becomes 4294967295), so the handler can persist a completely different timeout instead of returning a 400.
Suggested fix
if req.Settings.P2pTimeoutSeconds != nil {
+ if *req.Settings.P2pTimeoutSeconds < 0 || *req.Settings.P2pTimeoutSeconds > int64(^uint32(0)) {
+ return nil, status.Errorf(status.InvalidArgument, "p2p_timeout_seconds must be between 0 and %d", ^uint32(0))
+ }
v := uint32(*req.Settings.P2pTimeoutSeconds)
returnSettings.P2pTimeoutSeconds = &v
}
if req.Settings.RelayTimeoutSeconds != nil {
+ if *req.Settings.RelayTimeoutSeconds < 0 || *req.Settings.RelayTimeoutSeconds > int64(^uint32(0)) {
+ return nil, status.Errorf(status.InvalidArgument, "relay_timeout_seconds must be between 0 and %d", ^uint32(0))
+ }
v := uint32(*req.Settings.RelayTimeoutSeconds)
returnSettings.RelayTimeoutSeconds = &v
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if req.Settings.P2pTimeoutSeconds != nil { | |
| v := uint32(*req.Settings.P2pTimeoutSeconds) | |
| returnSettings.P2pTimeoutSeconds = &v | |
| } | |
| if req.Settings.RelayTimeoutSeconds != nil { | |
| v := uint32(*req.Settings.RelayTimeoutSeconds) | |
| returnSettings.RelayTimeoutSeconds = &v | |
| } | |
| if req.Settings.P2pTimeoutSeconds != nil { | |
| if *req.Settings.P2pTimeoutSeconds < 0 { | |
| return nil, status.Errorf(codes.InvalidArgument, "p2p_timeout_seconds must be non-negative") | |
| } | |
| v := uint32(*req.Settings.P2pTimeoutSeconds) | |
| returnSettings.P2pTimeoutSeconds = &v | |
| } | |
| if req.Settings.RelayTimeoutSeconds != nil { | |
| if *req.Settings.RelayTimeoutSeconds < 0 { | |
| return nil, status.Errorf(codes.InvalidArgument, "relay_timeout_seconds must be non-negative") | |
| } | |
| v := uint32(*req.Settings.RelayTimeoutSeconds) | |
| returnSettings.RelayTimeoutSeconds = &v | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@management/server/http/handlers/accounts/accounts_handler.go` around lines
229 - 236, The code casts signed API timeout fields
(req.Settings.P2pTimeoutSeconds, req.Settings.RelayTimeoutSeconds) straight to
uint32 which will silently wrap on negative or out-of-range values; add explicit
validation before casting: check each non-nil value is >= 0 and <=
math.MaxUint32 (or 4294967295), and if not return a 400 validation error (use
the same error/response helper used elsewhere in accounts_handler.go). Implement
the checks inline where returnSettings.P2pTimeoutSeconds and
returnSettings.RelayTimeoutSeconds are set (or extract to a small helper
validateTimeout(name string, v *int64) error) and only perform the uint32
conversion after the value passes validation.
| // P2pTimeoutSeconds Default ICE-worker idle timeout in seconds. 0 = never tear down. | ||
| // Effective only in p2p-dynamic mode (added in Phase 2). | ||
| // NULL means "use built-in default" (180 minutes). | ||
| P2pTimeoutSeconds *int64 `json:"p2p_timeout_seconds,omitempty"` | ||
|
|
There was a problem hiding this comment.
Validate non-negative timeout values before cast.
Line 1517 and Line 1543 describe non-negative timeout semantics, but these fields are *int64; the account handler currently casts to uint32 directly. Negative inputs can wrap into very large values and silently corrupt timeout behavior.
Suggested fix (in management/server/http/handlers/accounts/accounts_handler.go)
if req.Settings.P2pTimeoutSeconds != nil {
- v := uint32(*req.Settings.P2pTimeoutSeconds)
+ if *req.Settings.P2pTimeoutSeconds < 0 {
+ return nil, fmt.Errorf("invalid p2p_timeout_seconds %d: must be >= 0", *req.Settings.P2pTimeoutSeconds)
+ }
+ v := uint32(*req.Settings.P2pTimeoutSeconds)
returnSettings.P2pTimeoutSeconds = &v
}
if req.Settings.RelayTimeoutSeconds != nil {
- v := uint32(*req.Settings.RelayTimeoutSeconds)
+ if *req.Settings.RelayTimeoutSeconds < 0 {
+ return nil, fmt.Errorf("invalid relay_timeout_seconds %d: must be >= 0", *req.Settings.RelayTimeoutSeconds)
+ }
+ v := uint32(*req.Settings.RelayTimeoutSeconds)
returnSettings.RelayTimeoutSeconds = &v
}Also applies to: 1543-1547
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@shared/management/http/api/types.gen.go` around lines 1517 - 1521, The
handler currently casts pointer timeout fields like P2pTimeoutSeconds (and the
other *int64 timeout fields) directly to uint32, which allows negative values to
wrap; update the account handler code that performs the cast to first check for
nil, then validate the pointed int64 is >= 0 and <= math.MaxUint32, returning a
validation error (or reject the request) if out of range; only then convert
safely to uint32. Identify the conversion sites in the accounts handler where
P2pTimeoutSeconds (and the similar timeout fields) are dereferenced/cast and
apply this guard before performing the uint32 cast.




Summary
This PR is Phase 1 of three that together implement issues #5989 and #5990 (the connection-mode consolidation RFC and its server-side per-peer/per-group companion).
In Scope:
ConnectionModeproto enum with four reserved values (relay-forced,p2p,p2p-lazy,p2p-dynamic); three are functional in this PR.p2p-dynamicis reserved on the wire and in the DB but the daemon currently treats it likep2p(pass-through). The decoupledworker_relay/worker_icelifecycle that makesp2p-dynamicdistinct comes in Phase 2.relay_timeout_seconds,p2p_timeout_seconds(both nullable, NULL = built-in default).client/internal/conn_mgr.gowith a single resolver: client env > client config > server-pushed.lazy_connection_enabledboolean (mapped from the resolved mode bytoPeerConfig).connection_mode = UNSPECIFIED(proto default 0) and the new client falls back to the legacy boolean -- verified via tests.NB_FORCE_RELAY,NB_ENABLE_EXPERIMENTAL_LAZY_CONN,--enable-lazy-connection, andNB_LAZY_CONN_INACTIVITY_THRESHOLDcontinue to work and emit one-shot deprecation warnings.The companion Dashboard PR is at netbirdio/dashboard#627 (will be added once it exists).
Implementation map
shared/management/proto/management.proto,client/proto/daemon.protoshared/connectionmode/(Mode, ParseString, FromProto, ToProto, ResolveLegacyLazyBool, ToLazyConnectionEnabled)client/internal/peer/env.go(ResolveModeFromEnv),client/cmd/{root,up}.go(--connection-mode,--relay-timeout,--p2p-timeout)client/internal/{engine,conn_mgr,connect}.go(resolveConnectionMode + UpdatedRemotePeerConfig + EngineConfig fields)client/internal/peer/conn.go(Mode-driven skip-ICE branch)management/server/types/settings.go(new nullable columns),management/server/http/handlers/accounts/accounts_handler.go(PUT validation + response),management/server/account.go(audit emission),management/server/activity/codes.go(3 new event codes),management/internals/shared/grpc/conversion.go(toPeerConfigwrites both old and new wire fields)shared/management/http/api/openapi.yml+ regeneratedtypes.gen.goshared/connectionmode/mode_test.go,client/internal/peer/env_test.go,client/internal/conn_mgr_test.go,management/internals/shared/grpc/conversion_test.goHardware-tested
Verified on a real production NetBird instance with 32 connected peers across 12 OpenWrt-router versions (22.03 to 25.12), Windows 10/11, Debian, Android 12/14, and iOS 26.3.1:
lazy_connection_enabledboolean whichtoPeerConfigkeeps writing via the back-compat mapping).connection_modebetweenp2pandp2p-lazyvia the API: 32 connected throughout, no disconnect.The daemon-side change is exercised end-to-end by switching the account-wide setting and observing the daemon pick up the new mode on the next NetworkMap update (handled by
UpdatedRemotePeerConfig).Known limitations (called out in spec section 8)
relay-forcedcannot be pushed to old clients because the legacylazy_connection_enabledboolean cannot express it. The wire field falls back tofalseand old clients run in normalp2p. Workaround: upgrade the client, or set localNB_FORCE_RELAY=true.p2p-dynamicis not visible in the Dashboard dropdown in Phase 1 even though the API accepts it; setting it via API givesp2pbehaviour until the Phase-2 daemon implementation lands.Phase 2 / 3 follow-ups
p2p-dynamicdaemon implementation -- decoupleworker_relay/worker_iceOnNewOfferregistration, two-tier inactivity manager, plus theDeactivatePeerno-op fix inconn_mgr.gothat Consolidate connection-mode flags; add p2p-dynamic and p2p-dynamic-lazy modes #5989 calls out for the lazy/eager mismatch.connection_mode,p2p_timeout,relay_timeoutwith most-restrictive group-conflict resolution and per-scope audit events.Test plan
client/internal/...andmanagement/...greenTestAccount_GetPeerNetworkMap(existing) remains green -- old peers withoutconnection_moderesolve viaResolveLegacyLazyBoolexactly as beforeTestToPeerConfig_ConnectionModeResolution(new, 9 sub-cases) covers the resolution matrixTestResolveConnectionMode(new, 10 sub-cases) covers the env > config > server precedence chainTestResolveModeFromEnv(new, 9 sub-cases) covers the env-var matrix including the legacy-conflict case🤖 Generated with Claude Code
Summary by CodeRabbit