Skip to content

[client] Add excluded port range handling for PKCE flow#4853

Merged
lixmal merged 3 commits intomainfrom
handle-reserved-windows-ports
Nov 26, 2025
Merged

[client] Add excluded port range handling for PKCE flow#4853
lixmal merged 3 commits intomainfrom
handle-reserved-windows-ports

Conversation

@mlsmaycon
Copy link
Copy Markdown
Collaborator

@mlsmaycon mlsmaycon commented Nov 24, 2025

Describe your changes

Introduce support for handling system-excluded port ranges in PKCE flow. Update port validation to skip excluded and in-use ports, including integration of Windows-specific logic to retrieve excluded ranges from the registry. Add tests for various usage scenarios across platforms.

Issue ticket number and link

Stack

Checklist

  • Is it a bug fix
  • Is a typo/documentation fix
  • Is a feature enhancement
  • It is a refactor
  • Created tests that fail without the change (if possible)

By submitting this pull request, you confirm that you have read and agree to the terms of the Contributor License Agreement.

Documentation

Select exactly one:

  • I added/updated documentation for this change
  • Documentation is not needed for this change (explain why)

Docs PR URL (required if "docs added" is checked)

Paste the PR link from https://github.com/netbirdio/docs here:

https://github.com/netbirdio/docs/pull/__

Summary by CodeRabbit

  • Bug Fixes

    • Detect and avoid system-reserved port ranges when selecting redirect URLs on Windows.
    • Port availability checks now respect OS-specific excluded port ranges for more consistent redirect selection.
  • Tests

    • Added tests covering port exclusion parsing and redirect URL selection behavior across scenarios.

✏️ Tip: You can customize this high-level summary in your review settings.

Introduce support for handling system-excluded port ranges in PKCE flow. Update port validation to skip excluded and in-use ports, including integration of Windows-specific logic to retrieve excluded ranges from the registry. Add tests for various usage scenarios across platforms.
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Nov 24, 2025

Walkthrough

Adds platform-aware port exclusion handling to the PKCE authentication flow. The code now obtains system-excluded TCP port ranges on Windows (via netsh), skips those ports when selecting redirect URLs, adds helpers and tests for port-range checks, and provides a non-Windows no-op variant.

Changes

Cohort / File(s) Summary
Core PKCE Flow Updates
client/internal/auth/pkce_flow.go
Add excludedPortRange type and isPortInExcludedRange helper; pass excluded ranges into redirect-URL availability checks; skip URLs whose ports fall in excluded ranges; fail-fast with warning when port is excluded; add strconv import for port parsing.
Windows-specific port retrieval
client/internal/auth/pkce_flow_windows.go
New Windows build file that runs netsh interface ipv4 show excludedportrange protocol=tcp, parses its output via parseExcludedPortRanges, and returns a slice of excludedPortRange; logs debug on failure and returns nil on error; parsing tolerates malformed lines.
Non-Windows stub
client/internal/auth/pkce_flow_other.go
New non-Windows build-tagged file providing getSystemExcludedPortRanges() that returns nil (no-op on non-Windows platforms).
Unit tests (common)
client/internal/auth/pkce_flow_test.go
Add TestIsPortInExcludedRange and TestIsRedirectURLPortUsed covering range boundaries, invalid inputs, and URL-port usage scenarios; imports added for test helpers.
Windows tests
client/internal/auth/pkce_flow_windows_test.go
Add Windows-only tests for parseExcludedPortRanges and a flow test (TestNewPKCEAuthorizationFlow_WithActualExcludedPorts) that binds a local listener to occupy a port and verifies the PKCE flow skips excluded/in-use ports and selects an available redirect URL.

Sequence Diagram(s)

sequenceDiagram
    participant Caller as Caller
    participant PKCE as PKCE Flow
    participant Sys as System (netsh / stub)
    participant Net as Network

    Caller->>PKCE: NewPKCEAuthorizationFlow(config)
    PKCE->>Sys: getSystemExcludedPortRanges()
    alt Windows build
        Sys->>PKCE: run `netsh ... excludedportrange` output
        PKCE->>PKCE: parseExcludedPortRanges(output)
    else Non-Windows build
        Sys-->>PKCE: return nil (no excluded ranges)
    end

    PKCE->>PKCE: iterate redirect URL candidates
    PKCE->>PKCE: extract port from URL
    PKCE->>PKCE: isPortInExcludedRange(port, excludedRanges)
    alt Port excluded
        PKCE-->>PKCE: warn & skip candidate
    else Port not excluded
        PKCE->>Net: check port reachability (is in use)
        alt Port in use
            PKCE-->>PKCE: skip candidate
        else Port free
            PKCE-->>Caller: select this redirect URL
        end
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Review parsing robustness in parseExcludedPortRanges (scanner logic, whitespace handling, skipping malformed lines).
  • Verify integration: ensure isPortInExcludedRange is applied before network reachability checks and logging is appropriate.
  • Confirm Windows tests isolate network/listener usage and clean up listeners; validate build tags and non-Windows stub compile paths.

Poem

🐰 I peeked at netsh where excluded ports hide,

Hopped past the ranges the system keeps aside.
Now PKCE picks a redirect with care,
Skipping the holes and landing elsewhere.
A happy rabbit, no port collisions to bide.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 46.67% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main change: adding excluded port range handling for PKCE flow, which is directly reflected in the file changes across multiple PKCE flow files.
Description check ✅ Passed The description covers the main changes, marks it as a feature enhancement, addresses documentation necessity, and indicates tests were created, aligning with the template requirements.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch handle-reserved-windows-ports

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (6)
client/internal/auth/pkce_flow_test.go (1)

165-218: Clarify intended behavior for malformed redirect URLs

TestIsRedirectURLPortUsed does a nice job covering excluded ranges, actually-used ports (via a live listener), unused ports, and malformed URLs.

One thing to double‑check: in the "Invalid URL without port" case, the helper returns false (i.e., “port not used”), which means NewPKCEAuthorizationFlow will happily select such a redirect URL even though WaitToken later expects a concrete port. If configs can ever contain malformed URLs, you might instead want to treat “no port / unparsable port” as unavailable and skip those entries early, to fail fast at configuration time rather than later at server startup.

Would you like to tighten isRedirectURLPortUsed to return true when parsedURL.Port() is empty (or when parsing fails), and then adjust this test accordingly?

client/internal/auth/pkce_flow.go (2)

287-314: Handle missing ports explicitly in isRedirectURLPortUsed

The overall logic—parse URL, short‑circuit on excluded ranges, then DialTimeout to detect an already‑in‑use port—is sound.

However, when parsedURL.Port() is empty, addr := ":" is passed to net.DialTimeout, which will fail and cause the function to return false (“port not used”). In combination with NewPKCEAuthorizationFlow, that means a redirect URL without an explicit port can be chosen and only later fail when starting the HTTP server.

You may want to treat “no port present” as invalid/unusable and return true in that case, logging at warn/debug level, e.g.:

 func isRedirectURLPortUsed(redirectURL string, excludedRanges []excludedPortRange) bool {
   parsedURL, err := url.Parse(redirectURL)
   if err != nil {
     log.Errorf("failed to parse redirect URL: %v", err)
     return true
   }

-  port := parsedURL.Port()
+  port := parsedURL.Port()
+  if port == "" {
+    log.Warnf("redirect URL %q has no explicit port; treating as unavailable", redirectURL)
+    return true
+  }

This keeps misconfigured URLs from being selected and surfaces the issue earlier.


316-341: Port-range helper is correct; consider small robustness tweaks

excludedPortRange plus isPortInExcludedRange is straightforward and works as expected with the tests:

  • Short‑circuit when len(excludedRanges) == 0.
  • Safely ignore non‑numeric ports with a debug log.
  • Inclusive comparison start <= port <= end is intuitive for registry “start-end” specs.

If you want to harden this further, you could optionally:

  • Normalize/swallow obviously invalid ranges (e.g., start <= 0, end > 65535, or start > end) either when building the slice or when checking.
  • Document explicitly in the comment that the range is inclusive on both ends.

These are nice‑to‑haves; the current implementation is functionally fine.

client/internal/auth/pkce_flow_test_windows.go (1)

17-137: Windows registry test is effective; be mindful of side effects

The test does a solid job validating end‑to‑end behavior:

  • It snapshots ReservedPorts, overrides it with test ranges, and restores or deletes it afterward.
  • It uses live listeners to simulate “ports in use” and asserts that NewPKCEAuthorizationFlow picks the expected port or errors out when none are available.
  • The use of err from GetStringsValue in the restore defer is correct despite later shadowing inside other blocks.

Two considerations:

  • This test mutates a system‑wide HKLM key. Even though you restore it, it still requires admin rights and can briefly affect other software. Skipping on lack of privileges is good; you might also want to call this out in a comment so future maintainers understand the implications.
  • For extra clarity, you could replace the err check in the restore defer with an explicit boolean like hadReservedPorts := err != registry.ErrNotExist set right after GetStringsValue, which would make the intent a bit easier to follow.

Functionally, the test looks correct as written.

client/internal/auth/pkce_flow_windows.go (2)

14-22: Surface registry read failures at least at debug level

getSystemExcludedPortRanges currently ignores errors from getExcludedPortRangesFromRegistry and just returns ranges (which will be nil on error). That’s functionally acceptable—the flow simply falls back to “no excluded ranges”—but it makes debugging environment issues harder.

Consider logging at debug (or info) when the registry read fails, and simplifying the function:

 func getSystemExcludedPortRanges() []excludedPortRange {
-  ranges, err := getExcludedPortRangesFromRegistry()
-  if err == nil && len(ranges) > 0 {
-    return ranges
-  }
-
-  return ranges
+  ranges, err := getExcludedPortRangesFromRegistry()
+  if err != nil {
+    log.Debugf("failed to read Windows ReservedPorts; proceeding without excluded ranges: %v", err)
+    return nil
+  }
+  return ranges
 }

This preserves behavior while giving operators a clue when the system configuration can’t be read.


24-64: Registry parsing is straightforward; consider minor validation

getExcludedPortRangesFromRegistry is clean:

  • Opens the key with QUERY_VALUE only and defers close with debug logging.
  • Reads ReservedPorts as a string slice and parses "start-end" pairs with trimming.
  • Skips malformed entries or parse errors without failing the whole call.

Two optional robustness improvements:

  • Treat registry.ErrNotExist from GetStringsValue as “no ranges configured” rather than an error, returning an empty slice instead of an error.
  • Optionally filter out ranges where start > end or values are out of [0, 65535] to avoid bogus data if the registry is misconfigured.

These are defensive edges; the core logic is solid.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2097306 and 3cd9bc1.

📒 Files selected for processing (5)
  • client/internal/auth/pkce_flow.go (4 hunks)
  • client/internal/auth/pkce_flow_other.go (1 hunks)
  • client/internal/auth/pkce_flow_test.go (2 hunks)
  • client/internal/auth/pkce_flow_test_windows.go (1 hunks)
  • client/internal/auth/pkce_flow_windows.go (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
client/internal/auth/pkce_flow_test.go (2)
shared/relay/client/dialer/quic/conn.go (1)
  • Addr (20-22)
client/firewall/manager/port.go (1)
  • Port (10-16)
client/internal/auth/pkce_flow_test_windows.go (2)
client/internal/pkce_auth.go (1)
  • PKCEAuthProviderConfig (24-49)
client/internal/auth/pkce_flow.go (1)
  • NewPKCEAuthorizationFlow (47-78)
🪛 GitHub Check: SonarCloud Code Analysis
client/internal/auth/pkce_flow_test_windows.go

[warning] 17-17: This function has 110 lines of code, which is greater than the 100 authorized. Split it into smaller functions.

See more on https://sonarcloud.io/project/issues?id=netbirdio_netbird&issues=AZq3SCLpgAbn1_O1Gixq&open=AZq3SCLpgAbn1_O1Gixq&pullRequest=4853

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (18)
  • GitHub Check: Management / Unit (amd64, postgres)
  • GitHub Check: Client / Unit (amd64)
  • GitHub Check: Client / Unit (386)
  • GitHub Check: Management / Unit (amd64, sqlite)
  • GitHub Check: Client (Docker) / Unit
  • GitHub Check: Management / Unit (amd64, mysql)
  • GitHub Check: Relay / Unit (amd64, -race)
  • GitHub Check: Relay / Unit (386)
  • GitHub Check: Android / Build
  • GitHub Check: Client / Unit
  • GitHub Check: Darwin
  • GitHub Check: release_ui_darwin
  • GitHub Check: Linux
  • GitHub Check: release
  • GitHub Check: Windows
  • GitHub Check: JS / Lint
  • GitHub Check: Client / Unit
  • GitHub Check: Client / Unit
🔇 Additional comments (3)
client/internal/auth/pkce_flow_other.go (1)

5-7: Non-Windows stub behavior is appropriate

Returning nil here cleanly disables excluded-range checks on non-Windows, and isPortInExcludedRange correctly handles len(nil) == 0, so semantics remain unchanged off Windows.

client/internal/auth/pkce_flow_test.go (1)

76-163: Port-range tests are thorough and cover key edge cases

The table in TestIsPortInExcludedRange exercises in-range, boundary, out-of-range, empty/nil slice, multiple ranges, and invalid/empty port strings. This gives good confidence that isPortInExcludedRange behaves as intended for realistic inputs and failure modes.

client/internal/auth/pkce_flow.go (1)

50-57: Port exclusion integration in flow constructor looks correct

Fetching excludedRanges := getSystemExcludedPortRanges() once per NewPKCEAuthorizationFlow and passing them into isRedirectURLPortUsed keeps the selection loop simple and avoids repeatedly hitting the registry. The error path (availableRedirectURL == "") clearly reports misconfiguration when all configured URLs are unusable.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
client/internal/auth/pkce_flow_windows_test.go (3)

17-41: Be careful with mutating HKLM TCP/IP settings in a unit test

This test directly changes HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\ReservedPorts, which is a global machine setting and requires elevated privileges. You do restore the value in a defer, but for the test duration you’re still altering system TCP/IP configuration.

Consider guarding this as an explicit integration test (e.g., behind an env flag) so it doesn’t run on regular developer machines/CI by default, and documenting that it mutates ReservedPorts while running.


43-51: Avoid fully overwriting ReservedPorts; append test ranges instead

Right now the test replaces the entire ReservedPorts value with testExcludedRanges, temporarily dropping any existing reserved ranges the system had configured. Even though you restore the original value afterward, during the test window other processes will see only the test ranges.

It would be safer to:

  • Read the existing list.
  • Append your test ranges to it.
  • Restore the original list in the defer.

This keeps the system’s reserved ports in place while still exercising port-exclusion behavior.


113-135: Table-driven coverage of excluded vs used ports looks solid

The table tests cleanly cover excluded ranges, multiple ranges, ports in use, and the “no ports available” error path, and they exercise the public NewPKCEAuthorizationFlow API with realistic config values. Assertions on both error and selected redirect port make the behavior clear.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3cd9bc1 and 0975883.

📒 Files selected for processing (1)
  • client/internal/auth/pkce_flow_windows_test.go (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
client/internal/auth/pkce_flow_windows_test.go (3)
client/firewall/manager/port.go (1)
  • Port (10-16)
client/internal/pkce_auth.go (1)
  • PKCEAuthProviderConfig (24-49)
client/internal/auth/pkce_flow.go (1)
  • NewPKCEAuthorizationFlow (47-78)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (15)
  • GitHub Check: Management / Unit (amd64, mysql)
  • GitHub Check: Management / Unit (amd64, postgres)
  • GitHub Check: Client / Unit (amd64)
  • GitHub Check: Relay / Unit (amd64, -race)
  • GitHub Check: Client / Unit (386)
  • GitHub Check: Client (Docker) / Unit
  • GitHub Check: Client / Unit
  • GitHub Check: Windows
  • GitHub Check: Darwin
  • GitHub Check: Linux
  • GitHub Check: Android / Build
  • GitHub Check: Client / Unit
  • GitHub Check: release
  • GitHub Check: Client / Unit
  • GitHub Check: JS / Lint

Comment thread client/internal/auth/pkce_flow_windows_test.go Outdated
…ow on Windows

Replace Windows registry-based logic with `netsh` command for retrieving excluded port ranges in PKCE flow. Refactor related parsing functions and update tests to reflect the new implementation.
@sonarqubecloud
Copy link
Copy Markdown

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
client/internal/auth/pkce_flow_windows_test.go (1)

96-96: Hard-coded port can cause test flakiness.

The past review comment correctly identifies that using a fixed port 65432 can make this test flaky if that port is already in use on the test machine. The suggestion to dynamically allocate a free port (similar to usedPort1) and then close the listener before running the test is the right approach.

🧹 Nitpick comments (2)
client/internal/auth/pkce_flow_windows.go (2)

28-29: Consider adding a timeout for the netsh command.

While netsh is a system command that typically executes quickly, adding a timeout context (e.g., 5-10 seconds) would prevent indefinite blocking in edge cases.

You can apply this pattern:

 func getExcludedPortRangesFromNetsh() ([]excludedPortRange, error) {
-	cmd := exec.Command("netsh", "interface", "ipv4", "show", "excludedportrange", "protocol=tcp")
+	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
+	defer cancel()
+	cmd := exec.CommandContext(ctx, "netsh", "interface", "ipv4", "show", "excludedportrange", "protocol=tcp")
 	output, err := cmd.Output()

And add the context and time imports at the top of the file.


68-78: Consider validating port range ordering.

While the system should provide valid ranges, adding a sanity check that startPort <= endPort would make the code more defensive and help detect unexpected netsh output format changes.

 	endPort, err := strconv.Atoi(fields[1])
 	if err != nil {
 		continue
 	}
+
+	if startPort > endPort {
+		continue
+	}

 	ranges = append(ranges, excludedPortRange{start: startPort, end: endPort})
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0975883 and 00516fb.

📒 Files selected for processing (2)
  • client/internal/auth/pkce_flow_windows.go (1 hunks)
  • client/internal/auth/pkce_flow_windows_test.go (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
client/internal/auth/pkce_flow_windows_test.go (2)
client/internal/pkce_auth.go (1)
  • PKCEAuthProviderConfig (24-49)
client/internal/auth/pkce_flow.go (1)
  • NewPKCEAuthorizationFlow (47-78)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (18)
  • GitHub Check: Management / Unit (amd64, sqlite)
  • GitHub Check: Management / Unit (amd64, postgres)
  • GitHub Check: Relay / Unit (386)
  • GitHub Check: Management / Unit (amd64, mysql)
  • GitHub Check: Relay / Unit (amd64, -race)
  • GitHub Check: Client / Unit (386)
  • GitHub Check: Client / Unit (amd64)
  • GitHub Check: Client (Docker) / Unit
  • GitHub Check: release
  • GitHub Check: release_ui_darwin
  • GitHub Check: Android / Build
  • GitHub Check: Windows
  • GitHub Check: Linux
  • GitHub Check: Darwin
  • GitHub Check: Client / Unit
  • GitHub Check: JS / Lint
  • GitHub Check: Client / Unit
  • GitHub Check: Client / Unit
🔇 Additional comments (2)
client/internal/auth/pkce_flow_windows_test.go (1)

16-83: Well-structured parsing tests.

The table-driven tests cover multiple scenarios comprehensively (multiple ranges, empty output, single range) and use appropriate assertions.

client/internal/auth/pkce_flow_windows.go (1)

15-24: Graceful degradation on error is appropriate.

Logging a debug message and returning nil when netsh fails allows the PKCE flow to proceed without port exclusions, which is a reasonable fallback for resilience.

Comment thread client/internal/auth/pkce_flow_windows.go
Comment thread client/internal/auth/pkce_flow.go
@lixmal lixmal merged commit aca0398 into main Nov 26, 2025
37 checks passed
@lixmal lixmal deleted the handle-reserved-windows-ports branch November 26, 2025 15:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants