Skip to content

[client] Make mss clamping optional for nftables#4843

Merged
lixmal merged 3 commits intomainfrom
mss-clamping-optional
Nov 22, 2025
Merged

[client] Make mss clamping optional for nftables#4843
lixmal merged 3 commits intomainfrom
mss-clamping-optional

Conversation

@lixmal
Copy link
Copy Markdown
Collaborator

@lixmal lixmal commented Nov 22, 2025

Describe your changes

  • Flush critical setup first (chains, some rules)
  • Then flush each optional step separately (within each method)

This avoids polluting the critical setup with optional rules that can fail.

Issue ticket number and link

netbirdio/addon-netbird#311

Stack

Checklist

  • Is it a bug fix
  • Is a typo/documentation fix
  • Is a feature enhancement
  • It is a refactor
  • Created tests that fail without the change (if possible)

By submitting this pull request, you confirm that you have read and agree to the terms of the Contributor License Agreement.

Documentation

Select exactly one:

  • I added/updated documentation for this change
  • Documentation is not needed for this change (explain why)

Docs PR URL (required if "docs added" is checked)

Paste the PR link from https://github.com/netbirdio/docs here:

https://github.com/netbirdio/docs/pull/__

Summary by CodeRabbit

  • Refactor

    • Centralized firewall rule initialization to flush once after rule additions for more consistent startup.
    • Reordered rule insertion so certain return/post-routing rules are applied via the centralized post-routing path.
    • Simplified internal rule-addition flows and return handling.
  • Bug Fixes

    • Improved error handling and tightened log messages when loading or listing firewall tables/rules to reduce spurious failures.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Nov 22, 2025

Walkthrough

Refactored nftables router initialization in client/firewall/nftables/router_linux.go: several helper methods stopped returning errors, rule insertion and table initialization were centralized, and conn.Flush() calls were consolidated to a single point after rule additions.

Changes

Cohort / File(s) Summary
Router init & rule flow
client/firewall/nftables/router_linux.go
Changed signatures: addPostroutingRules(), addMSSClampingRules(), acceptFilterRulesNftables() no longer return error. Centralized rule insertion in createContainers, moved insertReturnTrafficRule placement, and consolidated conn.Flush() to a single flush point after adding rules.
Error formatting / messages
client/firewall/nftables/router_linux.go
Standardized error wrapping: replaced %v messages with %w for loadFilterTable and refreshRulesMap error returns; tightened some error messages (e.g., "list tables: %w").

Sequence Diagram

sequenceDiagram
    participant Init as createContainers
    participant Helpers as rule helpers
    participant Conn as conn (Flush)

    rect rgb(245,250,255)
      Note over Init,Helpers: Centralized initialization and single Flush
      Init->>Helpers: insertReturnTrafficRule (relocated)
      Init->>Helpers: addPostroutingRules()    -- now void
      Init->>Helpers: addMSSClampingRules()    -- now void (may Flush internally)
      Init->>Helpers: acceptFilterRulesNftables() -- now void (may Flush internally)
      Init->>Conn: conn.Flush()                -- single consolidated flush
      Conn-->>Init: flush result (error propagated)
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Verify all callers updated for changed signatures.
  • Confirm that errors from rule creation are not unintentionally suppressed by centralized Flush.
  • Check relocated insertReturnTrafficRule preserves rule ordering and semantics.
  • Ensure %w wrapping is used correctly for error propagation.

Poem

🐰 I hopped through rules and nudged their place,
One tidy Flush now clears the race,
No scattered returns, no tangled hive,
Tables aligned — the packets thrive,
A carrot cheer for order kept alive.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the main change: making MSS clamping optional for nftables, which aligns with the refactored error handling and Flush strategy in the changeset.
Description check ✅ Passed The description covers the key objective (Flush strategy separation), references the issue, includes a bug fix checklist mark, and addresses documentation. It meets the template's core requirements.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch mss-clamping-optional

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3900128 and 594ee37.

📒 Files selected for processing (1)
  • client/firewall/nftables/router_linux.go (8 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • client/firewall/nftables/router_linux.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
  • GitHub Check: Management / Unit (amd64, sqlite)
  • GitHub Check: Relay / Unit (386)
  • GitHub Check: Relay / Unit (amd64, -race)
  • GitHub Check: Signal / Unit (amd64)
  • GitHub Check: Management / Unit (amd64, mysql)
  • GitHub Check: Signal / Unit (386)
  • GitHub Check: Management / Unit (amd64, postgres)
  • GitHub Check: Client / Unit (amd64)
  • GitHub Check: Client / Unit (386)
  • GitHub Check: Client (Docker) / Unit
  • GitHub Check: Client / Unit
  • GitHub Check: Android / Build
  • GitHub Check: Linux
  • GitHub Check: release
  • GitHub Check: Windows
  • GitHub Check: JS / Lint
  • GitHub Check: Darwin
  • GitHub Check: Client / Unit
  • GitHub Check: Client / Unit

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@lixmal lixmal force-pushed the mss-clamping-optional branch from 05c5323 to 677e75f Compare November 22, 2025 12:08
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
client/firewall/nftables/router_linux.go (1)

767-839: Guard MSS calculation against invalid/“disabled” MTU values

r.mtu is a uint16, so mss := r.mtu - ipTCPHeaderMinSize will silently underflow if r.mtu <= ipTCPHeaderMinSize (including the common “disabled” sentinel of mtu == 0). That yields a very large MSS value, which effectively disables clamping but in a non-obvious way and could be surprising if mtu is misconfigured.

To make MSS clamping explicitly optional and safer, you could:

  • Treat mtu == 0 (or generally mtu <= ipTCPHeaderMinSize) as “MSS clamping disabled” and skip adding rules.
  • Only compute mss once you’ve validated the MTU range.

For example:

func (r *router) addMSSClampingRules() error {
-    mss := r.mtu - ipTCPHeaderMinSize
+    if r.mtu == 0 {
+        // MSS clamping explicitly disabled for nftables
+        return nil
+    }
+    if r.mtu <= ipTCPHeaderMinSize {
+        log.Warnf("mtu %d too small for MSS clamping, skipping", r.mtu)
+        return nil
+    }
+    mss := r.mtu - ipTCPHeaderMinSize-    return r.conn.Flush()
+    return r.conn.Flush()
 }

This makes the “optional MSS clamping” behavior explicit and avoids underflowed MSS values.

🧹 Nitpick comments (3)
client/firewall/nftables/router_linux.go (3)

237-255: Clarify which initialization failures should be treated as fatal

createContainers only returns an error if the Flush at Line 241 fails; errors from addMSSClampingRules, acceptForwardRules, and refreshRulesMap are just logged and ignored. Given init() treats any createContainers error as a hard failure, this effectively makes MSS clamping and accept rules best-effort/optional and allows initialization to complete even if refreshRulesMap cannot populate r.rules.

If that matches your intent (especially for “MSS clamping optional”), this is fine; otherwise consider bubbling up some of these errors or tightening the conditions under which createContainers returns nil. While here, you might also want to wrap the Flush error with %w instead of %v to preserve the underlying cause:

if err := r.conn.Flush(); err != nil {
    return fmt.Errorf("initialize tables: %w", err)
}

695-762: Helper return type change for addPostroutingRules looks fine

Switching addPostroutingRules to a void helper that only enqueues rules and relies on the caller’s Flush is consistent with how nftables.Conn is typically used, and avoids per-helper flushes. The trailing return at Line 762 is redundant in Go and can be dropped to keep the function a bit cleaner:

-func (r *router) addPostroutingRules() {
+func (r *router) addPostroutingRules() {-  r.conn.AddRule(&nftables.Rule{
+  r.conn.AddRule(&nftables.Rule{
       …
   })
-
-  return
 }

996-1069: Direct Flush in acceptFilterRulesNftables is OK, consider consistent wrapping

Having acceptFilterRulesNftables return r.conn.Flush() directly is functionally fine, and acceptForwardRules already provides context in its log message when this fails. If you want consistent error formatting with other Flush callers in this file, you could optionally wrap it:

-func (r *router) acceptFilterRulesNftables() error {
+func (r *router) acceptFilterRulesNftables() error {-    return r.conn.Flush()
+    if err := r.conn.Flush(); err != nil {
+        return fmt.Errorf(flushError, err)
+    }
+    return nil
 }

Not required, but would align this helper with other nftables operations.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 290fe2d and 677e75f.

📒 Files selected for processing (1)
  • client/firewall/nftables/router_linux.go (7 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (17)
  • GitHub Check: Management / Unit (amd64, postgres)
  • GitHub Check: Management / Unit (amd64, mysql)
  • GitHub Check: Relay / Unit (386)
  • GitHub Check: Relay / Unit (amd64, -race)
  • GitHub Check: Client (Docker) / Unit
  • GitHub Check: Client / Unit (amd64)
  • GitHub Check: Client / Unit (386)
  • GitHub Check: Linux
  • GitHub Check: Darwin
  • GitHub Check: Windows
  • GitHub Check: Client / Unit
  • GitHub Check: JS / Lint
  • GitHub Check: Client / Unit
  • GitHub Check: release_ui_darwin
  • GitHub Check: release
  • GitHub Check: Android / Build
  • GitHub Check: Client / Unit
🔇 Additional comments (1)
client/firewall/nftables/router_linux.go (1)

1193-1205: Improved error wrapping in refreshRulesMap looks good

Using %w in return fmt.Errorf("list rules: %w", err) together with the shared refreshRulesMapError format string at call sites gives you both consistent log messages and proper error chaining for callers that want to inspect the root cause. No functional issues here.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
client/firewall/nftables/router_linux.go (2)

233-253: createContainers now treats MSS clamping, accept rules, and rules‑map refresh as best‑effort

The new flow:

  • Creates all chains and postrouting rules (insertReturnTrafficRule, addPostroutingRules) and then does a single Flush(). This isolates critical setup and ensures either all core chains/rules are committed or none are.
  • Calls addMSSClampingRules() and acceptForwardRules() and only logs on error, so both features become non‑fatal/optional.
  • Also logs (rather than propagates) refreshRulesMap() failures.

This matches the PR goal of keeping optional features (MSS clamping and accept‑rules plumbing) from breaking core routing, with addMSSClampingRules/acceptFilterRulesNftables doing their own dedicated Flush. One minor nit: for consistency with the rest of the file’s wrapped errors, you could switch

return fmt.Errorf("initialize tables: %v", err)

to use %w.

- if err := r.conn.Flush(); err != nil {
-     return fmt.Errorf("initialize tables: %v", err)
- }
+ if err := r.conn.Flush(); err != nil {
+     return fmt.Errorf("initialize tables: %w", err)
+ }

Also, note the behavioural change: init() will now succeed even if acceptForwardRules or refreshRulesMap fail. That’s likely acceptable given they are best‑effort, but worth double‑checking against expectations for environments with restrictive host firewalls or pre‑existing nftables state.


691-759: Remove redundant return statement from void function

All call sites of addPostroutingRules() in the nftables router are correctly updated—the single invocation at line 235 properly calls it without error handling. The void signature change is clean.

Remove the redundant return at the end of the function (around line 759):

 func (r *router) addPostroutingRules() {
     r.conn.AddRule(&nftables.Rule{
         Table: r.workTable,
         Chain: r.chains[chainNameRoutingNat],
         Exprs: exprs2,
     })
-
-    return
 }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 677e75f and 3900128.

📒 Files selected for processing (1)
  • client/firewall/nftables/router_linux.go (9 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (21)
  • GitHub Check: Client / Unit (amd64)
  • GitHub Check: Signal / Unit (amd64)
  • GitHub Check: Management / Unit (amd64, mysql)
  • GitHub Check: Management / Unit (amd64, sqlite)
  • GitHub Check: Signal / Unit (386)
  • GitHub Check: Management / Unit (amd64, postgres)
  • GitHub Check: Relay / Unit (amd64, -race)
  • GitHub Check: Relay / Unit (386)
  • GitHub Check: Client / Unit (386)
  • GitHub Check: Client (Docker) / Unit
  • GitHub Check: Client / Unit
  • GitHub Check: Client / Unit
  • GitHub Check: Linux
  • GitHub Check: Windows
  • GitHub Check: Darwin
  • GitHub Check: Client / Unit
  • GitHub Check: release_ui_darwin
  • GitHub Check: release
  • GitHub Check: iOS / Build
  • GitHub Check: Android / Build
  • GitHub Check: JS / Lint
🔇 Additional comments (5)
client/firewall/nftables/router_linux.go (5)

91-95: Filter-table load errors are now fully non‑fatal – confirm this change in failure semantics is desired

newRouter now logs and continues on any loadFilterTable error, leaving filterTable nil and effectively disabling accept‑rules setup (later guarded by r.filterTable == nil). This aligns with treating filter‑table manipulation as optional, but it also means:

  • nftables router construction will no longer fail on unexpected errors from ListTablesOfFamily (e.g. permission issues, corrupted state), so higher‑level fallback logic that relied on newRouter failing may no longer trigger.
  • Existing filter rules from previous runs might remain in place when we can’t read/modify the table.

If that change in behaviour is intentional, the code is fine as‑is. Otherwise, consider distinguishing errFilterTableNotFound (log+skip) from other errors (still fatal), and/or logging non‑“not found” errors at a higher level.


171-175: Improved error wrapping for ListTablesOfFamily

Switching to fmt.Errorf("list tables: %w", err) gives better error chaining and consistent formatting with other callers. No issues here.


992-1065: acceptFilterRulesNftables now owns its Flush, aligning with optional best‑effort setup

Adding return r.conn.Flush() at the end of acceptFilterRulesNftables means:

  • nftables‑based accept rules are committed in their own transaction, separate from the critical routing setup flush in createContainers.
  • Any nftables error now cleanly surfaces as an error from acceptForwardRules, which createContainers logs and ignores, keeping these rules best‑effort.

This is consistent with the new initialization strategy and doesn’t appear to introduce new failure modes, given that iptables‑based setup was already immediate and independent.


1189-1193: refreshRulesMap error wrapping improved

Changing the error from "unable to list rules: %v" to "list rules: %w" (via fmt.Errorf) makes the error message cleaner and preserves the original error value for callers. No behavioural issues with this change.


763-836: Verification confirms the change is safe—addMSSClampingRules is only called after a preceding Flush().

The function is invoked once in createContainers() at line 241, after r.conn.Flush() has already been called at line 237. The internal Flush() in addMSSClampingRules therefore does not disrupt any batched operations. The pattern matches the iptables implementation, where the error is logged but does not abort initialization. No other code path calls addMSSClampingRules, confirming the new internal flush poses no risk to existing batching contexts.

pappz
pappz previously approved these changes Nov 22, 2025
@sonarqubecloud
Copy link
Copy Markdown

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants