[client] Add TCP DNS support for local listener#5758
Conversation
📝 WalkthroughWalkthroughReplaces ID-based packet hook API with outbound-only atomic setters, adds on-demand gVisor-backed TCP DNS stack and TCP forwarding, propagates DNS transport context for upstream selection, introduces OUTPUT DNAT support across firewall backends, and updates tests, mocks, and logging accordingly. Changes
Sequence Diagram(s)sequenceDiagram
participant Device as Device/Packet Pipeline
participant Hook as PacketHook (atomic)
participant DNSMux as DNS Mux
participant TruncWriter as Truncation-Aware Writer
participant TCPStack as TCP DNS Stack (gVisor)
participant Upstream as Upstream Resolver
Device->>Hook: Outbound packet arrives (IP:port, bytes)
Hook->>Hook: Check exact ip:port against udpHookOut/tcpHookOut
alt Hook matches and returns true (drop/intercept)
Hook->>DNSMux: Decode DNS and call ServeDNS
DNSMux->>TruncWriter: ServeDNS(request)
TruncWriter->>TruncWriter: If msg.Truncated and tcpDNS != nil
alt Truncated
TruncWriter->>TCPStack: Ensure running / InjectPacket()
TCPStack->>TruncWriter: Handle TCP DNS flows, reply
end
TruncWriter->>Upstream: ExchangeWithFallback(ctx)
Upstream->>TruncWriter: Return response
TruncWriter->>Device: Write response (UDP/TCP framed)
else No hook match
Hook->>Device: Let packet flow normally
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 8
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
client/internal/dns/service_memory.go (1)
120-141:⚠️ Potential issue | 🟡 MinorPotential panic if UDP layer assertion fails.
At line 124,
udpLayer.(*layers.UDP)will panic ifudpLayeris nil (which happens if the packet doesn't have a UDP layer). The hook is registered for UDP traffic, but defensive coding would add a nil check.🛡️ Proposed defensive check
hook := func(packetData []byte) bool { packet := gopacket.NewPacket(packetData, firstLayerDecoder, gopacket.Default) udpLayer := packet.Layer(layers.LayerTypeUDP) + if udpLayer == nil { + log.Tracef("packet has no UDP layer") + return true + } udp := udpLayer.(*layers.UDP)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/internal/dns/service_memory.go` around lines 120 - 141, The hook's direct type assertion udp := udpLayer.(*layers.UDP) can panic when packet.Layer(layers.LayerTypeUDP) returns nil; update the hook (the closure assigned to variable hook) to defensively check the UDP layer before asserting — e.g., obtain udpLayer := packet.Layer(layers.LayerTypeUDP); if udpLayer == nil { return true } and perform a safe type assertion (udp, ok := udpLayer.(*layers.UDP)); if !ok { return true } — then proceed to construct truncationAwareWriter/responseWriter and call s.dnsMux.ServeDNS(writer, msg) as before.
🧹 Nitpick comments (6)
client/firewall/uspfilter/hooks_bench_test.go (1)
88-90: Move address parsing out of the timed loop.These benchmarks currently pay for
netip.MustParseAddr(...)on every iteration, so the numbers include string parsing in addition to hook lookup. Precompute thenetip.Addrbefore the timer starts if you want the hook fast path in isolation.Suggested fix
+dst := netip.MustParseAddr("100.10.255.254") + b.ResetTimer() b.ReportAllocs() for b.Loop() { - m.udpHooksDrop(53, netip.MustParseAddr("100.10.255.254"), pkt) + m.udpHooksDrop(53, dst, pkt) }Also applies to: 103-105, 117-119, 131-133, 145-148
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/firewall/uspfilter/hooks_bench_test.go` around lines 88 - 90, The benchmark is parsing addresses inside the timed loop, so move the netip.MustParseAddr(...) calls out of the for b.Loop() body and precompute netip.Addr variables (e.g., addr100 := netip.MustParseAddr("100.10.255.254")) before the loop or before b.ResetTimer(), then call m.udpHooksDrop(53, addr100, pkt) inside the loop; apply the same change for the other occurrences mentioned (lines around 103–105, 117–119, 131–133, 145–148) so the benchmark measures only the hook path and not address parsing.client/internal/dns/tcpstack.go (3)
385-387: Add brief comments to empty TSIG stubs per static analysis.These are intentional no-ops since TSIG is not supported.
📝 Proposed documentation
-func (w *tcpResponseWriter) TsigStatus() error { return nil } -func (w *tcpResponseWriter) TsigTimersOnly(bool) {} -func (w *tcpResponseWriter) Hijack() {} +func (w *tcpResponseWriter) TsigStatus() error { return nil } // TSIG not supported +func (w *tcpResponseWriter) TsigTimersOnly(bool) {} // TSIG not supported +func (w *tcpResponseWriter) Hijack() {} // not applicable🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/internal/dns/tcpstack.go` around lines 385 - 387, Add brief explanatory comments to the three empty TSIG stub methods on tcpResponseWriter to satisfy static analysis: above TsigStatus(), TsigTimersOnly(bool), and Hijack() add one-line comments stating these are intentional no-ops because TSIG is not supported by the TCP stack (e.g., "TsigStatus is a no-op; TSIG not supported in TCP response writer"). Keep comments short and directly reference the method names so reviewers can see these are intentional stubs.
271-292: Consider adding a write deadline for TCP DNS responses.The read timeout is set at line 272, but there's no write deadline. If the client stops reading,
WriteMsgcould block indefinitely. This is a minor concern since DNS responses are small, but a write deadline would improve robustness.💡 Suggested improvement
for { - if err := conn.SetDeadline(time.Now().Add(dnsTCPReadTimeout)); err != nil { + deadline := time.Now().Add(dnsTCPReadTimeout) + if err := conn.SetDeadline(deadline); err != nil { log.Debugf("TCP DNS: set deadline for %s: %v", remoteAddr, err) break }Note:
SetDeadlinealready applies to both reads and writes, so the current code is actually correct. Disregard this if that was intentional.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/internal/dns/tcpstack.go` around lines 271 - 292, The loop that reads TCP DNS messages sets a read deadline with conn.SetDeadline but does not set an explicit write deadline before sending responses; update tcpResponseWriter or the serving path (e.g., right before calling t.mux.ServeDNS in the loop or inside tcpResponseWriter.WriteMsg) to call conn.SetWriteDeadline(time.Now().Add(dnsTCPWriteTimeout)) (or conn.SetDeadline accordingly using a write-specific timeout constant) so that WriteMsg cannot block indefinitely if the client stops reading; reference tcpResponseWriter, readTCPDNSMessage, and the ServeDNS call to locate where to add the write deadline.
307-314: Add brief comments to empty interface methods per static analysis.SonarCloud flagged these empty implementations. Adding a single-line comment clarifies these are intentional no-ops required by the
stack.LinkEndpointinterface.📝 Proposed documentation
-func (e *dnsEndpoint) Wait() {} +func (e *dnsEndpoint) Wait() {} // no-op: synchronous writes func (e *dnsEndpoint) ARPHardwareType() header.ARPHardwareType { return header.ARPHardwareNone } -func (e *dnsEndpoint) AddHeader(*stack.PacketBuffer) {} -func (e *dnsEndpoint) ParseHeader(*stack.PacketBuffer) bool { return true } -func (e *dnsEndpoint) Close() {} -func (e *dnsEndpoint) SetLinkAddress(tcpip.LinkAddress) {} +func (e *dnsEndpoint) AddHeader(*stack.PacketBuffer) {} // no link-layer header +func (e *dnsEndpoint) ParseHeader(*stack.PacketBuffer) bool { return true } // no link-layer header to parse +func (e *dnsEndpoint) Close() {} // cleanup handled by stack.Close +func (e *dnsEndpoint) SetLinkAddress(tcpip.LinkAddress) {} // not applicable for TUN func (e *dnsEndpoint) SetMTU(mtu uint32) { e.mtu.Store(mtu) } -func (e *dnsEndpoint) SetOnCloseAction(func()) {} +func (e *dnsEndpoint) SetOnCloseAction(func()) {} // not needed🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/internal/dns/tcpstack.go` around lines 307 - 314, The empty implementations on dnsEndpoint (methods Wait, AddHeader, ParseHeader, Close, SetLinkAddress, SetOnCloseAction) are flagged by static analysis; update each of these methods to include a single-line comment inside the function body stating they are intentional no-ops to satisfy stack.LinkEndpoint (e.g., "// no-op: required by stack.LinkEndpoint" or similar), and for ParseHeader keep the existing true return but precede it with a comment explaining it intentionally accepts all packets; leave SetMTU and ARPHardwareType unchanged.client/internal/dns/service_listener.go (1)
90-103: TCP server goroutine doesn't track listener status.The UDP goroutine sets
listenerIsRunningviasetListenerStatus, but the TCP goroutine does not. If the TCP server fails to start while UDP succeeds,listenerIsRunningwill still be true. This is minor since both share the same address, but worth noting.Consider logging or tracking TCP server startup failure separately, or documenting that
listenerIsRunningonly tracks the UDP server lifecycle.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/internal/dns/service_listener.go` around lines 90 - 103, The TCP goroutine currently doesn't update listenerIsRunning while the UDP one does; modify the TCP goroutine that calls s.tcpServer.ListenAndServe() to mirror the UDP goroutine by calling s.setListenerStatus(true) at start and defer s.setListenerStatus(false) so listenerIsRunning reflects TCP lifecycle too, and ensure the existing log.Errorf for the ListenAndServe error remains (or add a distinct log message for TCP startup failure) to make failures explicit; reference s.tcpServer, s.server, ListenAndServe, and setListenerStatus when making the change.client/internal/dns/upstream.go (1)
79-83: Consider documenting the naming convention for unexported function.
contextWithupstreamProtocolResultuses inconsistent casing (upstreamvs typical Go camelCaseUpstream). While this is internal, consistent naming improves readability.📝 Naming suggestion
-// contextWithupstreamProtocolResult stores a mutable result holder in the context. -func contextWithupstreamProtocolResult(ctx context.Context) (context.Context, *upstreamProtocolResult) { +// contextWithUpstreamProtocolResult stores a mutable result holder in the context. +func contextWithUpstreamProtocolResult(ctx context.Context) (context.Context, *upstreamProtocolResult) {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/internal/dns/upstream.go` around lines 79 - 83, The function name contextWithupstreamProtocolResult uses inconsistent camelCase; rename it to contextWithUpstreamProtocolResult (preserving unexported visibility) and update its declaration, any callers, tests, and associated comment to match; ensure the related types upstreamProtocolResult and upstreamProtocolKey{} remain unchanged and update any references to the old name so builds pass and godoc/comment capitalization aligns with the new identifier.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@client/firewall/uspfilter/filter_test.go`:
- Around line 190-211: The new hook tests (e.g., TestSetUDPPacketHook) create a
manager with Create(...) but never shut it down; add a manager.Close(nil) at the
end of each new test that calls Create (including the other test at lines
213-234) to stop background timers/goroutines and prevent bleed into later
tests—place the call after the final assertions (after
manager.SetUDPPacketHook(..., nil) and nil-check).
In `@client/firewall/uspfilter/hooks_bench_test.go`:
- Around line 69-76: newBenchManager creates a Manager via Create but never
tears it down, leaving goroutines/timers running; fix by registering a b.Cleanup
that closes the manager (e.g. call m.Close() or Manager's shutdown method)
immediately after require.NoError(b, err) in newBenchManager so the Manager is
stopped between benchmark cases.
In `@client/internal/dns/service_listener_test.go`:
- Line 67: The deferred call defer svc.Stop() in service_listener_test.go
swallows any shutdown error; update the defer to call svc.Stop() and assert its
returned error is nil (e.g., defer func(){ if err := svc.Stop(); err != nil {
t.Fatalf("svc.Stop failed: %v", err) } }()), so the test fails immediately if
Stop() returns an error; reference the svc variable and the Stop() method in the
test.
- Around line 32-70: The test currently closes udpLn and tcpLn then starts
svc.server.ListenAndServe()/svc.tcpServer.ListenAndServe(), causing a TOCTOU
race; instead keep the acquired udpLn/tcpLn open, assign udpLn to the UDP
server's PacketConn field and tcpLn to the TCP server's Listener field
(references: variables udpLn, tcpLn and struct instances svc.server,
svc.tcpServer), then call the deterministic startup method ActivateAndServe() on
each server (instead of ListenAndServe()) so the already-bound sockets are used
directly; update svc.listenerIsRunning and defer svc.Stop() as before and remove
the artificial time.Sleep.
In `@client/internal/dns/service_memory.go`:
- Around line 132-140: The code dereferences s.wgInterface.GetDevice().Device
when constructing responseWriter which can panic if GetDevice() returns nil;
before using it in truncationAwareWriter, check the result of
s.wgInterface.GetDevice() (or its Device field) for nil and handle that case
(e.g., log and skip starting the goroutine or provide a safe zero value) so
truncationAwareWriter/responseWriter never receives a nil Device; update the
block creating truncationAwareWriter and the call to s.dnsMux.ServeDNS to guard
against a nil device.
In `@client/internal/dns/tcpstack.go`:
- Around line 373-379: The tcpResponseWriter.Write implementation returns the
number of bytes written by w.conn.Write(buf), which includes the 2-byte length
prefix; update tcpResponseWriter.Write to write the prefixed buffer to w.conn
but return the logical count of payload bytes (len(data)) per dns.ResponseWriter
semantics, and propagate any write error; if conn.Write writes fewer than
len(buf) bytes return io.ErrShortWrite (or the error) while still ensuring the
returned byte count is len(data) on success.
In `@client/internal/dns/upstream_test.go`:
- Around line 542-602: TestExchangeWithFallback_UDPFallbackToTCP currently only
exercises UDP success; change the UDP handler to force a truncated UDP response
so ExchangeWithFallback must retry over TCP and then assert the TCP result is
returned. Concretely, in the test's handler (used by udpServer and tcpServer)
return a truncated UDP reply (either set m.Truncated = true when w is a
dns.PacketConn/udp writer or construct an oversized UDP answer by appending many
A records for the UDP path) and ensure the tcpServer returns a full,
non-truncated answer; then call ExchangeWithFallback(ctx, client, r, addr) and
assert no error, rm is non-nil, rm.Truncated == false and the Answer contains
"10.0.0.3" to verify the TCP fallback path in ExchangeWithFallback is exercised.
In `@client/internal/engine.go`:
- Around line 1809-1811: The DNS server is being created with a nil firewall
because e.firewall is passed into DefaultServerConfig before createFirewall()
runs; update initialization so the firewall is available to newDnsServer():
either move the createFirewall() call to run before constructing
DefaultServerConfig/newDnsServer(), or allow post-construction injection by
adding a SetFirewall(fw Firewall) method on the DNS server type and call
dnsServer.SetFirewall(e.firewall) immediately after createFirewall(); adjust
usages of DefaultServerConfig and newDnsServer() accordingly to ensure the DNS
listener is always given a non-nil firewall reference.
---
Outside diff comments:
In `@client/internal/dns/service_memory.go`:
- Around line 120-141: The hook's direct type assertion udp :=
udpLayer.(*layers.UDP) can panic when packet.Layer(layers.LayerTypeUDP) returns
nil; update the hook (the closure assigned to variable hook) to defensively
check the UDP layer before asserting — e.g., obtain udpLayer :=
packet.Layer(layers.LayerTypeUDP); if udpLayer == nil { return true } and
perform a safe type assertion (udp, ok := udpLayer.(*layers.UDP)); if !ok {
return true } — then proceed to construct truncationAwareWriter/responseWriter
and call s.dnsMux.ServeDNS(writer, msg) as before.
---
Nitpick comments:
In `@client/firewall/uspfilter/hooks_bench_test.go`:
- Around line 88-90: The benchmark is parsing addresses inside the timed loop,
so move the netip.MustParseAddr(...) calls out of the for b.Loop() body and
precompute netip.Addr variables (e.g., addr100 :=
netip.MustParseAddr("100.10.255.254")) before the loop or before b.ResetTimer(),
then call m.udpHooksDrop(53, addr100, pkt) inside the loop; apply the same
change for the other occurrences mentioned (lines around 103–105, 117–119,
131–133, 145–148) so the benchmark measures only the hook path and not address
parsing.
In `@client/internal/dns/service_listener.go`:
- Around line 90-103: The TCP goroutine currently doesn't update
listenerIsRunning while the UDP one does; modify the TCP goroutine that calls
s.tcpServer.ListenAndServe() to mirror the UDP goroutine by calling
s.setListenerStatus(true) at start and defer s.setListenerStatus(false) so
listenerIsRunning reflects TCP lifecycle too, and ensure the existing log.Errorf
for the ListenAndServe error remains (or add a distinct log message for TCP
startup failure) to make failures explicit; reference s.tcpServer, s.server,
ListenAndServe, and setListenerStatus when making the change.
In `@client/internal/dns/tcpstack.go`:
- Around line 385-387: Add brief explanatory comments to the three empty TSIG
stub methods on tcpResponseWriter to satisfy static analysis: above
TsigStatus(), TsigTimersOnly(bool), and Hijack() add one-line comments stating
these are intentional no-ops because TSIG is not supported by the TCP stack
(e.g., "TsigStatus is a no-op; TSIG not supported in TCP response writer"). Keep
comments short and directly reference the method names so reviewers can see
these are intentional stubs.
- Around line 271-292: The loop that reads TCP DNS messages sets a read deadline
with conn.SetDeadline but does not set an explicit write deadline before sending
responses; update tcpResponseWriter or the serving path (e.g., right before
calling t.mux.ServeDNS in the loop or inside tcpResponseWriter.WriteMsg) to call
conn.SetWriteDeadline(time.Now().Add(dnsTCPWriteTimeout)) (or conn.SetDeadline
accordingly using a write-specific timeout constant) so that WriteMsg cannot
block indefinitely if the client stops reading; reference tcpResponseWriter,
readTCPDNSMessage, and the ServeDNS call to locate where to add the write
deadline.
- Around line 307-314: The empty implementations on dnsEndpoint (methods Wait,
AddHeader, ParseHeader, Close, SetLinkAddress, SetOnCloseAction) are flagged by
static analysis; update each of these methods to include a single-line comment
inside the function body stating they are intentional no-ops to satisfy
stack.LinkEndpoint (e.g., "// no-op: required by stack.LinkEndpoint" or
similar), and for ParseHeader keep the existing true return but precede it with
a comment explaining it intentionally accepts all packets; leave SetMTU and
ARPHardwareType unchanged.
In `@client/internal/dns/upstream.go`:
- Around line 79-83: The function name contextWithupstreamProtocolResult uses
inconsistent camelCase; rename it to contextWithUpstreamProtocolResult
(preserving unexported visibility) and update its declaration, any callers,
tests, and associated comment to match; ensure the related types
upstreamProtocolResult and upstreamProtocolKey{} remain unchanged and update any
references to the old name so builds pass and godoc/comment capitalization
aligns with the new identifier.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f3bd686d-771b-49bc-a8e5-ebbd0f6b0270
📒 Files selected for processing (22)
client/firewall/uspfilter/filter.goclient/firewall/uspfilter/filter_test.goclient/firewall/uspfilter/hooks_bench_test.goclient/firewall/uspfilter/rule.goclient/firewall/uspfilter/tracer_test.goclient/iface/device/device_filter.goclient/iface/mocks/filter.goclient/iface/mocks/iface/mocks/filter.goclient/internal/dns/handler_chain.goclient/internal/dns/response_writer.goclient/internal/dns/server.goclient/internal/dns/server_test.goclient/internal/dns/service.goclient/internal/dns/service_listener.goclient/internal/dns/service_listener_test.goclient/internal/dns/service_memory.goclient/internal/dns/tcpstack.goclient/internal/dns/upstream.goclient/internal/dns/upstream_android.goclient/internal/dns/upstream_test.goclient/internal/dnsfwd/forwarder.goclient/internal/engine.go
💤 Files with no reviewable changes (1)
- client/iface/mocks/iface/mocks/filter.go
There was a problem hiding this comment.
🧹 Nitpick comments (2)
client/internal/dns/tcpstack.go (2)
316-338: Document the TUN packet offset constant.The
tunPacketOffset = 40is unexplained. Adding a brief comment would help future maintainers understand its purpose (e.g., TUN header space, alignment requirement, or buffer padding for wireguard-go).+// tunPacketOffset provides padding at the start of the buffer for TUN device writes. +// This matches the offset used by wireguard-go for packet alignment. const tunPacketOffset = 40🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/internal/dns/tcpstack.go` around lines 316 - 338, Add a brief doc comment above the tunPacketOffset constant explaining why it is 40, e.g., that it reserves 40 bytes of TUN header/padding required by the underlying TUN device or for alignment/compatibility with wireguard-go when calling dnsEndpoint.WritePackets (which uses tunPacketOffset as the write offset into the buffer passed to dnsEndpoint.tunDev.Write). Mention the unit (bytes) and the intent (header space / buffer alignment) so future maintainers can understand and safely change it.
130-196: Consider using defer for cleanup to reduce repetition.The cleanup pattern (
s.Close(); s.Wait()) is repeated on every error path. A helper or defer-based approach would reduce duplication.♻️ Optional refactor using defer
func (t *tcpDNSServer) startLocked() error { s := stack.New(stack.Options{ NetworkProtocols: []stack.NetworkProtocolFactory{ipv4.NewProtocol}, TransportProtocols: []stack.TransportProtocolFactory{tcp.NewProtocol}, HandleLocal: false, }) + success := false + defer func() { + if !success { + s.Close() + s.Wait() + } + }() nicID := tcpip.NICID(1) ep := &dnsEndpoint{ tunDev: t.tunDev, } ep.mtu.Store(uint32(t.mtu)) if err := s.CreateNIC(nicID, ep); err != nil { - s.Close() - s.Wait() return fmt.Errorf("create NIC: %v", err) } // ... similar for other error paths ... + success = true t.s = s t.ep = ep return nil }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/internal/dns/tcpstack.go` around lines 130 - 196, The startLocked method repeats s.Close(); s.Wait() on each error; define a deferred cleanup to eliminate duplication: after s := stack.New(...) create a local var closeStack := func(){ s.Close(); s.Wait() } and defer closeStack(), replace all explicit s.Close()/s.Wait() error returns to just return the fmt.Errorf(...) (remove those calls), and when startup succeeds (before returning nil) set closeStack = func(){} to cancel the deferred cleanup; reference tcpDNSServer.startLocked, variable s, and the existing s.Close()/s.Wait() calls when making this change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@client/internal/dns/tcpstack.go`:
- Around line 316-338: Add a brief doc comment above the tunPacketOffset
constant explaining why it is 40, e.g., that it reserves 40 bytes of TUN
header/padding required by the underlying TUN device or for
alignment/compatibility with wireguard-go when calling dnsEndpoint.WritePackets
(which uses tunPacketOffset as the write offset into the buffer passed to
dnsEndpoint.tunDev.Write). Mention the unit (bytes) and the intent (header space
/ buffer alignment) so future maintainers can understand and safely change it.
- Around line 130-196: The startLocked method repeats s.Close(); s.Wait() on
each error; define a deferred cleanup to eliminate duplication: after s :=
stack.New(...) create a local var closeStack := func(){ s.Close(); s.Wait() }
and defer closeStack(), replace all explicit s.Close()/s.Wait() error returns to
just return the fmt.Errorf(...) (remove those calls), and when startup succeeds
(before returning nil) set closeStack = func(){} to cancel the deferred cleanup;
reference tcpDNSServer.startLocked, variable s, and the existing
s.Close()/s.Wait() calls when making this change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 99303097-83f9-416d-8ec0-5238e12e9e2d
📒 Files selected for processing (1)
client/internal/dns/tcpstack.go
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@client/internal/dns/service_listener.go`:
- Around line 89-103: The current startup spawns two goroutines that call
s.server.ListenAndServe() and s.tcpServer.ListenAndServe() and only log TCP
errors, which lets Listen() return successfully before TCP has actually bound;
change startup to pre-bind or synchronously verify both transports before
setting listener status: call the underlying Listen/ListenTCP (or ListenAndServe
variants that return binding errors) for both s.server and s.tcpServer (or use
net.Listen to obtain UDP and TCP listeners) and return or propagate any bind
errors upstream instead of only logging them; only call
s.setListenerStatus(true) after both binds succeed and ensure deferred
s.setListenerStatus(false) still runs on failure.
- Around line 105-114: The current branch that configures TCP DNAT when
s.ebpfService != nil and s.listenPort != DefaultPort only logs failures and
leaves TCP DNS unreachable; change it to refuse the eBPF fallback when TCP/53
can't be forwarded: inside the branch that checks s.ebpfService, s.listenPort
and DefaultPort, first guard against s.firewall == nil by either returning an
error from the enclosing startup/configure function or disabling s.ebpfService
(so the eBPF path is skipped), and if s.firewall != nil call
s.firewall.AddInboundDNAT(...) and if that call returns an error do not silently
log—propagate the error (return it) or disable s.ebpfService and set
tcpDNATConfigured = false; use the existing symbols (s.ebpfService, s.firewall,
s.listenPort, DefaultPort, AddInboundDNAT, tcpDNATConfigured) so startup fails
or the eBPF fallback is skipped rather than leaving TCP broken.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b58e9f0e-5cef-414c-ae69-5e79f040d067
📒 Files selected for processing (10)
client/firewall/uspfilter/filter_test.goclient/firewall/uspfilter/hooks_bench_test.goclient/internal/dns/mock_server.goclient/internal/dns/server.goclient/internal/dns/service.goclient/internal/dns/service_listener.goclient/internal/dns/service_listener_test.goclient/internal/dns/service_memory.goclient/internal/dns/tcpstack.goclient/internal/engine.go
✅ Files skipped from review due to trivial changes (2)
- client/firewall/uspfilter/hooks_bench_test.go
- client/internal/dns/tcpstack.go
🚧 Files skipped from review as they are similar to previous changes (3)
- client/internal/dns/service.go
- client/firewall/uspfilter/filter_test.go
- client/internal/dns/service_listener_test.go
6b259a8 to
d72fa43
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
client/internal/dns/service_listener.go (2)
105-114:⚠️ Potential issue | 🟠 MajorDon't continue with the eBPF fallback when TCP/53 forwarding is missing.
On the non-53 eBPF path, local resolvers still retry over TCP/53. If
firewallis nil orAddOutputDNATfails,Listen()still succeeds and silently leaves TCP DNS unreachable while UDP keeps working. This branch should fail startup (and tear down anything already started) or skip the eBPF fallback entirely.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/internal/dns/service_listener.go` around lines 105 - 114, The current eBPF TCP DNAT branch can leave TCP/53 unreachable without failing startup; change Listen() so that when s.ebpfService != nil and s.listenPort != DefaultPort you require s.firewall != nil (otherwise disable/skip the eBPF fallback path) and if AddOutputDNAT(...) returns an error you propagate that error out of Listen() and perform teardown (call the service cleanup used on failure, e.g. s.Close() or the existing shutdown sequence) instead of logging and continuing; keep s.tcpDNATConfigured set true only on successful AddOutputDNAT and ensure no half-started eBPF path is left running when DNAT cannot be installed.
70-76:⚠️ Potential issue | 🟠 MajorRecord listener startup state synchronously after both transports are known-good.
listenerIsRunningis only updated from the UDP goroutine, afterListen()has already returned. That leaves a race where a secondListen()can start another server pair, orStop()can short-circuit while TCP, DNAT, and eBPF cleanup are still outstanding. It also lets UDP startup failures be reduced to logs even if callers were told startup succeeded.Also applies to: 90-103, 123-125
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/internal/dns/service_listener.go` around lines 70 - 76, The Listen() path sets listenerIsRunning asynchronously in the UDP goroutine, causing races where Listen() returns success before transports are actually ready; update the implementation so Listen() only returns after both transports (TCP and UDP) have been started and verified good, and set listenerIsRunning synchronously under the same listenerFlagLock before returning. Concretely: have the UDP startup routine surface success/failure to Listen() (e.g., via channel or error return), perform TCP/DNAT/eBPF startup and validation inline or wait for their ready signals, and only when all start steps succeed acquire listenerFlagLock and set listenerIsRunning = true; if any startup fails, propagate the error instead of logging and leaving the flag unset, and ensure Stop() checks the flag under listenerFlagLock so it won’t short-circuit cleanup.
🧹 Nitpick comments (1)
client/firewall/uspfilter/filter_test.go (1)
214-235: Add a packet-path test for TCP hooks.This only validates the stored hook metadata. The new TCP DNS path depends on
filterOutboundmatching a real TCP packet and invoking that hook, so a regression in the TCP branch would still pass. Please mirrorTestProcessOutgoingHookswithSetTCPPacketHookand a serialized TCP packet.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/firewall/uspfilter/filter_test.go` around lines 214 - 235, TestSetTCPPacketHook currently only checks stored hook metadata but doesn't exercise the TCP packet path; add a parallel test to TestProcessOutgoingHooks that constructs/serializes a real TCP packet and passes it through manager.filterOutbound so the SetTCPPacketHook hook on manager (verify via manager.tcpHookOut and the fn) is actually invoked; specifically, in TestSetTCPPacketHook use the same packet-building/serialization approach as TestProcessOutgoingHooks, call manager.filterOutbound with that packet, and assert the hook's fn was called and returned the expected result, ensuring you reference SetTCPPacketHook, manager.tcpHookOut, filterOutbound and mirror TestProcessOutgoingHooks' assertions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@client/firewall/uspfilter/nat.go`:
- Around line 470-484: The AddOutputDNAT and RemoveOutputDNAT methods currently
return nil when m.nativeFirewall == nil, which signals success to callers;
change both to return errNatNotSupported instead so behavior matches
AddDNATRule/DeleteDNATRule and keeps client/internal/dns/service_listener.go
from marking tcpDNATConfigured true when no native DNAT was created; locate
Manager.AddOutputDNAT and Manager.RemoveOutputDNAT and replace the nil-return
branch with returning errNatNotSupported.
---
Duplicate comments:
In `@client/internal/dns/service_listener.go`:
- Around line 105-114: The current eBPF TCP DNAT branch can leave TCP/53
unreachable without failing startup; change Listen() so that when s.ebpfService
!= nil and s.listenPort != DefaultPort you require s.firewall != nil (otherwise
disable/skip the eBPF fallback path) and if AddOutputDNAT(...) returns an error
you propagate that error out of Listen() and perform teardown (call the service
cleanup used on failure, e.g. s.Close() or the existing shutdown sequence)
instead of logging and continuing; keep s.tcpDNATConfigured set true only on
successful AddOutputDNAT and ensure no half-started eBPF path is left running
when DNAT cannot be installed.
- Around line 70-76: The Listen() path sets listenerIsRunning asynchronously in
the UDP goroutine, causing races where Listen() returns success before
transports are actually ready; update the implementation so Listen() only
returns after both transports (TCP and UDP) have been started and verified good,
and set listenerIsRunning synchronously under the same listenerFlagLock before
returning. Concretely: have the UDP startup routine surface success/failure to
Listen() (e.g., via channel or error return), perform TCP/DNAT/eBPF startup and
validation inline or wait for their ready signals, and only when all start steps
succeed acquire listenerFlagLock and set listenerIsRunning = true; if any
startup fails, propagate the error instead of logging and leaving the flag
unset, and ensure Stop() checks the flag under listenerFlagLock so it won’t
short-circuit cleanup.
---
Nitpick comments:
In `@client/firewall/uspfilter/filter_test.go`:
- Around line 214-235: TestSetTCPPacketHook currently only checks stored hook
metadata but doesn't exercise the TCP packet path; add a parallel test to
TestProcessOutgoingHooks that constructs/serializes a real TCP packet and passes
it through manager.filterOutbound so the SetTCPPacketHook hook on manager
(verify via manager.tcpHookOut and the fn) is actually invoked; specifically, in
TestSetTCPPacketHook use the same packet-building/serialization approach as
TestProcessOutgoingHooks, call manager.filterOutbound with that packet, and
assert the hook's fn was called and returned the expected result, ensuring you
reference SetTCPPacketHook, manager.tcpHookOut, filterOutbound and mirror
TestProcessOutgoingHooks' assertions.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d77d5b8f-3639-41ab-9284-761d382005b7
📒 Files selected for processing (16)
client/firewall/iptables/manager_linux.goclient/firewall/iptables/router_linux.goclient/firewall/manager/firewall.goclient/firewall/nftables/manager_linux.goclient/firewall/nftables/router_linux.goclient/firewall/uspfilter/filter_test.goclient/firewall/uspfilter/hooks_bench_test.goclient/firewall/uspfilter/nat.goclient/internal/dns/mock_server.goclient/internal/dns/server.goclient/internal/dns/service.goclient/internal/dns/service_listener.goclient/internal/dns/service_listener_test.goclient/internal/dns/service_memory.goclient/internal/dns/tcpstack.goclient/internal/engine.go
✅ Files skipped from review due to trivial changes (3)
- client/internal/dns/mock_server.go
- client/firewall/uspfilter/hooks_bench_test.go
- client/internal/dns/tcpstack.go
🚧 Files skipped from review as they are similar to previous changes (2)
- client/internal/dns/service.go
- client/internal/dns/service_listener_test.go
f6d360f to
3db6e15
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
client/internal/dns/service_memory.go (1)
101-166:⚠️ Potential issue | 🟠 MajorDon't make TCP DNS setup a one-shot check during
Listen().
tcpDNSand the TCP hook are only created infilterDNSTraffic(). IfGetDevice()is nil at that moment, the UDP path can still recover later because the hook re-checksGetDevice()per packet, but TCP never does—there is no later path that installstcpDNS/SetTCPPacketHook(...). In that startup sequence, truncated UDP replies will keep sending clients into TCP retries that this service still drops until a full restart.Introduce a small
ensureTCPDNS()path that can run once the device becomes available, and call it from the packet-handling path before building the writer / serving a truncated response.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/internal/dns/service_memory.go` around lines 101 - 166, filterDNSTraffic() currently only creates s.tcpDNS and the TCP packet hook once at startup, so if GetDevice() was nil then the TCP path is never installed later; add an ensureTCPDNS() helper (method on ServiceViaMemory) that checks wgInterface.GetDevice(), initializes s.tcpDNS via newTCPDNSServer(...) and installs the TCP hook with filter.SetTCPPacketHook(s.runtimeIP, uint16(s.runtimePort), ...) and sets s.tcpHookSet atomically/ idempotently; then call ensureTCPDNS() from inside the UDP packet hook (before constructing truncationAwareWriter / calling dnsMux.ServeDNS) so truncated UDP responses can spawn a TCP server when the device becomes available.
🧹 Nitpick comments (1)
client/firewall/uspfilter/filter_test.go (1)
214-235: Exercise the TCP dispatch path here too.This test only loads
tcpHookOutand calls the stored callback directly, so it would still pass if outbound TCP packets never reached the hook. AfilterOutbound()assertion likeTestProcessOutgoingHookswould cover the path this PR actually depends on.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/firewall/uspfilter/filter_test.go` around lines 214 - 235, TestSetTCPPacketHook currently only inspects tcpHookOut and invokes the stored callback directly, so it doesn't validate the actual dispatch path; update the test (TestSetTCPPacketHook) to also exercise the outbound dispatch by invoking the manager's outbound processing (the same path used in TestProcessOutgoingHooks) with a crafted outbound TCP packet that matches netip.MustParseAddr("10.168.0.1") and port 53 so the stored hook is executed; ensure you call manager.SetTCPPacketHook(...) with a sentinel side-effect, run the outbound filter/dispatch method used for outgoing packets (the filterOutbound/process outgoing hook path), and assert the hook was called and tcpHookOut is cleared after setting nil.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@client/firewall/iptables/router_linux.go`:
- Around line 985-1004: ensureNATOutputChain can leave a dangling chain when
NewChain succeeded but Insert and ClearAndDeleteChain both fail; make the check
resilient by first querying the kernel for the chain and treating an
already-existing chain as okay: call the iptables client chain-existence helper
(e.g. r.iptablesClient.ChainExists or ExistsChain on tableNat, chainNATOutput)
before calling NewChain and if the chain already exists, skip NewChain and
proceed to ensure the OUTPUT jump rule is present (insert it if missing) and
then set r.rules[jumpNatOutput] and call r.updateState(); also, if NewChain
returns an error indicating “already exists”, handle that the same way instead
of failing. Ensure you reference and update r.rules[jumpNatOutput], use
r.iptablesClient.NewChain, r.iptablesClient.Insert and
r.iptablesClient.ClearAndDeleteChain, and keep r.updateState() behavior
unchanged.
---
Outside diff comments:
In `@client/internal/dns/service_memory.go`:
- Around line 101-166: filterDNSTraffic() currently only creates s.tcpDNS and
the TCP packet hook once at startup, so if GetDevice() was nil then the TCP path
is never installed later; add an ensureTCPDNS() helper (method on
ServiceViaMemory) that checks wgInterface.GetDevice(), initializes s.tcpDNS via
newTCPDNSServer(...) and installs the TCP hook with
filter.SetTCPPacketHook(s.runtimeIP, uint16(s.runtimePort), ...) and sets
s.tcpHookSet atomically/ idempotently; then call ensureTCPDNS() from inside the
UDP packet hook (before constructing truncationAwareWriter / calling
dnsMux.ServeDNS) so truncated UDP responses can spawn a TCP server when the
device becomes available.
---
Nitpick comments:
In `@client/firewall/uspfilter/filter_test.go`:
- Around line 214-235: TestSetTCPPacketHook currently only inspects tcpHookOut
and invokes the stored callback directly, so it doesn't validate the actual
dispatch path; update the test (TestSetTCPPacketHook) to also exercise the
outbound dispatch by invoking the manager's outbound processing (the same path
used in TestProcessOutgoingHooks) with a crafted outbound TCP packet that
matches netip.MustParseAddr("10.168.0.1") and port 53 so the stored hook is
executed; ensure you call manager.SetTCPPacketHook(...) with a sentinel
side-effect, run the outbound filter/dispatch method used for outgoing packets
(the filterOutbound/process outgoing hook path), and assert the hook was called
and tcpHookOut is cleared after setting nil.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 3635d239-6acf-444c-a734-861fdda08286
📒 Files selected for processing (15)
client/firewall/iptables/manager_linux.goclient/firewall/iptables/router_linux.goclient/firewall/manager/firewall.goclient/firewall/nftables/manager_linux.goclient/firewall/nftables/router_linux.goclient/firewall/uspfilter/filter_test.goclient/firewall/uspfilter/nat.goclient/internal/dns/mock_server.goclient/internal/dns/server.goclient/internal/dns/service.goclient/internal/dns/service_listener.goclient/internal/dns/service_listener_test.goclient/internal/dns/service_memory.goclient/internal/dns/tcpstack.goclient/internal/engine.go
✅ Files skipped from review due to trivial changes (3)
- client/internal/engine.go
- client/firewall/nftables/manager_linux.go
- client/internal/dns/tcpstack.go
🚧 Files skipped from review as they are similar to previous changes (6)
- client/internal/dns/mock_server.go
- client/firewall/iptables/manager_linux.go
- client/internal/dns/service.go
- client/internal/dns/service_listener_test.go
- client/firewall/nftables/router_linux.go
- client/internal/dns/service_listener.go
- Rename DNSFirewall to Firewall (revive: stuttering name in dns package) - Check Stop() error in service_listener_test (errcheck) - Fix firewall nil at DNS construction: add SetFirewall to Server interface, inject from engine after createFirewall completes - Close managers in hook tests and benchmarks to prevent goroutine leaks - Fix tcpResponseWriter.Write returning wire byte count instead of data - Add nil checks for UDP layer and GetDevice in memory service hook
3db6e15 to
63cf286
Compare
- Make ensureNATOutputChain retry-safe by checking ChainExists before NewChain, handling orphaned chains from partial failures - Fix TestExchangeWithFallback_UDPFallbackToTCP to actually test the truncation-to-TCP retry path
There was a problem hiding this comment.
🧹 Nitpick comments (3)
client/internal/dns/tcpstack.go (2)
457-468: IPv6 header parsing doesn't handle extension headers.
srcIPv6returns a fixed offset ofheader.IPv6MinimumSize(40 bytes) for the transport header. If IPv6 extension headers are present, this offset would be incorrect. However, for the current use case (logging source address), this is acceptable since the source IP extraction is correct.For future-proofing, consider adding a comment noting this limitation:
func srcIPv6(pkt []byte) (netip.Addr, int) { if len(pkt) < header.IPv6MinimumSize { return netip.Addr{}, 0 } hdr := header.IPv6(pkt) src := hdr.SourceAddress() ip, ok := netip.AddrFromSlice(src.AsSlice()) if !ok { return netip.Addr{}, 0 } + // Note: This assumes no extension headers. For logging purposes only. return ip, header.IPv6MinimumSize }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/internal/dns/tcpstack.go` around lines 457 - 468, The srcIPv6 function extracts the IPv6 source correctly but always returns a fixed transport offset of header.IPv6MinimumSize (40 bytes) and does not parse IPv6 extension headers; update the code by adding a concise comment above srcIPv6 stating this limitation (that extension headers are not handled so the returned offset may be incorrect if extensions exist) and note that this is acceptable for the current logging-only use case, so leave the implementation unchanged for now; reference the srcIPv6 function and header.IPv6MinimumSize in the comment for clarity.
400-403: Message length validation could be tighter.The upper bound check
msgLen > 65535is redundant sincemsgLenis computed from two bytes and can never exceed 65535. The lower boundmsgLen == 0is correct. Consider also validating against a reasonable minimum DNS message size (12 bytes for header).♻️ Suggested tighter validation
msgLen := int(lenBuf[0])<<8 | int(lenBuf[1]) - if msgLen == 0 || msgLen > 65535 { + if msgLen < 12 { + return nil, fmt.Errorf("message too short: %d bytes", msgLen) + } + if msgLen > 65535 { return nil, fmt.Errorf("invalid message length: %d", msgLen) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/internal/dns/tcpstack.go` around lines 400 - 403, The length check on msgLen should reject messages smaller than a valid DNS header; replace the current condition on msgLen with a tighter validation such as checking if msgLen < 12 (and optionally also guard msgLen > 65535 for clarity) instead of only msgLen == 0, and update the error message to reflect the minimum required size; locate the check that computes msgLen from lenBuf (the msgLen variable in tcp read logic in tcpstack.go) and change the if to e.g. if msgLen < 12 { return nil, fmt.Errorf("invalid message length: %d (must be >= 12)", msgLen) }.client/internal/dns/upstream.go (1)
587-605: Duplicate truncation logic in ExchangeWithNetstack.The truncation logic (lines 597-603) is identical to ExchangeWithFallback (lines 554-560). Consider extracting this into a helper function to reduce duplication.
♻️ Suggested refactor to reduce duplication
+// truncateForUDPClient truncates a TCP-fetched response to fit the client's UDP buffer. +func truncateForUDPClient(rm *dns.Msg, r *dns.Msg) { + maxSize := dns.MinMsgSize + if opt := r.IsEdns0(); opt != nil { + maxSize = int(opt.UDPSize()) + } + if rm.Len() > maxSize { + rm.Truncate(maxSize) + } +}Then use
truncateForUDPClient(rm, r)in bothExchangeWithFallbackandExchangeWithNetstack.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/internal/dns/upstream.go` around lines 587 - 605, Extract the duplicated UDP truncation logic into a helper (e.g., truncateForUDPClient(resp *dns.Msg, req *dns.Msg)) and call it from both ExchangeWithNetstack and ExchangeWithFallback; the helper should compute maxSize = dns.MinMsgSize or use req.IsEdns0().UDPSize() when present and call resp.Truncate(maxSize) if resp.Len() > maxSize, preserving existing behavior when the request came in over UDP but was fetched via TCP.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@client/internal/dns/tcpstack.go`:
- Around line 457-468: The srcIPv6 function extracts the IPv6 source correctly
but always returns a fixed transport offset of header.IPv6MinimumSize (40 bytes)
and does not parse IPv6 extension headers; update the code by adding a concise
comment above srcIPv6 stating this limitation (that extension headers are not
handled so the returned offset may be incorrect if extensions exist) and note
that this is acceptable for the current logging-only use case, so leave the
implementation unchanged for now; reference the srcIPv6 function and
header.IPv6MinimumSize in the comment for clarity.
- Around line 400-403: The length check on msgLen should reject messages smaller
than a valid DNS header; replace the current condition on msgLen with a tighter
validation such as checking if msgLen < 12 (and optionally also guard msgLen >
65535 for clarity) instead of only msgLen == 0, and update the error message to
reflect the minimum required size; locate the check that computes msgLen from
lenBuf (the msgLen variable in tcp read logic in tcpstack.go) and change the if
to e.g. if msgLen < 12 { return nil, fmt.Errorf("invalid message length: %d
(must be >= 12)", msgLen) }.
In `@client/internal/dns/upstream.go`:
- Around line 587-605: Extract the duplicated UDP truncation logic into a helper
(e.g., truncateForUDPClient(resp *dns.Msg, req *dns.Msg)) and call it from both
ExchangeWithNetstack and ExchangeWithFallback; the helper should compute maxSize
= dns.MinMsgSize or use req.IsEdns0().UDPSize() when present and call
resp.Truncate(maxSize) if resp.Len() > maxSize, preserving existing behavior
when the request came in over UDP but was fetched via TCP.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 878fa4d9-153c-4eb4-b832-41ecba9ec690
📒 Files selected for processing (28)
client/firewall/iptables/manager_linux.goclient/firewall/iptables/router_linux.goclient/firewall/manager/firewall.goclient/firewall/nftables/manager_linux.goclient/firewall/nftables/router_linux.goclient/firewall/uspfilter/filter.goclient/firewall/uspfilter/filter_test.goclient/firewall/uspfilter/nat.goclient/firewall/uspfilter/rule.goclient/firewall/uspfilter/tracer_test.goclient/iface/device/device_filter.goclient/iface/mocks/filter.goclient/iface/mocks/iface/mocks/filter.goclient/internal/dns/handler_chain.goclient/internal/dns/mock_server.goclient/internal/dns/response_writer.goclient/internal/dns/server.goclient/internal/dns/server_test.goclient/internal/dns/service.goclient/internal/dns/service_listener.goclient/internal/dns/service_listener_test.goclient/internal/dns/service_memory.goclient/internal/dns/tcpstack.goclient/internal/dns/upstream.goclient/internal/dns/upstream_android.goclient/internal/dns/upstream_test.goclient/internal/dnsfwd/forwarder.goclient/internal/engine.go
💤 Files with no reviewable changes (1)
- client/iface/mocks/iface/mocks/filter.go
✅ Files skipped from review due to trivial changes (4)
- client/internal/dns/upstream_android.go
- client/firewall/iptables/manager_linux.go
- client/internal/engine.go
- client/firewall/iptables/router_linux.go
🚧 Files skipped from review as they are similar to previous changes (10)
- client/internal/dns/mock_server.go
- client/firewall/uspfilter/rule.go
- client/internal/dnsfwd/forwarder.go
- client/internal/dns/handler_chain.go
- client/firewall/nftables/manager_linux.go
- client/internal/dns/upstream_test.go
- client/internal/dns/server_test.go
- client/internal/dns/service_listener_test.go
- client/firewall/uspfilter/filter_test.go
- client/firewall/uspfilter/nat.go
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (5)
client/firewall/nftables/router_linux.go (1)
39-39: Minor: Consider cleanup of empty NAT output chain.The
chainNameNATOutputchain is created lazily but not cleaned up when empty (e.g., after all OUTPUT DNAT rules are removed). This is minor—empty chains don't cause functional issues—but you may want to track this for future cleanup if the firewall lifecycle becomes more dynamic.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/firewall/nftables/router_linux.go` at line 39, The NAT output chain netbird-nat-output (chainNameNATOutput) is created lazily but never removed; update the code paths that remove OUTPUT DNAT rules (e.g., the function(s) that delete DNAT rules or the routine that tears down NAT rules) to check the chain's rule count and, if zero, delete the chain and its hook from nftables. Concretely: after removing an OUTPUT DNAT rule, call into the nftables client to list rules in chainNameNATOutput, and if none remain call the existing chain/table delete functions (or add a cleanup function like cleanupNATOutputChain) to remove the chain and its associated hook/chain entry to avoid leaving empty chains behind.client/internal/dns/server.go (1)
155-156: Verify lateSetFirewall()calls cannot miss TCP DNAT setup.
newServiceViaListener(..., nil)now creates the listener with no firewall, andSetFirewall()only swapssvc.firewall.client/internal/dns/service_listener.goadds the TCP DNAT rule duringListen(), so if any caller wires the firewall after the firstListen(), custom-port TCP/53 redirection never gets backfilled. Please either makeSetFirewall()program/remove DNAT on a running listener or confirm every call site sets it before the firstListen().Also applies to: 378-385
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/internal/dns/server.go` around lines 155 - 156, The current call site creates the service with newServiceViaListener(..., nil) so the listener is created without firewall rules and later SetFirewall only swaps svc.firewall, which misses the TCP DNAT added during Listen(); update SetFirewall on the service (method SetFirewall) to apply or remove the TCP DNAT immediately on any already-running listener (i.e., invoke the same DNAT programming/removal logic that Listen() uses when a firewall is present) so late wiring backfills redirection, or alternatively ensure every caller of newServiceViaListener passes a non-nil firewall before Listen() is ever called; specifically modify SetFirewall to detect svc.listener (from service_listener.go) and call the listener’s DNAT install/remove routine (the same codepath Listen uses) so custom-port TCP/53 redirection is consistent.client/firewall/uspfilter/filter_test.go (1)
214-236: Add a packet-level TCP hook test.This verifies the setter/clear semantics, but the new TCP branch in
client/firewall/uspfilter/filter.goat Line 727 still has no end-to-end coverage for matchingdstIP:dportand short-circuitingfilterOutbound(). A TCP test mirroringTestProcessOutgoingHookswould better protect the DNS-over-TCP path.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/firewall/uspfilter/filter_test.go` around lines 214 - 236, Add an end-to-end test that mirrors TestProcessOutgoingHooks but exercises the TCP branch added in filter.go (around the new TCP handling at the dstIP:dport match), using SetTCPPacketHook to register a hook for a target dstIP and port and asserting the hook is invoked and that filterOutbound short-circuits (i.e., the packet is accepted/blocked per the hook) for DNS-over-TCP; specifically create a test that constructs the manager (as in TestSetTCPPacketHook), registers a TCP packet hook via SetTCPPacketHook, synthesizes an outgoing TCP packet matching dstIP:dport, calls the same processing path used in production (the outgoing packet processing function that contains the TCP branch / filterOutbound), and assert the hook was called and the processing returned the short-circuited result, then clear the hook and assert normal processing resumes.client/internal/dns/upstream.go (1)
552-560: Extract the UDP-client truncation block into one helper.The same
EDNS0 → maxSize → Truncatesequence now lives in both exchange paths. Centralizing it will keep the standard client path and the netstack path from drifting.♻️ Suggested extraction
- // Request came in over UDP but response was fetched via TCP. - // Truncate to fit the client's UDP buffer. - maxSize := dns.MinMsgSize - if opt := r.IsEdns0(); opt != nil { - maxSize = int(opt.UDPSize()) - } - if rm.Len() > maxSize { - rm.Truncate(maxSize) - } + truncateToClientUDPSize(r, rm) ... - // Request came in over UDP but response was fetched via TCP. - // Truncate to fit the client's UDP buffer. - maxSize := dns.MinMsgSize - if opt := r.IsEdns0(); opt != nil { - maxSize = int(opt.UDPSize()) - } - if rm.Len() > maxSize { - rm.Truncate(maxSize) - } + truncateToClientUDPSize(r, rm)func truncateToClientUDPSize(req, resp *dns.Msg) { maxSize := dns.MinMsgSize if opt := req.IsEdns0(); opt != nil { maxSize = int(opt.UDPSize()) } if resp.Len() > maxSize { resp.Truncate(maxSize) } }Also applies to: 595-603
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/internal/dns/upstream.go` around lines 552 - 560, Extract the EDNS0→maxSize→Truncate logic into a helper (e.g., truncateToClientUDPSize(req *dns.Msg, resp *dns.Msg)) and replace the duplicated blocks in both exchange paths: the standard client path that currently uses variables r and rm and the netstack path (the second block at lines ~595-603). The helper should compute maxSize starting from dns.MinMsgSize, override from req.IsEdns0().UDPSize() when present, and call resp.Truncate(maxSize) only if resp.Len() > maxSize; then call this helper in place of the inline truncation logic in both locations.client/internal/dns/tcpstack.go (1)
355-381: Deduplicate the TCP framing logic.
WriteMsgandWriteboth rebuild the same 2-byte length prefix. A small shared helper would keep framing, error handling, and future changes from drifting between the two paths.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/internal/dns/tcpstack.go` around lines 355 - 381, Both WriteMsg and Write duplicate the DNS-over-TCP 2-byte length framing and write logic; extract that into a single helper on tcpResponseWriter (e.g., writeTCPFrame or frameAndWrite) that accepts a []byte, builds the 2-byte length-prefixed buffer, writes to w.conn, and returns appropriate results/errors; then have WriteMsg call msg.Pack(), check the pack error as before, and pass the packed data into the helper, and have Write call the same helper and return len(data) on success to preserve existing semantics. Ensure the helper is used by both tcpResponseWriter.WriteMsg and tcpResponseWriter.Write so framing and error handling are centralized.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@client/internal/dns/tcpstack.go`:
- Around line 272-275: The code uses
conn.SetDeadline(time.Now().Add(dnsTCPReadTimeout)) which applies the timeout to
both reads and writes; change this to
conn.SetReadDeadline(time.Now().Add(dnsTCPReadTimeout)) so only the initial read
is time-bounded (reference conn and dnsTCPReadTimeout in tcpstack.go).
Additionally, add explicit write deadlines (e.g., conn.SetWriteDeadline(...)
using a separate write timeout) immediately before the response write operations
that currently occur later in the handler (the writes around the response send
points) so writes aren’t constrained by the read deadline.
---
Nitpick comments:
In `@client/firewall/nftables/router_linux.go`:
- Line 39: The NAT output chain netbird-nat-output (chainNameNATOutput) is
created lazily but never removed; update the code paths that remove OUTPUT DNAT
rules (e.g., the function(s) that delete DNAT rules or the routine that tears
down NAT rules) to check the chain's rule count and, if zero, delete the chain
and its hook from nftables. Concretely: after removing an OUTPUT DNAT rule, call
into the nftables client to list rules in chainNameNATOutput, and if none remain
call the existing chain/table delete functions (or add a cleanup function like
cleanupNATOutputChain) to remove the chain and its associated hook/chain entry
to avoid leaving empty chains behind.
In `@client/firewall/uspfilter/filter_test.go`:
- Around line 214-236: Add an end-to-end test that mirrors
TestProcessOutgoingHooks but exercises the TCP branch added in filter.go (around
the new TCP handling at the dstIP:dport match), using SetTCPPacketHook to
register a hook for a target dstIP and port and asserting the hook is invoked
and that filterOutbound short-circuits (i.e., the packet is accepted/blocked per
the hook) for DNS-over-TCP; specifically create a test that constructs the
manager (as in TestSetTCPPacketHook), registers a TCP packet hook via
SetTCPPacketHook, synthesizes an outgoing TCP packet matching dstIP:dport, calls
the same processing path used in production (the outgoing packet processing
function that contains the TCP branch / filterOutbound), and assert the hook was
called and the processing returned the short-circuited result, then clear the
hook and assert normal processing resumes.
In `@client/internal/dns/server.go`:
- Around line 155-156: The current call site creates the service with
newServiceViaListener(..., nil) so the listener is created without firewall
rules and later SetFirewall only swaps svc.firewall, which misses the TCP DNAT
added during Listen(); update SetFirewall on the service (method SetFirewall) to
apply or remove the TCP DNAT immediately on any already-running listener (i.e.,
invoke the same DNAT programming/removal logic that Listen() uses when a
firewall is present) so late wiring backfills redirection, or alternatively
ensure every caller of newServiceViaListener passes a non-nil firewall before
Listen() is ever called; specifically modify SetFirewall to detect svc.listener
(from service_listener.go) and call the listener’s DNAT install/remove routine
(the same codepath Listen uses) so custom-port TCP/53 redirection is consistent.
In `@client/internal/dns/tcpstack.go`:
- Around line 355-381: Both WriteMsg and Write duplicate the DNS-over-TCP 2-byte
length framing and write logic; extract that into a single helper on
tcpResponseWriter (e.g., writeTCPFrame or frameAndWrite) that accepts a []byte,
builds the 2-byte length-prefixed buffer, writes to w.conn, and returns
appropriate results/errors; then have WriteMsg call msg.Pack(), check the pack
error as before, and pass the packed data into the helper, and have Write call
the same helper and return len(data) on success to preserve existing semantics.
Ensure the helper is used by both tcpResponseWriter.WriteMsg and
tcpResponseWriter.Write so framing and error handling are centralized.
In `@client/internal/dns/upstream.go`:
- Around line 552-560: Extract the EDNS0→maxSize→Truncate logic into a helper
(e.g., truncateToClientUDPSize(req *dns.Msg, resp *dns.Msg)) and replace the
duplicated blocks in both exchange paths: the standard client path that
currently uses variables r and rm and the netstack path (the second block at
lines ~595-603). The helper should compute maxSize starting from dns.MinMsgSize,
override from req.IsEdns0().UDPSize() when present, and call
resp.Truncate(maxSize) only if resp.Len() > maxSize; then call this helper in
place of the inline truncation logic in both locations.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: be315438-5e02-4c2d-a5d7-d46b197b6d3f
📒 Files selected for processing (28)
client/firewall/iptables/manager_linux.goclient/firewall/iptables/router_linux.goclient/firewall/manager/firewall.goclient/firewall/nftables/manager_linux.goclient/firewall/nftables/router_linux.goclient/firewall/uspfilter/filter.goclient/firewall/uspfilter/filter_test.goclient/firewall/uspfilter/nat.goclient/firewall/uspfilter/rule.goclient/firewall/uspfilter/tracer_test.goclient/iface/device/device_filter.goclient/iface/mocks/filter.goclient/iface/mocks/iface/mocks/filter.goclient/internal/dns/handler_chain.goclient/internal/dns/mock_server.goclient/internal/dns/response_writer.goclient/internal/dns/server.goclient/internal/dns/server_test.goclient/internal/dns/service.goclient/internal/dns/service_listener.goclient/internal/dns/service_listener_test.goclient/internal/dns/service_memory.goclient/internal/dns/tcpstack.goclient/internal/dns/upstream.goclient/internal/dns/upstream_android.goclient/internal/dns/upstream_test.goclient/internal/dnsfwd/forwarder.goclient/internal/engine.go
💤 Files with no reviewable changes (1)
- client/iface/mocks/iface/mocks/filter.go
✅ Files skipped from review due to trivial changes (4)
- client/internal/dns/upstream_android.go
- client/internal/engine.go
- client/internal/dns/upstream_test.go
- client/internal/dns/service_listener_test.go
🚧 Files skipped from review as they are similar to previous changes (12)
- client/internal/dns/mock_server.go
- client/internal/dnsfwd/forwarder.go
- client/firewall/uspfilter/rule.go
- client/internal/dns/handler_chain.go
- client/internal/dns/response_writer.go
- client/firewall/nftables/manager_linux.go
- client/firewall/iptables/manager_linux.go
- client/internal/dns/service.go
- client/iface/device/device_filter.go
- client/firewall/iptables/router_linux.go
- client/internal/dns/service_listener.go
- client/internal/dns/service_memory.go
4f31445 to
45a7839
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (2)
client/internal/dns/service_listener.go (2)
110-113:⚠️ Potential issue | 🟠 MajorDo not publish the eBPF path after TCP DNAT setup fails.
Once this branch is active, a failed
AddOutputDNATmeans TCP retries to port 53 will never reach the local listener, butListen()still returns success and the service continues as if TCP DNS were available. Make this a startup error and unwind the already-started servers instead of warning and continuing.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/internal/dns/service_listener.go` around lines 110 - 113, The current branch swallows AddOutputDNAT failures (in the s.ebpfService && s.firewall && s.listenPort != DefaultPort path) and only logs a warning, leaving Listen() to return success; change this to treat AddOutputDNAT errors as fatal startup errors: when firewall.AddOutputDNAT(...) returns an error, stop publishing the eBPF path, call the server shutdown/unwind logic used elsewhere in the listener (reverse any started servers/listeners started before this point), and return the error from Listen() (or return a wrapped startup error) instead of logging and continuing; update the code paths around s.ebpfService, s.firewall, AddOutputDNAT, and Listen() to perform cleanup on failure so the service does not advertise TCP DNS when DNAT setup failed.
92-106:⚠️ Potential issue | 🟠 MajorUse one fatal-exit path for both DNS servers.
These goroutines diverge on failure: the UDP path only flips the running flag and shuts TCP down, while the TCP path only logs. If either server exits outside
Stop(), the listener can stay partially up and any DNAT/eBPF state installed later in this method is never unwound. Route both serve loops through a shared teardown path so an unexpected exit fully closes the listener once.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@client/firewall/manager/firewall.go`:
- Around line 172-176: The AddOutputDNAT and RemoveOutputDNAT APIs accept
netip.Addr but underlying implementations (iptables, nftables using
TableFamilyIPv4) are IPv4-only; add an explicit IPv4 guard: validate
localAddr.Is4() before dispatching (e.g., where s.listenIP is passed) and return
a clear error if not IPv4, or alternatively add a contract comment on the
methods stating they only accept IPv4 addresses; reference the
AddOutputDNAT/RemoveOutputDNAT signatures, the call site passing s.listenIP, and
the nftables use of TableFamilyIPv4/iptables string conversion when making the
change.
In `@client/firewall/uspfilter/filter.go`:
- Around line 143-146: resetState() currently only clears the old rule/route
state and does not clear the new per-protocol hook pointers, leaving stale hooks
(udpHookOut and tcpHookOut) active after shutdown; update resetState() to
explicitly clear/reset both atomic.Pointer[packetHook] fields (udpHookOut and
tcpHookOut) so that any installed outbound hooks are removed on firewall
reset/reuse, ensuring outgoingRules cleanup is complemented by clearing these
hook pointers.
---
Duplicate comments:
In `@client/internal/dns/service_listener.go`:
- Around line 110-113: The current branch swallows AddOutputDNAT failures (in
the s.ebpfService && s.firewall && s.listenPort != DefaultPort path) and only
logs a warning, leaving Listen() to return success; change this to treat
AddOutputDNAT errors as fatal startup errors: when firewall.AddOutputDNAT(...)
returns an error, stop publishing the eBPF path, call the server shutdown/unwind
logic used elsewhere in the listener (reverse any started servers/listeners
started before this point), and return the error from Listen() (or return a
wrapped startup error) instead of logging and continuing; update the code paths
around s.ebpfService, s.firewall, AddOutputDNAT, and Listen() to perform cleanup
on failure so the service does not advertise TCP DNS when DNAT setup failed.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b9fa6831-322d-44f8-a652-c10f5f3cbb03
📒 Files selected for processing (28)
client/firewall/iptables/manager_linux.goclient/firewall/iptables/router_linux.goclient/firewall/manager/firewall.goclient/firewall/nftables/manager_linux.goclient/firewall/nftables/router_linux.goclient/firewall/uspfilter/filter.goclient/firewall/uspfilter/filter_test.goclient/firewall/uspfilter/nat.goclient/firewall/uspfilter/rule.goclient/firewall/uspfilter/tracer_test.goclient/iface/device/device_filter.goclient/iface/mocks/filter.goclient/iface/mocks/iface/mocks/filter.goclient/internal/dns/handler_chain.goclient/internal/dns/mock_server.goclient/internal/dns/response_writer.goclient/internal/dns/server.goclient/internal/dns/server_test.goclient/internal/dns/service.goclient/internal/dns/service_listener.goclient/internal/dns/service_listener_test.goclient/internal/dns/service_memory.goclient/internal/dns/tcpstack.goclient/internal/dns/upstream.goclient/internal/dns/upstream_android.goclient/internal/dns/upstream_test.goclient/internal/dnsfwd/forwarder.goclient/internal/engine.go
💤 Files with no reviewable changes (1)
- client/iface/mocks/iface/mocks/filter.go
✅ Files skipped from review due to trivial changes (2)
- client/internal/dnsfwd/forwarder.go
- client/internal/dns/tcpstack.go
🚧 Files skipped from review as they are similar to previous changes (12)
- client/internal/dns/mock_server.go
- client/firewall/uspfilter/rule.go
- client/internal/dns/upstream_android.go
- client/internal/dns/handler_chain.go
- client/internal/dns/server_test.go
- client/firewall/nftables/manager_linux.go
- client/internal/dns/service.go
- client/internal/dns/response_writer.go
- client/iface/device/device_filter.go
- client/internal/dns/upstream_test.go
- client/internal/dns/service_memory.go
- client/iface/mocks/filter.go
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@client/internal/dns/server.go`:
- Around line 411-415: The defer currently swallows s.service.Stop() errors by
only logging them; change disableDNS() so the deferred stop call returns its
error and is joined into the function's returned error. Replace the logging-only
defer that calls s.service.Stop() with a deferred closure that captures the
named return error (e.g., err), calls stopErr := s.service.Stop(), and if
stopErr != nil merges it into the returned error (use errors.Join(err, stopErr)
if available or set err = fmt.Errorf("%w; %v", err, stopErr) / fmt.Errorf("%v:
%v", err, stopErr) as appropriate). Keep the reference to s.service.Stop() and
ensure disableDNS() has a named error return so the defer can modify it.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 6370a19d-2abc-44c0-944c-b7bcdf390e78
📒 Files selected for processing (3)
client/internal/dns/server.goclient/internal/dns/service_listener.goclient/internal/dns/tcpstack.go
🚧 Files skipped from review as they are similar to previous changes (2)
- client/internal/dns/service_listener.go
- client/internal/dns/tcpstack.go
There was a problem hiding this comment.
♻️ Duplicate comments (1)
client/internal/dns/server.go (1)
410-430:⚠️ Potential issue | 🟠 Major
disableDNS()still drops some teardown failures.The named return now captures
s.service.Stop(), butrestoreHostDNS()andDeleteState()are still logged and discarded. Callers can still observe shutdown as successful after partial cleanup.💡 Proposed fix
func (s *DefaultServer) disableDNS() (retErr error) { defer func() { if err := s.service.Stop(); err != nil { retErr = errors.Join(retErr, fmt.Errorf("stop DNS service: %w", err)) } }() @@ - if err := s.hostManager.restoreHostDNS(); err != nil { - log.Errorf("failed to restore host DNS settings: %v", err) - } else if err := s.stateManager.DeleteState(&ShutdownState{}); err != nil { - log.Errorf("failed to delete shutdown dns state: %v", err) + if err := s.hostManager.restoreHostDNS(); err != nil { + retErr = errors.Join(retErr, fmt.Errorf("restore host DNS settings: %w", err)) + } else if err := s.stateManager.DeleteState(&ShutdownState{}); err != nil { + retErr = errors.Join(retErr, fmt.Errorf("delete shutdown dns state: %w", err)) } @@ - return nil + return retErr }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/internal/dns/server.go` around lines 410 - 430, The disableDNS function currently only captures s.service.Stop() into the named return but logs and discards errors from s.hostManager.restoreHostDNS() and s.stateManager.DeleteState(&ShutdownState{}); update disableDNS so both restoreHostDNS() and DeleteState(...) errors are appended to the named return (retErr) using errors.Join (or fmt.Errorf wrapping) instead of just log.Errorf, preserving the existing log messages if desired (e.g., log.Errorf(...); retErr = errors.Join(retErr, fmt.Errorf("restore host DNS: %w", err))). Ensure the defer that joins s.service.Stop() remains and that all three error sources (restoreHostDNS, DeleteState, and service.Stop) are combined into retErr before returning.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@client/internal/dns/server.go`:
- Around line 410-430: The disableDNS function currently only captures
s.service.Stop() into the named return but logs and discards errors from
s.hostManager.restoreHostDNS() and s.stateManager.DeleteState(&ShutdownState{});
update disableDNS so both restoreHostDNS() and DeleteState(...) errors are
appended to the named return (retErr) using errors.Join (or fmt.Errorf wrapping)
instead of just log.Errorf, preserving the existing log messages if desired
(e.g., log.Errorf(...); retErr = errors.Join(retErr, fmt.Errorf("restore host
DNS: %w", err))). Ensure the defer that joins s.service.Stop() remains and that
all three error sources (restoreHostDNS, DeleteState, and service.Stop) are
combined into retErr before returning.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: ca089fcf-9624-4759-ad09-dbee6cc0aaf6
📒 Files selected for processing (3)
client/firewall/manager/firewall.goclient/firewall/uspfilter/filter.goclient/internal/dns/server.go
🚧 Files skipped from review as they are similar to previous changes (1)
- client/firewall/manager/firewall.go
|
There was a problem hiding this comment.
🧹 Nitpick comments (2)
client/internal/dns/tcpstack.go (2)
331-358: Consider validating message size before encoding length prefix.If
len(data)exceeds 65535, the 2-byte big-endian encoding will silently overflow/wrap, producing an incorrect length prefix and corrupting the response. While DNS responses rarely approach this size, a defensive check would improve robustness.🛡️ Optional: add bounds check
func (w *tcpResponseWriter) WriteMsg(msg *dns.Msg) error { data, err := msg.Pack() if err != nil { return fmt.Errorf("pack: %w", err) } + if len(data) > 65535 { + return fmt.Errorf("message too large for TCP DNS: %d bytes", len(data)) + } // DNS TCP: 2-byte length prefix + message buf := make([]byte, 2+len(data))func (w *tcpResponseWriter) Write(data []byte) (int, error) { + if len(data) > 65535 { + return 0, fmt.Errorf("message too large for TCP DNS: %d bytes", len(data)) + } buf := make([]byte, 2+len(data))🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/internal/dns/tcpstack.go` around lines 331 - 358, The WriteMsg and Write methods of tcpResponseWriter must validate that len(data) fits into the 2-byte length prefix (<= 65535) before creating the buffer; add a pre-check in tcpResponseWriter.WriteMsg and tcpResponseWriter.Write that returns a clear error (e.g., "message too large for TCP DNS length prefix") when len(data) > 65535 to avoid silent overflow/corruption, and only proceed to allocate the 2+len(data) buffer and write when the size is valid.
433-444: IPv6 transport offset ignores extension headers (acceptable for logging).
srcIPv6returns a fixed offset ofheader.IPv6MinimumSize(40), which is incorrect when extension headers are present. Since this is only used for debug logging and IPv6 support isn't complete yet (per the TODO at line 107), this is acceptable for now.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/internal/dns/tcpstack.go` around lines 433 - 444, srcIPv6 currently returns a fixed transport offset of header.IPv6MinimumSize (40) which ignores IPv6 extension headers; either implement correct parsing of NextHeader/extension headers to compute the true transport offset or explicitly document the limitation so future maintainers know it’s intentional. Update the srcIPv6 function to either (A) parse extension headers (walk NextHeader, read each extension header length and advance the offset before returning) so the returned int is the real transport header offset, or (B) keep the current return value but add a clear comment/TODO in srcIPv6 explaining that extension headers are ignored for now and reference header.IPv6MinimumSize and the TODO at the top of the file so this is revisited when IPv6 support is completed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@client/internal/dns/tcpstack.go`:
- Around line 331-358: The WriteMsg and Write methods of tcpResponseWriter must
validate that len(data) fits into the 2-byte length prefix (<= 65535) before
creating the buffer; add a pre-check in tcpResponseWriter.WriteMsg and
tcpResponseWriter.Write that returns a clear error (e.g., "message too large for
TCP DNS length prefix") when len(data) > 65535 to avoid silent
overflow/corruption, and only proceed to allocate the 2+len(data) buffer and
write when the size is valid.
- Around line 433-444: srcIPv6 currently returns a fixed transport offset of
header.IPv6MinimumSize (40) which ignores IPv6 extension headers; either
implement correct parsing of NextHeader/extension headers to compute the true
transport offset or explicitly document the limitation so future maintainers
know it’s intentional. Update the srcIPv6 function to either (A) parse extension
headers (walk NextHeader, read each extension header length and advance the
offset before returning) so the returned int is the real transport header
offset, or (B) keep the current return value but add a clear comment/TODO in
srcIPv6 explaining that extension headers are ignored for now and reference
header.IPv6MinimumSize and the TODO at the top of the file so this is revisited
when IPv6 support is completed.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1aaf5751-388d-4472-8961-f65cf197e1c6
📒 Files selected for processing (6)
client/internal/dns/handler_chain.goclient/internal/dns/response_writer.goclient/internal/dns/service_memory.goclient/internal/dns/tcpstack.goclient/internal/dns/upstream.goclient/internal/dns/upstream_test.go
✅ Files skipped from review due to trivial changes (2)
- client/internal/dns/response_writer.go
- client/internal/dns/upstream_test.go
🚧 Files skipped from review as they are similar to previous changes (3)
- client/internal/dns/handler_chain.go
- client/internal/dns/upstream.go
- client/internal/dns/service_memory.go



Describe your changes
Add TCP DNS support for the local DNS listener in userspace mode. Previously, DNS queries over TCP were silently dropped because the internal DNS server only handled UDP. This matters for large DNS responses (e.g. TXT records, DNSSEC) that exceed the UDP message size limit.
TCP DNS handling:
dns.Serveralongside the existing UDP one, plus DNAT rules when eBPF handles UDP redirection.Upstream protocol awareness:
Firewall hook cleanup:
AddUDPPacketHook/AddTCPPacketHook/RemovePacketHookwithSetUDPPacketHook/SetTCPPacketHook(pass nil to remove). Only one hook per protocol is supported, so IDs were unnecessary overhead.Observability:
client=IP:portin the handler chain and DNS forwarder for all queries (previously only available for the listener path).protocol=udp|tcpandupstream_protocol=udp|tcpin response metadata.service.Stop()now returnserrorwith accumulated errors instead of logging inline.Example log lines
TCP stack lifecycle (debug level):
Handler chain now includes client address and protocol info (trace level):
Packet tracer shows hook interception for both protocols:
Issue ticket number and link
Stack
Checklist
Documentation
Select exactly one:
Internal implementation detail, no user-facing configuration changes.
Docs PR URL (required if "docs added" is checked)
Paste the PR link from https://github.com/netbirdio/docs here:
https://github.com/netbirdio/docs/pull/__
Summary by CodeRabbit
New Features
Improvements
Tests