DNS-aware dial for HTTP reverse proxy — use NetBird nameservers for hostname resolution#5905
aturkenov wants to merge 1 commit into netbirdio:main
…or custom domains in target reverse proxy
📝 Walkthrough
This PR adds DNS address/port retrieval capabilities throughout the codebase and integrates DNS hostname resolution into the HTTP transport layer. New methods expose DNS server addresses from the engine, a custom DNS resolver is implemented with fallback behavior, and the dial mechanism now resolves hostnames before connecting.
Sequence Diagram
sequenceDiagram
participant App as Application
participant Transport as HTTP Transport
participant DNS as DNS Resolution
participant Resolver as Net Resolver
participant Engine as Engine/DNS Server
participant Dialer as Dial Function
App->>Transport: Make HTTP Request
Transport->>DNS: DialContext with hostname
DNS->>DNS: Parse host:port
DNS->>Engine: GetDNSAddrPort()
Engine-->>DNS: DNS Server Address (if available)
alt DNS Server Available
DNS->>Resolver: Create Custom Resolver
Resolver->>Dialer: Dial DNS Server
Dialer-->>Resolver: Connection
Resolver->>Resolver: LookupHost for hostname
Resolver-->>DNS: Resolved IP
else DNS Server Unavailable
DNS->>Resolver: Use Default Resolver
Resolver->>Resolver: LookupHost for hostname
Resolver-->>DNS: Resolved IP
end
DNS->>Dialer: Dial IP:port
Dialer-->>Transport: Connection established
Transport-->>App: Ready for HTTP
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Pull request overview
Implements hostname-aware dialing for the HTTP reverse proxy by resolving backend hostnames via the embedded NetBird DNS server (instead of requiring IP literals), and exposes the DNS resolver address (IP+port) through engine/embed client APIs.
Changes:
- Add `DnsAddrPort()` to the DNS server interface and implement it for default/mock DNS servers.
- Expose `GetDNSAddrPort()` on `Engine` and embedded `Client`.
- Implement `dialWithDNSResolution` and wire it into the proxy transport `DialContext`.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| proxy/internal/roundtrip/netbird.go | Wrap transport dialing with DNS-aware resolution. |
| proxy/internal/roundtrip/dns.go | New DNS-aware dial wrapper + NetBird-backed net.Resolver. |
| client/internal/engine.go | Add accessor to fetch current DNS resolver AddrPort. |
| client/internal/dns/server.go | Extend DNS server interface + implement AddrPort on DefaultServer. |
| client/internal/dns/mock_server.go | Add mock implementation of DnsAddrPort(). |
| client/embed/embed.go | Expose GetDNSAddrPort() on embedded client API. |
Dial: func(ctx context.Context, _, _ string) (net.Conn, error) {
	// Always use UDP toward the DNS server. The network and address
	// arguments passed by net.Resolver are intentionally ignored;
	// we route through the WireGuard netstack instead.
	return dial(ctx, "udp", addrStr)
buildResolver's net.Resolver.Dial ignores the network argument and always dials UDP. net.Resolver may request TCP (e.g., on truncated UDP responses), so forcing UDP can make hostname resolution fail for larger DNS replies. Use the provided network (udp/tcp and v4/v6 variants) when calling dial, while still overriding the address to the NetBird DNS server.
-Dial: func(ctx context.Context, _, _ string) (net.Conn, error) {
-	// Always use UDP toward the DNS server. The network and address
-	// arguments passed by net.Resolver are intentionally ignored;
-	// we route through the WireGuard netstack instead.
-	return dial(ctx, "udp", addrStr)
+Dial: func(ctx context.Context, network, _ string) (net.Conn, error) {
+	// Preserve the transport requested by net.Resolver (udp/tcp and
+	// v4/v6 variants), but always route to the NetBird DNS server via
+	// the WireGuard netstack.
+	return dial(ctx, network, addrStr)
func (e *Engine) GetDNSAddrPort() (netip.AddrPort, bool) {
	e.syncMsgMux.Lock()
	dnsServer := e.dnsServer
	e.syncMsgMux.Unlock()
	if dnsServer == nil {
		return netip.AddrPort{}, false
	}
	addr := dnsServer.DnsAddrPort()
	if !addr.IsValid() {
		return netip.AddrPort{}, false
	}
	return addr, true
}
GetDNSAddrPort() takes syncMsgMux, which is also used to serialize management message processing and is held across relatively long operations (e.g., Engine.Start). Because this getter is called on every dial that needs hostname resolution, it can introduce avoidable contention and request latency. Consider protecting just the DNS server address with a dedicated lightweight lock/atomic (or caching the addrport in the engine) so dialing isn’t blocked by management sync work.
func dialWithDNSResolution(
	getDNSAddr func() (netip.AddrPort, bool),
	dial func(ctx context.Context, network, addr string) (net.Conn, error),
) func(ctx context.Context, network, addr string) (net.Conn, error) {
	return func(ctx context.Context, network, addr string) (net.Conn, error) {
		host, port, err := net.SplitHostPort(addr)
		if err != nil {
			// Malformed address: let the underlying dialer handle or fail it.
			return dial(ctx, network, addr)
		}

		// If the host is already an IP literal, skip resolution entirely.
		if _, err := netip.ParseAddr(host); err == nil {
			return dial(ctx, network, addr)
		}

		resolved, err := resolveHost(ctx, host, getDNSAddr, dial)
		if err != nil {
			return nil, err
		}

		return dial(ctx, network, net.JoinHostPort(resolved, port))
	}
The new DNS-aware dialing behavior (IP-literal passthrough, hostname resolution via NetBird DNS when available, fallback to default resolver) is not covered by tests in the proxy roundtrip package. Adding unit tests for dialWithDNSResolution/buildResolver would help prevent regressions (e.g., ensuring hostnames trigger resolver usage and that IP literals don’t).
|
What is this trying to solve? DNS already goes through netstack when dialing a domain.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@proxy/internal/roundtrip/dns.go`:
- Around line 58-66: Change resolveHost to return ([]string, error) instead of a
single string: keep the existing resolver.LookupHost call, return the full addrs
slice (or an error if zero or lookup failed). Then update dialWithDNSResolution
to call resolveHost, iterate over each returned address and attempt to dial
sequentially (using the same DialContext/Dial method currently used), returning
immediately on the first successful connection; if all attempts fail return the
last dial error (include the target address in error context). Ensure error
messages distinguish between resolve errors and per-address dial errors and
preserve existing context wrapping.
- Around line 84-89: The Resolver.Dial currently ignores the network parameter
and always uses "udp", which breaks TCP fallback on truncated DNS responses;
update the anonymous Dial func in proxy/internal/roundtrip/dns.go to accept and
use the incoming network parameter (e.g., name it "network") when calling
dial(ctx, network, addrStr), falling back to "udp" only if the provided network
is empty, so net.Resolver's TCP fallback behavior is preserved and dial(ctx,
...) can still handle both "tcp" and "udp".
	addrs, err := resolver.LookupHost(ctx, host)
	if err != nil {
		return "", fmt.Errorf("dns: resolve %q: %w", host, err)
	}
	if len(addrs) == 0 {
		return "", fmt.Errorf("dns: no addresses returned for %q", host)
	}
	return addrs[0], nil
}
🧩 Analysis chain
🌐 Web query:
For Go HTTP dialers, what is the recommended approach when DNS returns multiple A/AAAA records—should clients attempt all resolved addresses before failing?
💡 Result:
Yes. For Go HTTP clients (net/http.Transport with the default net.Dialer), when DNS returns multiple A/AAAA records the dialer tries the resolved addresses in order until one connects or all fail. http.Transport passes the target "host:port" to its DialContext; net.Dialer.DialContext then resolves the host and tries each address in turn, spreading any dial timeout across the attempts. On dual-stack hosts the dialer also applies the RFC 6555 fast fallback between IPv6 and IPv4 ("Happy Eyeballs"), controlled by Dialer.FallbackDelay. This is the standard behavior in Go's net/http and needs no configuration for basic failover. A custom Transport.DialContext that resolves the hostname itself and passes a single IP to the dialer bypasses this logic, so it must implement the failover explicitly.
Citations:
- 1: https://pkg.go.dev/net@go1.25.1
- 2: https://golang.org/src/net/dial.go
- 3: https://go.dev/src/net/dial.go?m=text
- 4: https://go.dev/src/net/http/transport_default_other.go
Return all resolved addresses and implement retry logic in the caller to handle unreachable first addresses.
Using only addrs[0] makes dialing brittle when the first DNS record is unreachable but later records are valid. This differs from Go's standard library behavior (net.Dialer tries all addresses sequentially). Update resolveHost to return []string and modify dialWithDNSResolution to iterate through all addresses, dialing each until one succeeds or all fail.
Proposed direction
 func resolveHost(
 	ctx context.Context,
 	host string,
 	getDNSAddr func() (netip.AddrPort, bool),
 	dial func(ctx context.Context, network, addr string) (net.Conn, error),
-) (string, error) {
+) ([]string, error) {
 	resolver := buildResolver(getDNSAddr, dial)
 	addrs, err := resolver.LookupHost(ctx, host)
 	if err != nil {
-		return "", fmt.Errorf("dns: resolve %q: %w", host, err)
+		return nil, fmt.Errorf("dns: resolve %q: %w", host, err)
 	}
 	if len(addrs) == 0 {
-		return "", fmt.Errorf("dns: no addresses returned for %q", host)
+		return nil, fmt.Errorf("dns: no addresses returned for %q", host)
 	}
-	return addrs[0], nil
+	return addrs, nil
 }
And in dialWithDNSResolution, try each resolved address until one dial succeeds, returning the last error if all fail.
Dial: func(ctx context.Context, _, _ string) (net.Conn, error) {
	// Always use UDP toward the DNS server. The network and address
	// arguments passed by net.Resolver are intentionally ignored;
	// we route through the WireGuard netstack instead.
	return dial(ctx, "udp", addrStr)
},
🧩 Analysis chain
🌐 Web query:
In Go's net.Resolver, can Resolver.Dial be invoked with network "tcp" (for example after UDP truncation), and should custom Resolver.Dial implementations preserve the provided network?
💡 Result:
Yes. Go's net.Resolver can invoke Resolver.Dial with network "tcp", for example when the built-in resolver retries over TCP after receiving a truncated UDP response (TC bit set), which is standard DNS behavior described in RFC 1035. The net package documents Dial as the alternate dialer the built-in resolver uses for both TCP and UDP connections to DNS servers, and the resolver picks datagram or stream framing depending on whether the returned Conn is also a PacketConn. A custom Resolver.Dial implementation should therefore honor the requested network rather than substitute a different transport.
Citations:
- 1: https://groups.google.com/g/golang-codereviews/c/9JjM6ld8IsE
- 2: https://pkg.go.dev/net@go1.25.1
- 3: net: Resolver doesn't use provided Dial function in all cases golang/go#60712
- 4: https://github.com/golang/go/blob/master/src/net/lookup.go
- 5: https://github.com/golang/go/blob/master/src/net/dnsclient_unix.go
Preserve Resolver.Dial network parameter to support TCP fallback after DNS truncation.
Go's net.Resolver invokes Resolver.Dial with network="tcp" when a truncated UDP response (TC bit) requires TCP fallback per RFC 1035. Forcing UDP ignores this contract and causes DNS failures on truncated responses. The underlying client.DialContext supports both TCP and UDP, so the fix is straightforward: accept the network parameter and default to UDP only if not specified.
Proposed fix
-	Dial: func(ctx context.Context, _, _ string) (net.Conn, error) {
+	Dial: func(ctx context.Context, network, _ string) (net.Conn, error) {
 		// Always use UDP toward the DNS server. The network and address
 		// arguments passed by net.Resolver are intentionally ignored;
 		// we route through the WireGuard netstack instead.
-		return dial(ctx, "udp", addrStr)
+		if network == "" {
+			network = "udp"
+		}
+		return dial(ctx, network, addrStr)
 	},
|
I just need a reverse proxy service that forwards all traffic to a custom HTTP domain target.
|
Can you elaborate on what's not working? The dial already resolves DNS internally with the same nameserver.
|
@lixmal, it's not a bug, it's a feature; everything is working fine.
|
@lixmal, what should I do next?
|
What are you trying to solve? From my understanding, you just move the resolution out of the dialer (via netstack).
|
I have a NetBird client deployed in Docker Swarm, and I now need to deploy Traefik as a reverse proxy. The goal is to use NetBird’s reverse proxy with an HTTP target inside the Docker Swarm overlay network. My idea:
This would allow me to expose Docker services through NetBird reverse proxy using stable domain names, even when container IPs change. Another possible improvement would be to use NetBird DNS zones for creating custom aliases as reverse proxy targets.
I think this might solve an issue many users face when trying to use custom DNS names as reverse proxy targets in NetBird. If this sounds useful, I can also search for similar discussions or feature requests from other users. @lixmal, thank you for responding!





Problem
The HTTP reverse proxy transport (`dialWithDNSResolution` in dns.go) was a stub. As a result, target URLs that contained hostnames (e.g. https://myapp.netbird.cloud/) could not be dialled; only literal IP addresses worked. Users wanting to forward traffic to a domain couldn't do so through a resource group or peer proxy target.
Solution
Route hostname resolution through NetBird's own DNS infrastructure (custom zones, nameserver groups configured in the management UI) instead of the host-OS resolver or a hardcoded list of DNS servers.
The embedded NetBird client already runs an internal DNS server (bound to the last IP of the WireGuard network on port 53 in netstack/userspace mode). This server already knows about all NetBird-managed zones and upstream nameserver groups. We now query it directly during each dial that involves a hostname.
current state:

expected result:
