[vnet] add authoritative DNS mode on Linux #63706

Closed
tangyatsu wants to merge 2 commits into master from tangyatsu/vnet-linux-authoritative-dns

@tangyatsu
Contributor

@tangyatsu tangyatsu commented Feb 10, 2026

This PR is a part of #63664. It adds authoritative DNS mode for VNet on Linux, which is required for correct interaction with systemd-resolved.

systemd-resolved can merge DNS zones and will query multiple upstream servers:

“One more thing to mention: in systemd-resolved.service if lookups match the search/routing domains of multiple interfaces at once, then they are sent to all of them in parallel, and the first positive reply used. If all lookups fail the last negative reply is used. This means the DNS zones on the relevant interfaces are “merged”: domains existing on one but not the other will “just work” and vice versa.”

Currently VNet works in recursive mode: it lists upstream servers and forwards queries it can’t resolve. I also added authoritative mode: unresolved queries return NXDOMAIN without forwarding, which delegates these queries to systemd-resolved. I tested this with a dummy DNS using the same zone as VNet and it worked.
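
To make the difference between the two modes concrete, here is a minimal sketch in Go. It is not the code from this PR: `Mode`, `Resolver`, `Answer`, and the addresses are hypothetical names and values chosen for illustration. The only behaviour it models is the one described above: a known name gets a VNet address, and an unknown name is either forwarded (recursive) or answered with NXDOMAIN (authoritative).

```go
package main

import "fmt"

// Mode is a hypothetical switch between the two behaviours described above.
type Mode int

const (
	Recursive     Mode = iota // forward queries VNet cannot resolve
	Authoritative             // answer NXDOMAIN for queries VNet cannot resolve
)

// DNS response codes as defined in RFC 1035.
const (
	rcodeSuccess  = 0
	rcodeNXDomain = 3
)

// Answer is a simplified stand-in for a DNS response.
type Answer struct {
	Rcode     int
	Addr      string
	Forwarded bool
}

// Resolver is an illustrative model, not VNet's real resolver type.
type Resolver struct {
	Mode     Mode
	Apps     map[string]string        // FQDN -> VNet-assigned address
	Upstream func(name string) Answer // used in recursive mode only
}

// Resolve answers known names itself and treats unknown names per Mode.
func (r *Resolver) Resolve(name string) Answer {
	if addr, ok := r.Apps[name]; ok {
		return Answer{Rcode: rcodeSuccess, Addr: addr}
	}
	if r.Mode == Authoritative {
		// No forwarding: the caller (systemd-resolved) decides what to do next.
		return Answer{Rcode: rcodeNXDomain}
	}
	ans := r.Upstream(name)
	ans.Forwarded = true
	return ans
}

func main() {
	// Fake upstream that resolves every name to a documentation address.
	upstream := func(name string) Answer {
		return Answer{Rcode: rcodeSuccess, Addr: "203.0.113.10"}
	}
	apps := map[string]string{"tcp.example.com": "100.64.0.2"}
	for _, mode := range []Mode{Recursive, Authoritative} {
		r := &Resolver{Mode: mode, Apps: apps, Upstream: upstream}
		for _, name := range []string{"tcp.example.com", "external.example.com"} {
			fmt.Printf("mode=%d %s -> %+v\n", mode, name, r.Resolve(name))
		}
	}
}
```

In this model the two modes differ only for names that are not known apps, which is exactly the case the rest of the discussion is about.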

@tangyatsu tangyatsu mentioned this pull request Feb 10, 2026
8 tasks
@tangyatsu tangyatsu marked this pull request as ready for review February 10, 2026 19:02
@github-actions github-actions bot requested a review from gzdunek February 10, 2026 19:03
@nklaassen
Contributor

I'm curious about the reasoning for authoritative mode and how we expect it to work

For example, say:

  • the cluster has a custom DNS zone "example.com"
  • there is a TCP app with public addr tcp.example.com
  • there is also a public DNS record for tcp.example.com
  • the customer also needs to access external.example.com, which has a public DNS record and is not an enrolled Teleport app

On other platforms the result is deterministic: VNet will always assign a VNet address for queries for tcp.example.com. It sounds like systemd-resolved might race VNet's nameserver against the public nameserver, so it might be nondeterministic whether VNet will "win" and get to handle requests for tcp.example.com. We do need to handle this case where VNet takes over an FQDN that also has a public DNS record.

@tangyatsu
Contributor Author

tangyatsu commented Feb 11, 2026

@nklaassen
systemd-resolved routes each lookup based on the most specific matching routing domain:

“if there two interfaces have search domains that are suffix of each other, and a name is looked up that matches both, the interface with the longer match will win and get the lookup routed to is DNS servers. Only if the match has the same length, then both will be used in parallel. Example: one interface has ~foo.example.com as routing domain, and another one example.com has search domain. A lookup for waldo.foo.example.com is the exclusively routed to the first interface’s DNS server, since it matches by three suffix labels instead of just two. The fact that the matching length is taken into consideration for the routing decision is particularly relevant if you have one interface with the ~. routing domain and another one with ~corp.company.example — both suffixes match a lookup for foo.corp.company.example, but the latter interface wins, since the match is for four labels, while the other is for zero labels.”

So in your example:
VNet link has ~example.com.
The default link does not have any routing domain or it may have ~.

A lookup for tcp.example.com goes only to VNet. There’s no race unless another link also claims ~example.com.
A lookup for external.example.com also goes to VNet and will not be resolved in authoritative mode.
Probably the solution would be to add ~external.example.com to the default link.
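
The routing rule quoted above can be modelled in a few lines of Go. This is a simplified sketch of the documented behaviour, not systemd-resolved's implementation: `Link`, `matchLabels`, and `routeLookup` are hypothetical names, search and routing domains are treated alike, and the fallback for lookups that match no domain at all is not modelled.

```go
package main

import (
	"fmt"
	"strings"
)

// Link models one network link and its routing/search domains, written
// without the "~" prefix. "." stands for the catch-all routing domain "~.".
type Link struct {
	Name    string
	Domains []string
}

// matchLabels returns how many labels of domain match as a suffix of name,
// or -1 when domain is not a suffix of name.
func matchLabels(name, domain string) int {
	name = strings.ToLower(strings.TrimSuffix(name, "."))
	domain = strings.ToLower(strings.TrimSuffix(domain, "."))
	if domain == "" {
		return 0 // the root domain matches every name with zero labels
	}
	if name == domain || strings.HasSuffix(name, "."+domain) {
		return strings.Count(domain, ".") + 1
	}
	return -1
}

// routeLookup returns the links that would receive the lookup: the link with
// the longest matching suffix wins, and equal-length matches share the lookup.
func routeLookup(links []Link, name string) []string {
	best := -1
	var winners []string
	for _, l := range links {
		linkBest := -1
		for _, d := range l.Domains {
			if n := matchLabels(name, d); n > linkBest {
				linkBest = n
			}
		}
		switch {
		case linkBest < 0 || linkBest < best:
			continue // no match, or a shorter match than the current winner
		case linkBest > best:
			best = linkBest
			winners = nil // a longer match replaces all previous winners
		}
		winners = append(winners, l.Name)
	}
	return winners
}

func main() {
	links := []Link{
		{Name: "vnet", Domains: []string{"example.com"}},
		{Name: "default", Domains: []string{"."}},
	}
	for _, name := range []string{"tcp.example.com", "external.example.com"} {
		fmt.Println(name, "->", routeLookup(links, name))
	}
	// The workaround suggested above: a more specific domain on the default link.
	links[1].Domains = append(links[1].Domains, "external.example.com")
	fmt.Println("external.example.com ->", routeLookup(links, "external.example.com"))
}
```

Under this model both names go only to the VNet link until the default link is given the longer `external.example.com` domain, which then takes that one lookup away from VNet.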

Could you please tell me more about the cases where VNet takes over an FQDN that also has a public DNS record?
My initial thought was that there might be a corporate setup with a zone such as corporate.com, and the VNet would also handle that zone. In that case, there should not be a problem if the requests are processed in parallel.

@nklaassen
Contributor

So in this case the problem is that the request will not be forwarded. We need to forward DNS requests that don't match a known app.

Check out the original DNS resolver design here https://github.com/gravitational/teleport/blob/master/rfd/0163-vnet.md#dns-queries-and-ip-address-assignment and the changes we made when adding VNet SSH here https://github.com/gravitational/teleport/blob/master/rfd/0207-vnet-ssh.md#dns-resolution

There are some common cases where we do need to forward the DNS query upstream. VNet is not meant to be authoritative over any DNS zones except for its known apps.

  • requests for the proxy address itself must resolve to the proxy's public address; VNet can't forward this because VNet itself needs to contact the proxy
  • requests for web apps at app-name.<proxy-public-addr> must resolve to the public DNS record; VNet doesn't handle these
  • in custom DNS zones, *.<suffix> must be forwarded upstream if it does not match a known TCP app. That is the design: VNet is meant to intercept requests to known TCP apps and allow everything else to continue functioning without VNet

We have a VNet use case where a cluster can make com a custom DNS zone and register a TCP app for github.com so that all requests to GitHub go through VNet; then they can set up IP restrictions in GitHub. So we must not break DNS resolution for things that do not match known TCP apps.
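
The cases listed above all reduce to one rule: intercept only known TCP apps, forward everything else. A rough sketch of that rule in Go follows; it is not taken from the VNet code base. `Action`, `inZone`, `decide`, and the host names `teleport.example.com` and `grafana.teleport.example.com` are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// Action is what the resolver does with a query in this simplified model.
type Action string

const (
	AssignVNetAddr  Action = "assign VNet address"
	ForwardUpstream Action = "forward upstream"
)

// inZone reports whether fqdn equals zone or is a subdomain of it.
func inZone(fqdn, zone string) bool {
	return fqdn == zone || strings.HasSuffix(fqdn, "."+zone)
}

// decide intercepts a query only when it falls inside a configured zone AND
// names a known TCP app. Everything else is forwarded upstream.
func decide(fqdn string, zones []string, tcpApps map[string]bool) Action {
	fqdn = strings.ToLower(strings.TrimSuffix(fqdn, "."))
	for _, z := range zones {
		if inZone(fqdn, z) && tcpApps[fqdn] {
			return AssignVNetAddr
		}
	}
	return ForwardUpstream
}

func main() {
	// Hypothetical cluster: the proxy address is a zone, plus "com" as a custom zone.
	zones := []string{"teleport.example.com", "com"}
	tcpApps := map[string]bool{"github.com": true}

	for _, name := range []string{
		"teleport.example.com",         // the proxy itself
		"grafana.teleport.example.com", // a web app, not a TCP app
		"github.com",                   // known TCP app in the "com" zone
		"gitlab.com",                   // in the "com" zone, but not an app
	} {
		fmt.Printf("%-30s %s\n", name, decide(name, zones, tcpApps))
	}
}
```

With "com" as a zone, an authoritative answer for every non-app name would return NXDOMAIN for almost the whole public internet, which is why the default branch here forwards.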

@nklaassen
Contributor

I assume it's difficult or impossible to get systemd-resolved to resolve queries for *.<suffix> after we have configured VNet's nameserver for that suffix, since it would recursively call VNet's own nameserver. That is why on other platforms we directly find the upstream nameserver IP address, forward the raw UDP DNS requests there, and forward the reply back to the caller.
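
For readers unfamiliar with that approach, raw UDP forwarding can be sketched as below. This is an illustration under stated assumptions, not VNet's forwarding code: `forwardRaw` is a hypothetical helper, and `startFakeUpstream` is a stand-in nameserver on the loopback interface so the example runs without network access. Discovering the real upstream nameserver address is left out entirely.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// forwardRaw relays one raw DNS message to upstream over UDP and returns the
// raw reply unchanged. It never parses the message body.
func forwardRaw(query []byte, upstream string, timeout time.Duration) ([]byte, error) {
	conn, err := net.Dial("udp", upstream)
	if err != nil {
		return nil, err
	}
	defer conn.Close()
	if err := conn.SetDeadline(time.Now().Add(timeout)); err != nil {
		return nil, err
	}
	if _, err := conn.Write(query); err != nil {
		return nil, err
	}
	buf := make([]byte, 65535) // large enough for any UDP datagram
	n, err := conn.Read(buf)
	if err != nil {
		return nil, err
	}
	return buf[:n], nil
}

// startFakeUpstream is a hypothetical stand-in for a real nameserver: it
// echoes every datagram back with the QR (response) bit set in the header.
func startFakeUpstream() (addr string, stop func(), err error) {
	pc, err := net.ListenPacket("udp", "127.0.0.1:0")
	if err != nil {
		return "", nil, err
	}
	go func() {
		buf := make([]byte, 65535)
		for {
			n, from, err := pc.ReadFrom(buf)
			if err != nil {
				return // listener closed
			}
			if n > 2 {
				buf[2] |= 0x80 // QR is the top bit of the third header byte
			}
			pc.WriteTo(buf[:n], from)
		}
	}()
	return pc.LocalAddr().String(), func() { pc.Close() }, nil
}

func main() {
	addr, stop, err := startFakeUpstream()
	if err != nil {
		panic(err)
	}
	defer stop()

	// A bare 12-byte DNS header: ID 0x1234, RD flag set, all counts zero.
	query := []byte{0x12, 0x34, 0x01, 0x00, 0, 0, 0, 0, 0, 0, 0, 0}
	reply, err := forwardRaw(query, addr, time.Second)
	if err != nil {
		panic(err)
	}
	id := int(reply[0])<<8 | int(reply[1])
	fmt.Printf("reply ID %#x, QR bit set: %v\n", id, reply[2]&0x80 != 0)
}
```

Because the forwarder talks to the upstream address directly, the query never re-enters the local stub resolver, so it cannot loop back into VNet's own nameserver.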

@tangyatsu tangyatsu closed this Feb 13, 2026