
[vnet] linux support#63664

Merged
tangyatsu merged 26 commits into master from tangyatsu/vnet-linux
Mar 17, 2026
Conversation

@tangyatsu
Contributor

@tangyatsu tangyatsu commented Feb 9, 2026

builds on #55542.

D-Bus daemon

changelog: Added Linux support for VNet

D-Bus is a hybrid IPC/RPC mechanism used on Unix-like systems. Services register on the bus and expose methods to other processes. Authorization is handled by polkit, but D-Bus does not enforce it for you: the service itself must call the polkit API over D-Bus.
A good article describing D-Bus and polkit: https://u1f383.github.io/linux/2025/05/25/dbus-and-polkit-introduction.html

The VNet daemon is managed by systemd. The systemd unit runs tsh vnet-daemon. So that D-Bus can activate the unit when it isn't running, we add a D-Bus service file that points to this systemd unit.

The daemon registers on the system bus as org.teleport.vnet1 and exports interface org.teleport.vnet1.Daemon. It exposes two methods:

Start(addr, credPath)
Stop()

For authorization we use a single polkit action: org.teleport.vnet1.manage-daemon. There's no automatic mapping from polkit action IDs to D-Bus methods. The mapping is enforced in code, and we use the same action for both Start and Stop.
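
The method-to-action mapping described above could be sketched roughly as follows. This is only an illustration, not the code from this PR: the methodActions table, the authorizer type and the authorize function are hypothetical names, and the authorizer stands in for a real polkit authorization check made over D-Bus.

```go
package main

import (
	"errors"
	"fmt"
)

// manageDaemonAction is the polkit action ID named in the PR description.
const manageDaemonAction = "org.teleport.vnet1.manage-daemon"

// methodActions (hypothetical) maps exported D-Bus method names to the polkit
// action that guards them. polkit keeps no such table, so the service does.
var methodActions = map[string]string{
	"Start": manageDaemonAction,
	"Stop":  manageDaemonAction,
}

// authorizer is a hypothetical stand-in for a polkit authorization check.
type authorizer func(sender, actionID string) (bool, error)

// authorize looks up the action for a method and asks the authorizer about it.
// Methods missing from the table are rejected rather than allowed by default.
func authorize(check authorizer, sender, method string) error {
	action, ok := methodActions[method]
	if !ok {
		return fmt.Errorf("method %q has no polkit action", method)
	}
	allowed, err := check(sender, action)
	if err != nil {
		return fmt.Errorf("checking authorization: %w", err)
	}
	if !allowed {
		return errors.New("not authorized")
	}
	return nil
}

func main() {
	// A fake authorizer that only allows a single bus sender.
	check := func(sender, actionID string) (bool, error) {
		return sender == ":1.42" && actionID == manageDaemonAction, nil
	}
	fmt.Println(authorize(check, ":1.42", "Start"))
	fmt.Println(authorize(check, ":1.99", "Stop"))
}
```

Rejecting methods that have no entry in the table keeps a newly exported method from being callable without an explicit authorization decision.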

The daemon exits completely when Stop() is called or when the admin process exits on its own. It gets brought back on the next Start() call.

No D-Bus daemon

If the D-Bus service is not available:

  • If running as root, we fall back and launch the admin process via tsh vnet-admin-setup.
  • If not root, we return an error.
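
The fallback rule above can be sketched as a small decision function. This is a simplified illustration under assumptions: chooseLaunchMode and the launchMode values are hypothetical names, and detecting whether the D-Bus service is available is left to the caller.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// launchMode (hypothetical) describes how the admin process gets started.
type launchMode int

const (
	launchViaDBus       launchMode = iota // ask the D-Bus daemon to start it
	launchAdminDirectly                   // run the admin process ourselves
)

// chooseLaunchMode applies the rule from the description: prefer the D-Bus
// service, fall back to a direct launch only when running as root.
func chooseLaunchMode(dbusServiceAvailable bool, euid int) (launchMode, error) {
	if dbusServiceAvailable {
		return launchViaDBus, nil
	}
	if euid == 0 {
		return launchAdminDirectly, nil
	}
	return 0, errors.New("VNet D-Bus service is not available and the process is not running as root")
}

func main() {
	mode, err := chooseLaunchMode(false, os.Geteuid())
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("launch mode:", mode)
}
```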

Manual Test Plan

Test Environment

VNet on Linux relies on the presence of systemd, D-Bus and polkit. Most modern desktop Linux distros include them.

To make it work, you need to install the required polkit and D-Bus configuration files.

Polkit action:

sudo tee /usr/share/polkit-1/actions/org.teleport.vnet1.policy > /dev/null <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE policyconfig PUBLIC
  "-//freedesktop//DTD PolicyKit Policy Configuration 1.0//EN"
  "http://www.freedesktop.org/standards/PolicyKit/1/policyconfig.dtd">
<policyconfig>

  <action id="org.teleport.vnet1.manage-daemon">
    <description>Start Teleport VNet</description>
    <message>Authentication is required to start Teleport VNet</message>
    <defaults>
      <allow_any>no</allow_any>
      <allow_inactive>no</allow_inactive>
      <!-- Default behavior if no rule matches -->
      <allow_active>yes</allow_active>
    </defaults>
  </action>

</policyconfig>
EOF

D‑Bus config
By default, D-Bus does not allow any user (including root) to own the org.teleport.vnet1 name, so we add a policy that lets root own the name and lets any user send messages to it:

sudo tee /usr/share/dbus-1/system.d/org.teleport.vnet1.conf > /dev/null <<'EOF'
<!DOCTYPE busconfig PUBLIC "-//freedesktop//DTD D-Bus Bus Configuration 1.0//EN"
 "http://www.freedesktop.org/standards/dbus/1.0/busconfig.dtd">
<busconfig>
  <policy user="root">
    <allow own="org.teleport.vnet1"/>
  </policy>

  <policy context="default">
    <allow send_destination="org.teleport.vnet1"/>
  </policy>
</busconfig>
EOF

D‑Bus service file (activates systemd unit)

sudo tee /usr/share/dbus-1/system-services/org.teleport.vnet1.service > /dev/null <<'EOF'
[D-BUS Service]
Name=org.teleport.vnet1
SystemdService=teleport-vnet.service
User=root
Exec=/bin/false
EOF

Systemd unit

sudo tee /usr/lib/systemd/system/teleport-vnet.service > /dev/null <<'EOF'
[Unit]
Description=Teleport VNet D-Bus service
After=dbus.service
Requires=dbus.service

[Service]
Type=dbus
BusName=org.teleport.vnet1
ExecStart=/usr/bin/tsh vnet-daemon --debug
User=root
Group=root
EOF

I tested this on a cluster with:

  • PostgreSQL as a TCP app
  • dummy NGINX server as an HTTP app

I also set google.com as a custom DNS zone for VNet and checked access to workspace.google.com with VNet turned on.

It is convenient to inspect DNS behavior with resolvectl.

resolvectl status shows all active links.
resolvectl domain shows domains attached to each link, for example:

Link 37 (TeleportVNet): ~internal.example.com ~google.com ~teleport.tangyatsu.com ~teleport.dev

You can query and verify which link handled DNS:

resolvectl query workspace.google.com --legend=yes

Example output:

workspace.google.com: 165.xxx.xxx.xxx  -- link: TeleportVNet
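
When scripting this check, a line of resolvectl domain output can be parsed along these lines. This is a minimal sketch that assumes the line format shown in the example above; linkDomains and routesTo are hypothetical helpers. It only tells you whether a link has a matching domain, while systemd-resolved itself picks among all links.

```go
package main

import (
	"fmt"
	"strings"
)

// linkDomains parses one line of `resolvectl domain` output, for example
//
//	Link 37 (TeleportVNet): ~internal.example.com ~google.com
//
// and returns the link name and its domains with the "~" prefix stripped.
func linkDomains(line string) (link string, domains []string, ok bool) {
	open := strings.Index(line, "(")
	closeIdx := strings.Index(line, "):")
	if !strings.HasPrefix(line, "Link ") || open < 0 || closeIdx < open {
		return "", nil, false
	}
	link = line[open+1 : closeIdx]
	for _, d := range strings.Fields(line[closeIdx+2:]) {
		domains = append(domains, strings.TrimPrefix(d, "~"))
	}
	return link, domains, true
}

// routesTo reports whether name equals one of the domains or is a subdomain
// of one of them.
func routesTo(domains []string, name string) bool {
	for _, d := range domains {
		if name == d || strings.HasSuffix(name, "."+d) {
			return true
		}
	}
	return false
}

func main() {
	line := "Link 37 (TeleportVNet): ~internal.example.com ~google.com ~teleport.dev"
	link, domains, ok := linkDomains(line)
	fmt.Println(link, ok, routesTo(domains, "workspace.google.com"))
}
```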

Test Cases

  • can start VNet with tsh vnet
  • can connect to TCP app over VNet
  • can connect to SSH node over VNet
  • HTTP app access continues to work with VNet on and off
  • public URIs under a custom DNS zone are still reachable (with VNet on and off)
  • can view daemon logs with journalctl -u teleport-vnet.service
  • can start and stop VNet from Connect
  • can start VNet with sudo tsh vnet without presence of config files

@tangyatsu tangyatsu force-pushed the tangyatsu/vnet-linux branch 4 times, most recently from 2b9d1ba to 42df7a2 Compare February 10, 2026 18:33
@tangyatsu tangyatsu marked this pull request as ready for review February 10, 2026 19:01
@github-actions github-actions bot added size/lg tsh tsh - Teleport's command line tool for logging into nodes running Teleport. ui labels Feb 10, 2026
@tangyatsu tangyatsu requested review from ravicious and removed request for alexhemard and kimlisa February 10, 2026 19:02
@tangyatsu tangyatsu force-pushed the tangyatsu/vnet-linux branch 2 times, most recently from 7fec826 to 273b629 Compare February 13, 2026 18:43
Contributor

@nklaassen nklaassen left a comment

is there a plan for distributing the required polkit and systemd unit files?

Comment thread lib/vnet/escalate_linux.go Outdated
Comment thread lib/vnet/escalate_linux.go Outdated
Comment thread lib/vnet/dns/osnameservers_other.go Outdated
Comment thread lib/vnet/systemdresolved/dbus.go Outdated
Comment thread lib/vnet/dbus_client_linux.go
Comment thread lib/vnet/dbus_service_linux.go Outdated
Comment thread lib/vnet/dbus_service_linux.go Outdated
Comment thread lib/vnet/osconfig_linux.go Outdated
Comment thread lib/vnet/user_process_linux.go
@public-teleport-github-review-bot

@tangyatsu - this PR will require admin approval to merge due to its size. Consider breaking it up into a series of smaller changes.

@tangyatsu
Contributor Author

tangyatsu commented Mar 6, 2026

is there a plan for distributing the required polkit and systemd unit files?

Working on it in PR #64362

I think it would be better to ship these files via a package, but there is a problem: both the teleport and teleport-connect packages would need these files.
Edit: We decided that only the Teleport Connect package will install these files.

Contributor

@nklaassen nklaassen left a comment

The code looks good to me, but I would like to see a manual test plan in the PR description before approving. You have instructions for how to run it, but it would be nice to see some test steps you have executed. E.g.

  • can connect to TCP app over VNet
  • HTTP app access continues to work with VNet on or off
  • public URIs under a custom DNS zone are still reachable (try adding google.com as a custom DNS zone and make sure you can still connect to mail.google.com).
  • can start and stop VNet from Connect (if that works at this point)
  • can run tsh vnet
  • can view logs from the root/daemon process

Comment thread lib/vnet/diag/routeconflict_other.go Outdated
Comment thread lib/vnet/dns/osnameservers.go Outdated
Comment thread lib/vnet/polkit/constants.go Outdated
Comment thread lib/vnet/systemdresolved/types.go Outdated
Comment thread lib/vnet/admin_process_linux.go

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

// along with this program. If not, see <http://www.gnu.org/licenses/>.

P0 Badge Re-enable route conflict stubs for linux builds

This build-tag change excludes the only fallback definitions of (*NetInterfaces).interfaceApp and (*RouteConflictDiag).commands from linux, and there is no routeconflict_linux.go in lib/vnet/diag to provide replacements. That leaves routeconflict.go with unresolved method calls (n.interfaceApp / c.commands) when building linux targets (including via lib/vnet imports like opensshconfig.go), so the commit makes linux builds fail at compile time.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 706f21dda5

Comment thread lib/vnet/escalate_linux.go

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9a1753c571

Comment on lines +79 to +83
for _, cidrRange := range cfg.cidrRanges {
	if slices.Contains(state.configuredCidrRanges, cidrRange) {
		continue
	}
	log.InfoContext(ctx, "Setting an IPv4 route", "netmask", cidrRange)

P2 Badge Remove stale CIDR routes when config shrinks

platformConfigureOS only appends newly seen CIDRs to state.configuredCidrRanges and never removes routes that disappeared from cfg.cidrRanges, so long-lived sessions keep stale ip route entries after clusters/logins change. In practice, traffic for removed Teleport ranges will still be routed into the VNet TUN and can be blackholed (or misrouted) until the whole VNet process is restarted.
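
One way to address this kind of feedback is to compute both the routes to add and the routes to remove on every reconfiguration. The sketch below is only an illustration and not the fix applied in this PR: diffCIDRs is a hypothetical helper, and applying the result with the actual route commands is left out.

```go
package main

import (
	"fmt"
	"slices"
)

// diffCIDRs compares the ranges that already have routes (configured) with
// the ranges the current config asks for (desired).
func diffCIDRs(configured, desired []string) (toAdd, toRemove []string) {
	for _, c := range desired {
		if !slices.Contains(configured, c) {
			toAdd = append(toAdd, c)
		}
	}
	for _, c := range configured {
		if !slices.Contains(desired, c) {
			toRemove = append(toRemove, c)
		}
	}
	return toAdd, toRemove
}

func main() {
	configured := []string{"100.64.0.0/10", "10.1.0.0/16"}
	desired := []string{"100.64.0.0/10", "10.2.0.0/16"}
	toAdd, toRemove := diffCIDRs(configured, desired)
	fmt.Println("add:", toAdd, "remove:", toRemove)
}
```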

Member

@ravicious ravicious left a comment

Good job, let's ship it!

I tested a couple of edge cases, e.g. a missing D-Bus service when running through Connect, and the app behaves correctly, that is, it reports the error in the UI. I was also able to add a systemd override with TELEPORT_DEBUG=1 and get debug logs in the systemd service.

if cf.Debug {
	level = slog.LevelDebug
}
if _, err := utils.InitLogger(utils.LoggingForDaemon, level); err != nil {
Member

I was going to mention that this duplicates timestamps already logged by systemd, e.g.

Mar 13 18:00:12 ubuntu-linux-22-04-02-desktop tsh[14844]: 2026-03-13T18:00:12.346+01:00 INFO [VNET]      Running VNet admin process vnet/admin_process_linux.go:41

…but I quickly checked if we do this any different for the teleport binary and it does the same thing so I guess it can stay this way for now. 🫩

Comment thread lib/vnet/polkit/polkit.go Outdated
Comment thread lib/vnet/systemdresolved/dbus.go Outdated
select {
case err := <-done:
	// network stack exited cleanly within timeout
	return trace.Wrap(err, "running VNet admin process")
Collaborator

Are we intentionally returning an error for a clean exit?

Contributor Author

This code was previously only in the darwin version. My understanding is that there was a bug where the network stack could hang on shutdown (#58298), so "clean exit" here means it exited within the timeout window, regardless of whether it returned an error.
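
The "exited within the timeout window" pattern discussed in this thread can be sketched as follows. It is a simplified stand-in, not the code under review: waitForExit and errShutdownTimeout are hypothetical names, and the done channel is assumed to carry the result of the network stack.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errShutdownTimeout (hypothetical) marks a shutdown that did not finish in time.
var errShutdownTimeout = errors.New("network stack did not exit within the timeout")

// waitForExit returns whatever the network stack reported if it exits within
// the timeout. That value may be nil or a real failure; "clean" here only
// means the exit was not a hang.
func waitForExit(done <-chan error, timeout time.Duration) error {
	select {
	case err := <-done:
		return err
	case <-time.After(timeout):
		return errShutdownTimeout
	}
}

func main() {
	done := make(chan error, 1)
	done <- errors.New("tun device closed")
	fmt.Println(waitForExit(done, time.Second))

	hung := make(chan error) // never receives a value
	fmt.Println(waitForExit(hung, 10*time.Millisecond))
}
```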

Comment thread lib/vnet/admin_process_unix.go
Comment thread lib/vnet/dbus_service_linux.go Outdated
Comment thread lib/vnet/dbus_service_linux.go Outdated
	return dbus.MakeFailedError(trace.Wrap(err, "authorization failed"))
}

if d.closing.Load() {
Collaborator

What happens if d.closing is false when we read it but turns true just after that?

Contributor Author

I looked at it. In the worst case:
Close() runs after Start() has read d.closing but before Start() sets started=true.
Close() sees started==false, sends nil to done, and Wait() returns immediately.
Then Start() launches the startAdminProcess goroutine while the daemon is already shutting down.

I need to check both of these flags atomically. I didn't come up with a better idea than adding a mutex.
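
The mutex approach mentioned here could look roughly like this. It is a simplified sketch, not the daemon from this PR: the daemon type, its fields and the launch callback are hypothetical, and the done/Wait plumbing is left out.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// daemon (hypothetical) guards both flags with one mutex so that Start and
// Close can never interleave between reading closing and setting started.
type daemon struct {
	mu      sync.Mutex
	closing bool
	started bool
}

// Start refuses to launch once Close has been called or after a prior Start.
func (d *daemon) Start(launch func()) error {
	d.mu.Lock()
	defer d.mu.Unlock()
	if d.closing {
		return errors.New("daemon is shutting down")
	}
	if d.started {
		return errors.New("already started")
	}
	d.started = true
	go launch()
	return nil
}

// Close marks the daemon as closing and reports whether an admin process
// had been started, so the caller knows whether it has to wait for it.
func (d *daemon) Close() (wasStarted bool) {
	d.mu.Lock()
	defer d.mu.Unlock()
	d.closing = true
	return d.started
}

func main() {
	d := &daemon{}
	fmt.Println("was started:", d.Close())
	fmt.Println("start after close:", d.Start(func() {}))
}
```

Because both flags are read and written under the same lock, a Close() that wins the race makes the later Start() fail instead of launching a goroutine during shutdown.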

Comment thread lib/vnet/dbus_service_linux.go Outdated
	}
	log.InfoContext(stopCtx, "Successfully stopped systemd service")
	return nil
case <-ticker.C:
Collaborator

What exactly are we polling here?

Contributor Author

The VNet daemon lifecycle is managed by a systemd unit. If the daemon exits, the unit's ActiveState transitions to inactive or failed, so we periodically poll the unit state to detect that and exit the user process.
https://www.freedesktop.org/software/systemd/man/latest/systemd.html#Units
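
A polling loop of this shape might look like the sketch below. It is an illustration under assumptions: waitUntilStopped is a hypothetical name, getState stands in for reading the unit's ActiveState property over D-Bus, and a plain stop channel is used in place of a context.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// waitUntilStopped polls getState on every tick and returns once the unit
// reports "inactive" or "failed", or when stop is closed.
func waitUntilStopped(stop <-chan struct{}, interval time.Duration, getState func() (string, error)) (string, error) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-stop:
			return "", errors.New("stopped before the unit became inactive")
		case <-ticker.C:
			state, err := getState()
			if err != nil {
				return "", err
			}
			if state == "inactive" || state == "failed" {
				return state, nil
			}
		}
	}
}

func main() {
	// A fake state source that walks through a shutdown sequence.
	states := []string{"active", "deactivating", "inactive"}
	i := 0
	getState := func() (string, error) {
		s := states[i]
		if i < len(states)-1 {
			i++
		}
		return s, nil
	}
	state, err := waitUntilStopped(make(chan struct{}), 5*time.Millisecond, getState)
	fmt.Println(state, err)
}
```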

@tangyatsu tangyatsu requested a review from zmb3 March 13, 2026 23:35
Comment thread lib/vnet/dns/osnameservers_other.go Outdated
Comment thread lib/vnet/dbus_service_linux.go Outdated
@public-teleport-github-review-bot public-teleport-github-review-bot bot removed the request for review from cthach March 16, 2026 18:44
@tangyatsu tangyatsu added this pull request to the merge queue Mar 17, 2026
Merged via the queue into master with commit 8385461 Mar 17, 2026
45 checks passed
@tangyatsu tangyatsu deleted the tangyatsu/vnet-linux branch March 17, 2026 14:44
@tangyatsu tangyatsu mentioned this pull request Mar 17, 2026
8 tasks

Labels

backport/branch/v18 size/lg tsh tsh - Teleport's command line tool for logging into nodes running Teleport. ui

4 participants