Skip to content

Fix Teleport re-execs on Managed Updates installations of Teleport#54172

Open
sclevine wants to merge 2 commits intomasterfrom
sclevine/exec-fix
Open

Fix Teleport re-execs on Managed Updates installations of Teleport#54172
sclevine wants to merge 2 commits intomasterfrom
sclevine/exec-fix

Conversation

@sclevine
Copy link
Copy Markdown
Member

@sclevine sclevine commented Apr 21, 2025

As part of #54145, we discovered that when Teleport re-execs itself in various scenarios, it may fail to re-exec if the running version has not been restarted or reloaded recently. This is because os.Executable() resolves symlinks.

For example:

Before update:
/opt/teleport/…/v17.4.1/bin/teleport <- /usr/local/bin
/opt/teleport/…/v17.4.0/bin/teleport
os.Executable will return: /opt/teleport/…/v17.4.1/bin/teleport

After an update without a reload:
/opt/teleport/…/v17.4.2/bin/teleport <- /usr/local/bin
/opt/teleport/…/v17.4.1/bin/teleport
os.Executable will return: /opt/teleport/…/v17.4.1/bin/teleport (outdated)

After another update without a reload:
/opt/teleport/…/v17.4.3/bin/teleport <- /usr/local/bin
/opt/teleport/…/v17.4.2/bin/teleport
os.Executable will return: /opt/teleport/…/v17.4.1/bin/teleport (broken)

Various other failure cases are possible if Managed Updates is enabled/disabled while Teleport is running.
Soft reloads are not impacted, as they exec teleport off of PATH.

This PR fixes various problematic uses of os.Executable by switching to autoupdate.StableExecutable.
In a future PR, I will introduce a linter rule to prevent this from happening.

The code base contains many other valid usages of os.Executable. The only other suspect usage I found was:
https://github.com/gravitational/teleport/blob/master/lib/auditd/auditd_linux.go#L208
However, I believe auditd should receive the path to the running binary.

Note that the current behavior has a bug unrelated to Managed Updates: some uses of os.Executable should be execing /proc/self/exe directly, to ensure they exec the same binary running in memory, and not an updated version that could be incompatible. This should be addressed separately, to avoid too many changes in one release, and to expedite the above fix.


changelog: teleport-update: stabilize binary paths on teleport re-exec


The teleport-update binary is used to enable, disable, and trigger automatic Teleport agent updates. The new Managed Updates system manages a local installation of the cluster-specified version of Teleport stored in /opt/teleport.

RFD: #47126
Goal (internal): https://github.com/gravitational/cloud/issues/11856

@sclevine
Copy link
Copy Markdown
Member Author

Note that this PR is currently based on #54145 to expedite reviews. After #54145 merges, I will point this at master.

@sclevine
Copy link
Copy Markdown
Member Author

sclevine commented Apr 21, 2025

Note: this issue may be less severe than originally triaged. Soft reloads appear to use the latest binary. Unfortunately, that means that soft reload depends on teleport being in PATH, and argv[0] being set correctly.

Copy link
Copy Markdown
Contributor

@hugoShaka hugoShaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add TODOs on the StableExecutable()calls that should be replaced by /prox/self/exe in the next version?

Base automatically changed from sclevine/mu-tbot-path-fix to master April 21, 2025 22:32
@sclevine sclevine force-pushed the sclevine/exec-fix branch from a3f7059 to 9458d18 Compare April 21, 2025 23:03
@sclevine
Copy link
Copy Markdown
Member Author

sclevine commented Apr 21, 2025

Note that @hugoShaka discovered that all of these cases are already fixed by calls to reexecCommandOSTweaks, which is already execing /proc/self/exe directly for the reasons suggested above. We should still merge this eventually, e.g., to fix the issues on other OSs, but not important in the near-term.

Copy link
Copy Markdown
Contributor

@espadolini espadolini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of those except the scp one are noops on linux, since we ultimately call reexecCommandOSTweaks on all of them. Even if this is mainly for darwin, wouldn't we actually rather reexecute the exact version of the binary rather than some potentially newer version?

@hugoShaka
Copy link
Copy Markdown
Contributor

The issue is that os.Executable() might return a non-existent path on any teleport-update managed installation. If we convert the call into /proc/self/exe later, that's perfect, but if we don't, we should point to a valid binary.

@sclevine
Copy link
Copy Markdown
Member Author

except the scp one

That one is covered also -- the Cmd field is set with your tweaks later.

wouldn't we actually rather reexecute the exact version of the binary rather than some potentially newer version

In all of these cases, yes. I opened this as a temporary patch to fix uses of os.Executable that can fail because the path is no longer present. Since your tweaks address this in all of the cases we care about right now, I'm not strongly opinionated about whether we merge it.

That said, we will likely need a solution for macOS eventually. One option could be to open an fd to _NSGetExecutablePath() right at start (and somehow detect/avoid the race condition), and use that in place of /proc/self/exe.

We could wait until we need macOS support before deciding which path to take, or we could merge this sooner to avoid the worst-case scenario moving forward.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants