Gracefully fail when cannot do client tools updates #57142

Merged
hugoShaka merged 2 commits into master from hugo/fix-tctl-in-ro-filesystem
Jul 24, 2025

@hugoShaka hugoShaka commented Jul 24, 2025

This PR fixes a bug where client tools updates fail to create or read ~/.tsh and block the execution of tctl commands in container images, e.g.:

kubectl exec -i deploy/teleport-auth -n cloud-gravitational-io-hugo-test-mu-18 -- tctl status
ERROR: mkdir /.tsh: read-only file system

command terminated with exit code 1

The PR makes 3 changes:

  • Gracefully handle the case where we cannot create the home directory or check which version to use. We now fall back to the local binary, log a warning, and disable client tools updates.
  • Handle the case where tools updates are explicitly disabled as early as possible. This prevents us from trying to read the user profile and logging errors.
  • Set TELEPORT_TOOLS_VERSION=off in our Docker container to prevent it from even attempting to update client tools.

Changelog: Fix a bug causing tctl/tsh to fail on read-only file systems.
Changelog: The teleport-distroless container image now disables client tools updates by default (when using tsh/tctl, you will always use the version from the image). You can re-enable them by unsetting the TELEPORT_TOOLS_VERSION environment variable.

@hugoShaka hugoShaka requested review from sclevine and vapopov July 24, 2025 16:08
@github-actions github-actions bot requested review from fheinecke and smallinsky July 24, 2025 16:09
@hugoShaka hugoShaka changed the title from "Gracefully fail when cannot do client tool update" to "Gracefully fail when cannot do client tools updates" Jul 24, 2025
// the installation is recorded in the configuration file, and the tool is re-executed with the updated version.
func CheckAndUpdateLocal(ctx context.Context, currentProfileName string, reExecArgs []string) error {
// If client tools updates are explicitly disabled, we want to catch this as soon as possible
// so we don't try to read te user home directory, fail, and log warnings.

Suggested change
- // so we don't try to read te user home directory, fail, and log warnings.
+ // so we don't try to read the user home directory, fail, and log warnings.

auto-merge was automatically disabled July 24, 2025 17:01

Pull Request is not mergeable

Merged via the queue into master with commit e690400 Jul 24, 2025
50 checks passed
@hugoShaka hugoShaka deleted the hugo/fix-tctl-in-ro-filesystem branch July 24, 2025 17:17
@backport-bot-workflows

@hugoShaka See the table below for backport results.

| Branch | Result |
|---|---|
| branch/v17 | Create PR |
| branch/v18 | Create PR |

vapopov pushed a commit that referenced this pull request Jul 24, 2025
* Gracefully fail when cannot do client tool update

* Gracefully fail when cannot check the version
github-merge-queue bot pushed a commit that referenced this pull request Aug 7, 2025
* Client-tools managed updates version caching (#54563)

* Add profile integration to disable update and re-execution for specific cluster

* Complete integration for the tctl and tsh

* Add commands for tsh

* Fix linter warnings

* Add config file with version and disabling status

* Move check out from helper

* Fixed re-execution ignore if versions is identical

* Move logic out from client

* Remove helper package and profile integration

* Fix argument parsing by filtering

* Use same Darwin platform approach of package extraction for Linux
Add client tools cleanup for V1 directories

* Fix packaging unit test

* Add cleanup for last recently used tools

* Add migration from v1 for better support
Show error log message about failed update/re-execution instead of failing command execution in case if updated binary was broken, modified or not able to validate signature

* Add ignore the version check fail, add more debug information

* Check update for commands `tsh ssh`, `tsh proxy ssh`
Fixed creating `.tsh` subdirectory when TELEPORT_HOME is set
Fix `tsh --proxy` flag parsing

* Wraps client init function to check client tools managed update only when it requested for `tsh ssh` and `tsh proxy ssh`

* Move filesystem lock to configuration library
Configuration modification protected by lock, other process must wait until it is released

* Rename command to `tsh update`, `tsh update --clear`

* Add test for argument filtering

* Update RFD
Make max tools installed to be configurable and set to 3 by default

* Replace "automatic updates" to "managed updates"

* Updated comments to reflect the latest changes

* Fix migration for older versions with two packages

* CR changes

* Prevent failing tools execution if configuration file is corrupted

* Remove lock file as part of cleanup command

* Added context to arguments

* Use a separate Kingpin application for tctl, as is already done for tsh. Double parsing may cause issues since it is not stateless.

* CTMU no longer uses a static path, any re-execution from the tools directory must disable further re-execution

* Gracefully fail when cannot do client tools updates (#57142)

* Gracefully fail when cannot do client tool update

* Gracefully fail when cannot check the version

* Fix printing empty usage and terminate CLI for parsing global flags (#57401)

* Fix printing empty usage and terminate CLI for parsing global flags

* Add test with check of both `--help` flag and `help` command that usage print is not empty and both identical.
Add godoc clarification

* Disable managed update check for version help command test

---------

Co-authored-by: Hugo Shaka <hugo.hervieux@goteleport.com>