Skip to content

Machine ID: Refactor tbot CLI#47130

Merged
strideynet merged 19 commits intomasterfrom
timothyb89/tbot-cli-overhaul
Oct 11, 2024
Merged

Machine ID: Refactor tbot CLI#47130
strideynet merged 19 commits intomasterfrom
timothyb89/tbot-cli-overhaul

Conversation

@timothyb89
Copy link
Copy Markdown
Contributor

@timothyb89 timothyb89 commented Oct 3, 2024

This significantly refactors the tbot CLI. In addition to a major overhaul of our CLI config handling, it also exposes a number of new subcommands for starting more than just an identity output via pure CLI.

New command examples

# old commands are now under the (default) `tbot start legacy`, so the `tctl bots add` example still works
tbot start --destination-dir=./tbot-user \                                                                                                                                                                    
  --token=<snip> \
  --proxy-server=example.com:443 \
  --join-method=token --data-dir=./tbot-data

# start in identity mode (roughly equivalent to legacy, but with new-style flags)
$ tbot start identity --destination=./tbot-user --storage=./tbot-data --proxy-server=example.com:443 --join-method=token

# most commands have common aliases to improve memorability
$ tbot start ssh --destination=./tbot-user --storage=./tbot-data --proxy-server=example.com:443 --join-method=token

# start in Kubernetes mode
$ tbot start kubernetes --destination ./tbot-user --storage=./tbot-data --proxy-server=example.com:443 --join-method=token --kubernetes-cluster kind-demo

# start a database tunnel
$ tbot start db-tunnel --listen tcp://0.0.0.0:1234 --storage=./tbot-data --proxy-server=example.com:443 --join-method=token --service postgres-demo --username demo --database demo

# start the SPIFFE SVID output
$ tbot start spiffe-x509-svid  --destination=./tbot-user --storage=./tbot-data --proxy-server=example.com:443 --join-method=token --svid-path /foo

These commands and flags all now have 1:1 parity with tbot configure. That means you can easily convert any of these example commands to a config file by replacing tbot start with tbot configure:

# try running a bot manually to test flags
$ tbot start spiffe-x509-svid --destination ./tbot-user --storage=./tbot-data --proxy-server=example.com:443 --join-method=token --svid-path=/foo

# once satisfied, easily convert to a config file
$ tbot configure spiffe-x509-svid --destination ./tbot-user --storage=./tbot-data --proxy-server=example.com:443 --join-method=token --svid-path=/foo > tbot.yaml

# start from the config file
$ tbot start -c tbot.yaml

The --destination and --storage flags now also universally accept URIs, so memory:// and kubernetes-secret:// should work as expected.

Goals

Some (re-)design goals per #44509 include:

  1. Exposing various tbot output types and services through pure CLI
  2. Cleanly cutting over to new terminology for all new subcommands (--storage instead of --data-dir, --destination that accepts URIs instead of --destination-dir, etc)
  3. Maintaining support for legacy syntax

While iterating, I also prioritized:

  1. Sharing as much code as possible between each subcommand, as well as the tbot start and tbot configure variants of each. An earlier iteration unrolled all of this, and bugs very rapidly resulted from drift between copy and pasted code blocks, not to mention the spectacular line of code count.
  2. Moving away from the effectively global namespace of the CLIConf struct. This struct has become unmanageable and led to conflicting usages and definitions of fields. Every command should make its flag requirements and fields explicit (aside from the few true globals)
  3. Ensuring our kingpin configuration did not enforce any new and unergonomic flag ordering constraints. kingpin flags don't have nice inheritance rules so defining common arguments on the tbot start level led to:
    • ugly positioning requirements: tbot start --proxy-server foo.com identity --destination ./bar
    • inability to properly exclude old/renamed arguments from new commands

To attempt to solve everything at once (very dangerous 😬) I've attempted to adapt tctl's TryRun() to allow for reusable commands that we can execute as either start or configure subcommands. Defining new "modern" style subcommands is quite lightweight as the shared set of parameters can be embedded easily, and the "start" and "configure" actions can be passed in at init time.

changelog: Introduced use-case specific commands to tbot to provide a better out-of-box experience.

Comment thread lib/tbot/config/cli.go Outdated
This significantly refactors the tbot CLI. In addition to a major
overhaul of our CLI config handling, it also exposes a number of new
subcommands for starting more than just an identity output via pure
CLI.
@timothyb89 timothyb89 force-pushed the timothyb89/tbot-cli-overhaul branch from 0e8ec74 to cc5abaf Compare October 3, 2024 03:46
Comment thread lib/tbot/config/config.go Outdated
Comment thread lib/tbot/config/cli.go Outdated
Comment thread lib/tbot/config/cli.go Outdated
This moves the new CLI handling code into `lib/tbot/cli`, splits
`cli.go` into several better organized files, and adds all missing
subcommands for major tbot functionality.
This introduces a dedicated global args struct and removes CLIConf.
As most other non-start (and configure) commands depended on that
effectively global namespace, this refactors all of them to use the
new style with properly namespaced command structs.

This also introduces a `genericExecutorHandler` helper to simplify
subcommands that run an arbitrary action without modifying the
config.

This also solves a number of longstanding CLI handling bugs resulting
from the shared arg namespace. A few args were moved to globals that
should always have been exposed, so now things like `--log-format`
will work consistently.

A few commands were left unconverted since they don't make use of
any of the global args or config loading machinery. They could be
converted easily but are technically simpler as written. We may
opt to convert them, or to do so later.

Some headway on testing, but more work is needed.
This adds the first batch of tests on the new CLI along with a bunch
of associated changes made while fixing these tests and others.

This also splits off 2 new flag helpers: LegacyDestinationDirArgs and
AuthProxyArgs. These are reusable embeddable structs to handle
legacy-style --destination-dir behavior, and modern --auth-server
and --proxy-server. `tbot init` and its tests had a subtle dependency
old-style `--destination-dir` handling so we can now reuse that.

There may be other subcommands that require something similar.
Comment thread lib/tbot/cli/db.go
Comment thread lib/tbot/cli/globals.go Outdated
@timothyb89
Copy link
Copy Markdown
Contributor Author

I think I'm ready to open this for review. I still want to spend tomorrow adding unit tests to verify each command's individual flag handling, but also want to start collecting feedback before I'm OoO this week.

Comment thread tool/tbot/main.go
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So nice seeing the hundreds of flags pulled out of here.

Comment thread tool/tbot/main.go Outdated
Copy link
Copy Markdown
Contributor

@strideynet strideynet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking really good with a separate package for the CLI concerns. This has definitely come together nicely.

I think at a later date, I'd be slightly curious about moving the actual handler code down into cli or tying it more to the command constructors themselves. But I think that's definitely out of scope for this PR and would realistically bring only minor improvements.

Approving this on the basis of some more tests :')

Comment thread lib/tbot/cli/start_spiffe_x509_svid.go Outdated
Comment thread lib/tbot/cli/start_spiffe_x509_svid.go Outdated
@strideynet
Copy link
Copy Markdown
Contributor

Let's make the changelog message a little more user facing as well. Perhaps:

changelog: Introduced use-case specific commands to tbot to provide a better out-of-box experience.

Also includes a few minor fixes discovered in testing - DiagAddr
was not handled properly in shared args, and removed an unused field.
@timothyb89 timothyb89 marked this pull request as ready for review October 9, 2024 02:07
@public-teleport-github-review-bot
Copy link
Copy Markdown

@timothyb89 - this PR will require admin approval to merge due to its size. Consider breaking it up into a series smaller changes.

Comment thread lib/tbot/cli/cli.go Outdated
Comment thread lib/tbot/cli/cli.go
Comment on lines +41 to +126
// CommandRunner defines a contract for `TryRun` that allows commands to
// either execute (possibly returning an error), or pass execution to the next
// command candidate.
type CommandRunner interface {
TryRun(cmd string) (match bool, err error)
}

// MutatorAction is an action that is called by a config mutator-style command.
type MutatorAction func(mutator ConfigMutator) error

// genericMutatorHandler supplies a generic `TryRun` that works for all commands
// that - broadly - load config, mutate that config, and run an action. It's
// meant to be embedded within a command struct to provide the `TryRun`
// implementation.
type genericMutatorHandler struct {
cmd *kingpin.CmdClause
mutator ConfigMutator
action MutatorAction
}

// newGenericMutatorHandler creates a new generic genericMutatorHandler that
// provides a generic `TryRun` implementation.
func newGenericMutatorHandler(cmd *kingpin.CmdClause, mutator ConfigMutator, action MutatorAction) *genericMutatorHandler {
return &genericMutatorHandler{
cmd: cmd,
mutator: mutator,
action: action,
}
}

func (g *genericMutatorHandler) TryRun(cmd string) (match bool, err error) {
switch cmd {
case g.cmd.FullCommand():
err = g.action(g.mutator)
default:
return false, nil
}

return true, trace.Wrap(err)
}

// ConfigMutator is an interface that can apply changes to a BotConfig.
type ConfigMutator interface {
ApplyConfig(cfg *config.BotConfig, l *slog.Logger) error
}

// genericExecutorHandler is a helper that can be embedded to provide a simpler
// TryRun implementation that just runs a function. These functions can be
// passed in while building the CLI to more easily glue behaviors together, or
// specified directly.
type genericExecutorHandler[T any] struct {
cmd *kingpin.CmdClause
args *T

// actions is a list of functions to run when `TryRun` matches the `cmd`.
// Generally at most one action should be exposed to the top level glue in
// main, but commands might want to inject some handler logic for e.g.
// flag migrations.
actions []func(*T) error
}

// newGenericExecutorHandler creates a genericExecutorHandler with the given
// command and action to execute when that command is matched.
func newGenericExecutorHandler[T any](cmd *kingpin.CmdClause, args *T, actions ...func(*T) error) *genericExecutorHandler[T] {
return &genericExecutorHandler[T]{
cmd: cmd,
args: args,
actions: actions,
}
}

func (e *genericExecutorHandler[T]) TryRun(cmd string) (match bool, err error) {
switch cmd {
case e.cmd.FullCommand():
for _, action := range e.actions {
err = action(e.args)
if err != nil {
break
}
}
default:
return false, nil
}

return true, trace.Wrap(err)
}
Copy link
Copy Markdown
Contributor

@rosstimothy rosstimothy Oct 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a more polished version of what tctl uses. Do you see any benefit/problems with moving this into a more common cli package that all of our tools could one day use?

The only tbot specific things I've seen so far is the ConfigMutator, but that could very well be made generic:

type ConfigMutator[Config any] interface {
	ApplyConfig(config Config, l *slog.Logger) error
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TimB's on leave until next week - I figure that since this PR is already pretty large - the overall idea of extracting this to something reusable makes sense but we perhaps ought to do that in another PR. I think there's a few finishing touches/refinements we'll want to add if other folks are going to use this.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't so much asking for something in this PR to change, just curious if you all thought it would be a good idea in general.

@public-teleport-github-review-bot public-teleport-github-review-bot Bot removed the request for review from greedy52 October 9, 2024 20:22
@strideynet strideynet added this pull request to the merge queue Oct 11, 2024
Merged via the queue into master with commit 83f1003 Oct 11, 2024
@strideynet strideynet deleted the timothyb89/tbot-cli-overhaul branch October 11, 2024 08:40
mvbrock pushed a commit that referenced this pull request Oct 16, 2024
* Machine ID: Refactor `tbot` CLI

This significantly refactors the tbot CLI. In addition to a major
overhaul of our CLI config handling, it also exposes a number of new
subcommands for starting more than just an identity output via pure
CLI.

* Move new CLI handling into lib/tbot/cli; add all new subcommands

This moves the new CLI handling code into `lib/tbot/cli`, splits
`cli.go` into several better organized files, and adds all missing
subcommands for major tbot functionality.

* Add sane global handling, convert most other commands to new style

This introduces a dedicated global args struct and removes CLIConf.
As most other non-start (and configure) commands depended on that
effectively global namespace, this refactors all of them to use the
new style with properly namespaced command structs.

This also introduces a `genericExecutorHandler` helper to simplify
subcommands that run an arbitrary action without modifying the
config.

This also solves a number of longstanding CLI handling bugs resulting
from the shared arg namespace. A few args were moved to globals that
should always have been exposed, so now things like `--log-format`
will work consistently.

A few commands were left unconverted since they don't make use of
any of the global args or config loading machinery. They could be
converted easily but are technically simpler as written. We may
opt to convert them, or to do so later.

Some headway on testing, but more work is needed.

* First batch of tests

This adds the first batch of tests on the new CLI along with a bunch
of associated changes made while fixing these tests and others.

This also splits off 2 new flag helpers: LegacyDestinationDirArgs and
AuthProxyArgs. These are reusable embeddable structs to handle
legacy-style --destination-dir behavior, and modern --auth-server
and --proxy-server. `tbot init` and its tests had a subtle dependency
old-style `--destination-dir` handling so we can now reuse that.

There may be other subcommands that require something similar.

* Remove outdated TODO

* Restore TestConfigCLIOnlySample

* Fix lints

* Fix missing embedded flag init

* Code review feedback

* Add docstrings

* Add unit tests for all "start" commands and globals

Also includes a few minor fixes discovered in testing - DiagAddr
was not handled properly in shared args, and removed an unused field.

* Fix imports

* Add helper for repetitive test

* Fix command name

* Adjust behaviour of `tbot kube credentials` command

* Add tests for remaining commands

* Appease linter

* Add godocs

---------

Co-authored-by: Noah Stride <noah.stride@goteleport.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants