MWI: Add joining URIs for tbot #56267

Merged
timothyb89 merged 9 commits into master from timothyb89/tbot-joining-uris
Jul 11, 2025

Contributor

@timothyb89 timothyb89 commented Jul 1, 2025

This adds support for joining URIs to tbot. Joining URIs are intended to condense tbot's growing list of required server-side config options or CLI parameters into a single string that can be provided to the tbot client.

For example, consider these two equivalent CLI commands:

$ tbot start identity \
    --proxy-server example.teleport.sh:443 \
    --join-method bound_keypair \
    --token my-token \
    --registration-secret abc123 \
    --storage ./tbot-data \
    --destination ./tbot-user

$ tbot start identity \
    --join-uri tbot+proxy+bound-keypair://my-token:abc123@example.teleport.sh:443 \
    --storage ./tbot-data \
    --destination ./tbot-user

As shown, all parameters necessary for bots to actually connect to and authenticate with the remote Teleport instance are included in a single parameter. This parameter can be generated by existing tooling, like the example command printed via tctl bots add, or the web UI.
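To illustrate how the single string maps back onto the individual flags, here is a minimal sketch in Go that splits the example URI using only the standard library. It is not tbot's actual parser: the `JoinURI` type, its field names, and `ParseJoinURI` are hypothetical, and the assumed layout `tbot+<address kind>+<join method>://<token>[:<secret>]@<host:port>` is inferred from the single example above.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// JoinURI holds the parameters recovered from a joining URI.
// Hypothetical type for illustration; not tbot's real data structure.
type JoinURI struct {
	AddrKind   string // "proxy" in the example URI
	JoinMethod string // e.g. "bound_keypair"
	Token      string // token name (URI username)
	Secret     string // join-method parameter (URI password), may be empty
	Address    string // host:port of the Teleport instance
}

// ParseJoinURI splits a URI of the assumed form
// tbot+<address kind>+<join method>://<token>[:<secret>]@<host:port>.
func ParseJoinURI(raw string) (*JoinURI, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return nil, fmt.Errorf("parsing joining URI: %w", err)
	}

	// The scheme carries three "+"-separated parts.
	parts := strings.Split(u.Scheme, "+")
	if len(parts) != 3 || parts[0] != "tbot" {
		return nil, fmt.Errorf("unsupported joining URI scheme %q", u.Scheme)
	}
	if u.User == nil || u.User.Username() == "" {
		return nil, fmt.Errorf("joining URI %q has no token name", raw)
	}
	if u.Host == "" {
		return nil, fmt.Errorf("joining URI %q has no address", raw)
	}

	secret, _ := u.User.Password()
	return &JoinURI{
		AddrKind: parts[1],
		// The URI spells the method with a dash; the CLI flag uses an underscore.
		JoinMethod: strings.ReplaceAll(parts[2], "-", "_"),
		Token:      u.User.Username(),
		Secret:     secret,
		Address:    u.Host,
	}, nil
}

func main() {
	j, err := ParseJoinURI("tbot+proxy+bound-keypair://my-token:abc123@example.teleport.sh:443")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", *j)
}
```

Each recovered field lines up with one flag from the longer command: `--proxy-server`, `--join-method`, `--token`, and `--registration-secret`.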

End users will only need to paste a single "token", provide their own client-side parameters (if any), and run. Similarly, we now have a new minimally viable YAML config:

version: v2
join_uri: tbot+proxy+bound-keypair://my-token:abc123@example.teleport.sh:443
storage:
  type: directory
  path: ./tbot-data
services:
  - type: identity
    destination:
      type: directory
      path: ./tbot-user

This implementation is designed to be purely additive and should not interfere with existing config files or CLI invocations. Parsed URI parameters are merged on top of the traditional config fields during the bot's pre-run check, and an error is raised if any field conflicts.
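The merge-with-conflict-check behavior can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `ConnConfig` and `ApplyJoinURI` are hypothetical names, and the sketch assumes that a field counts as conflicting whenever it is set both in the traditional config and in the URI.

```go
package main

import "fmt"

// ConnConfig is a hypothetical subset of the connection settings that a
// joining URI can provide.
type ConnConfig struct {
	ProxyServer        string
	JoinMethod         string
	Token              string
	RegistrationSecret string
}

// ApplyJoinURI merges values parsed from a joining URI on top of cfg.
// Assumption: any field set in both places is treated as a conflict.
// cfg is left untouched when an error is returned.
func ApplyJoinURI(cfg *ConnConfig, fromURI ConnConfig) error {
	fields := []struct {
		name string
		dst  *string
		src  string
	}{
		{"proxy_server", &cfg.ProxyServer, fromURI.ProxyServer},
		{"join_method", &cfg.JoinMethod, fromURI.JoinMethod},
		{"token", &cfg.Token, fromURI.Token},
		{"registration_secret", &cfg.RegistrationSecret, fromURI.RegistrationSecret},
	}

	// First pass: detect conflicts before changing anything.
	for _, f := range fields {
		if f.src != "" && *f.dst != "" {
			return fmt.Errorf("field %q is set both in the config and in the joining URI", f.name)
		}
	}
	// Second pass: copy the URI-provided values.
	for _, f := range fields {
		if f.src != "" {
			*f.dst = f.src
		}
	}
	return nil
}

func main() {
	fromURI := ConnConfig{
		ProxyServer:        "example.teleport.sh:443",
		JoinMethod:         "bound_keypair",
		Token:              "my-token",
		RegistrationSecret: "abc123",
	}

	cfg := ConnConfig{}
	fmt.Println(ApplyJoinURI(&cfg, fromURI), cfg.Token)

	conflicting := ConnConfig{Token: "other-token"}
	fmt.Println(ApplyJoinURI(&conflicting, fromURI))
}
```

Checking every field before writing any of them keeps a failed merge from leaving the config half-updated.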

changelog: Machine and Workload ID: Add support for new bot joining URIs to simplify bot configuration and onboarding

RFD: #52546

Comment thread lib/tbot/config/config.go Outdated
Comment thread lib/tbot/cli/start_shared.go Outdated
Comment thread lib/tbot/cli/start_shared.go Outdated
@timothyb89 timothyb89 marked this pull request as ready for review July 2, 2025 02:11
@github-actions github-actions bot requested review from boxofrad and strideynet July 2, 2025 02:11
Contributor

@boxofrad boxofrad left a comment

Awesome! So much easier to use 👏🏻

@timothyb89 timothyb89 added this pull request to the merge queue Jul 11, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jul 11, 2025
@timothyb89 timothyb89 added this pull request to the merge queue Jul 11, 2025
github-merge-queue bot pushed a commit that referenced this pull request Jul 11, 2025
* MWI: Add joining URIs for tbot

This adds support for joining URIs to tbot. Joining URIs are intended
to condense tbot's growing list of required server-side config options
or CLI parameters into a single string that can be provided to the
`tbot` client.

For example, consider these two equivalent CLI commands:

```
$ tbot start identity \
    --proxy-server example.teleport.sh:443 \
    --join-method bound_keypair \
    --token my-token \
    --registration-secret abc123 \
    --storage ./tbot-data \
    --destination ./tbot-user

$ tbot start identity \
    tbot+proxy+bound-keypair://my-token:abc123@example.teleport.sh:443 \
    --storage ./tbot-data \
    --destination ./tbot-user
```

As shown, all parameters necessary for bots to actually connect to
and authenticate with the remote Teleport instance are included in a
single parameter. This parameter can be generated by existing tooling,
like the example command printed via `tctl bots add`, or the web UI.

End users will only need to paste a single "token", provide their own
client-side parameters (if any), and run. Similarly, we now have a new
minimally viable YAML config:

```yaml
version: v2
uri: tbot+proxy+bound-keypair://my-token:abc123@example.teleport.sh:443
storage:
  type: directory
  path: ./tbot-data
services:
  - type: identity
    destination:
      type: directory
      path: ./tbot-user
```

This implementation is designed to be only additive, and should not
interfere with existing config files or CLI strings. Parsed URI
parameters are merged on top of the traditional config fields during
the bot's pre-run check, and raise an error if any field conflicts.

RFD: #52546

* Fix lints

* Set `omitempty` flag on the URI field

This excludes the URI field when empty, to avoid polluting generated
config files when not using URIs - which remains fully supported - and
to clear test failures since a large number of golden tests would
otherwise need to be regenerated.

* Add additional tests for joining URI config merging

* Add additional integration-style test for joining URIs

* Fix lint

* Consistently rename field to JoinURI and convert from arg to flag

* Remove interspersed flag as arg has been removed.
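
The effect of the `omitempty` commit listed above can be shown with a small Go sketch. tbot's config is YAML, so this example is only an analogy: it uses the standard library's `encoding/json`, whose `omitempty` tag option behaves the same way for empty strings, and the `BotConfig` struct is a hypothetical two-field stand-in.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// BotConfig is a hypothetical stand-in for the bot config struct.
type BotConfig struct {
	Version string `json:"version"`
	// omitempty drops the key entirely when the string is empty.
	JoinURI string `json:"join_uri,omitempty"`
}

// Render serializes the config; marshaling this struct cannot fail,
// so an empty string is returned only defensively.
func Render(c BotConfig) string {
	out, err := json.Marshal(c)
	if err != nil {
		return ""
	}
	return string(out)
}

func main() {
	// Without a joining URI, the rendered config has no join_uri key.
	fmt.Println(Render(BotConfig{Version: "v2"}))

	// With a joining URI, the key is emitted as usual.
	fmt.Println(Render(BotConfig{
		Version: "v2",
		JoinURI: "tbot+proxy+bound-keypair://my-token:abc123@example.teleport.sh:443",
	}))
}
```

Because the key disappears when unset, generated configs and golden test files that never use a joining URI stay byte-for-byte unchanged.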
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jul 11, 2025
@timothyb89 timothyb89 force-pushed the timothyb89/tbot-joining-uris branch from 5210c6d to e8738f3 Compare July 11, 2025 02:11
@timothyb89 timothyb89 enabled auto-merge July 11, 2025 02:12
@timothyb89 timothyb89 added this pull request to the merge queue Jul 11, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jul 11, 2025
@timothyb89 timothyb89 added this pull request to the merge queue Jul 11, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jul 11, 2025
@timothyb89 timothyb89 added this pull request to the merge queue Jul 11, 2025
Merged via the queue into master with commit 717bd22 Jul 11, 2025
42 of 43 checks passed
@timothyb89 timothyb89 deleted the timothyb89/tbot-joining-uris branch July 11, 2025 06:36
timothyb89 added a commit that referenced this pull request Jul 15, 2025
* MWI: Add joining URIs for tbot

* Fix lints

* Set `omitempty` flag on the URI field

This excludes the URI field when empty, to avoid polluting generated
config files when not using URIs - which remains fully supported - and
to clear test failures since a large number of golden tests would
otherwise need to be regenerated.

* Add additional tests for joining URI config merging

* Add additional integration-style test for joining URIs

* Fix lint

* Consistently rename field to JoinURI and convert from arg to flag

* Remove interspersed flag as arg has been removed.

* Fix broken tests after rebase
github-merge-queue bot pushed a commit that referenced this pull request Jul 24, 2025
* MWI: Bound Keypair Rotation (#55240)

* MWI: Bound Keypair Joining: Keypair rotation

This adds keypair rotation for bound keypair joining. When a rotation
flag is set in the token spec, joining clients will be required to
generate a new keypair and complete an additional joining challenge
against the new keypair.

The flag is a timestamp token to allow for some level of idempotency;
to make setting this flag easier, a new `tctl` command is included:
`tctl bound-keypair request-rotation [token]`. This sets the flag
to the current timestamp, and joining clients will be required to
perform a rotation on their next authentication attempt.

Closes #55084

* Properly initialize the tctl command

* Refactor ClientState to allow storing intermediate state during rotation

* Fix invalid comparison and mutation logic

* Log signature suite and use cryptosuites helper

* Remove outdated TODO

* Frontload MFA check to avoid prompting twice

* Fix tctl command logging

* Fix incomplete docstring

* Fix imports

* Fix typo in log message

* Add tests for server-side rotation

Adjusts the test harness a bit and adds a batch of test cases for
keypair rotation.

Also fixes a lint error.

* Add additional test case for reused keys

* Add ClientState unit test

* Remove unnecessary log

* Fix test lints

* Fix reference to wrong key field

Now that the key can change, fix a dangling reference to the initial
key field. Also s/marshalled/marshaled

* Wrap KeyHistoryEntry in a containing struct

This should allow for some future extension if needed.

* MWI: Bound Keypair - Registration Secrets (#55380)

* MWI: Bound Keypair - Registration Secrets

This adds support for initial joining via registration secrets. These
one-time-use secrets emulate traditional token joining and allow
clients to perform their initial join.

With this, no options are required for bound keypair-type tokens.
While admins can specify a joining secret if they wish, if none is
provided, one will be generated on the server and can be found in
`status.bound_keypair.registration_secret` on the token resource.

When joining, this secret can be shared with clients in addition to
the (no longer sensitive) token name. This secret is verified and
a keypair rotation is requested, prompting the client to generate a
new keypair, provide the public key to the server, and complete a
joining challenge. It then joins the cluster as usual.

* Remove unnecessary token validation checks

* Rename tbot flag to --registration-secret

* Fix reference to renamed flag

* Various fixes, mostly more unwanted checks

* Add test cases for registration secrets

* Fix broken test

Onboarding config is no longer required, so fix the now-broken test

* Allow empty .spec.bound_keypair field for bound keypair tokens

This allows .spec.bound_keypair to be empty or entirely unset,
since we can build defaults at creation time.

* Add test for secret expiry enforcement

* Handle nonexistent client state when using a registration secret

* Fix test lints

* Hide exact registration secret rejection reason from client

Registration secret errors now return a single error message to the
client and log a more specific message on the server.

* MWI: Enforce generation counter for bound keypair joining (#55543)

* MWI: Enforce generation counter for bound keypair joining

This enables generation counter enforcement for bound keypair joining,
and adds a new function, `shouldEnforceGenerationCounter`, to make
enabling it for other join methods trivial.

Bound keypair joining introduces a similar mechanism for use between
its own recovery attempts but does rely on the standard generation
counter for its renewal-style certificates, so every join attempt is
subject to a generation check. This wasn't enabled in the original set
of bound keypair PRs so it's enabled here.

RFD: #52546

* Add tests for generation counter enforcement, fix error handling bug

This adds a test case for traditional generation counter enforcement
with bound keypair joining, and fixes an error handling bug around
certificate generation. This bug was mostly harmless before and
would've just returned nil certs at worst, but is now meaningfully
fallible.

* Fix broken test

* Fix lint

* Remove references to registration secret in test for rebase onto master

* Empty commit for CI
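
As a rough illustration of the generation check described in #55543 above, here is a minimal Go sketch. It is not the code from that PR: `shouldEnforce` is a hypothetical stand-in for `shouldEnforceGenerationCounter` (whose real signature is not shown here), the join method strings and the increment-by-one rule are assumptions, and real enforcement operates on certificates rather than bare integers.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrGenerationMismatch signals that the client presented a stale or
// unexpected generation counter.
var ErrGenerationMismatch = errors.New("generation counter mismatch")

// shouldEnforce is a hypothetical stand-in for the helper named in the
// commit message. Assumption: only these two join methods are enforced.
func shouldEnforce(joinMethod string) bool {
	switch joinMethod {
	case "token", "bound_keypair":
		return true
	default:
		return false
	}
}

// NextGeneration compares the counter presented by the client with the
// value stored on the server and returns the value to store next.
// For join methods without enforcement the stored value is returned as-is.
func NextGeneration(joinMethod string, stored, presented uint64) (uint64, error) {
	if !shouldEnforce(joinMethod) {
		return stored, nil
	}
	if presented != stored {
		return stored, fmt.Errorf("%w: presented %d, stored %d",
			ErrGenerationMismatch, presented, stored)
	}
	return stored + 1, nil
}

func main() {
	// A client presenting the current counter is advanced to the next one.
	fmt.Println(NextGeneration("bound_keypair", 3, 3))

	// A client presenting an old counter (e.g. from copied certs) is rejected.
	fmt.Println(NextGeneration("bound_keypair", 4, 3))
}
```

Routing the decision through one predicate is what makes extending enforcement to another join method a one-line change in this sketch.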

* MWI: Add audit events for bound keypair joining (#55701)

* MWI: Add audit events for bound keypair joining

This adds 3 new audit events for bound keypair joining:
- `join_token.bound_keypair.recovery` - emitted when a join triggers
  a recovery (first join, or join with expired certs)
- `join_token.bound_keypair.rotation` - emitted when a keypair
  rotation takes place
- `join_token.bound_keypair.join_state_verification_failed` - emitted
  when the client provides an invalid join state document

* Fix UI lint

* Fix more UI lints

* Remove outdated TODO

* Fix tests broken by error message changes

* MWI: Add lock targets for join token name and bot instance ID (#56021)

* MWI: Add lock targets for join token name and bot instance ID

This adds two new lock targets meant to help lock specific bot
instances without affecting all bots sharing a single user:
- Bot Instance ID: Targets a bot instance UUID, which has been
  assigned automatically to unique bot instances for some time
- Join token name: Targets the join token through which the bot
  joined

Bot instance ID locks are most useful for traditional token-joined
bots, since tokens are single use and bots have no way to onboard
again without human intervention if their old certs (and old bot
instance) expire.

Join token locks are useful for bots using delegated join methods.
They are particularly useful for bound keypair joining, where there
is a direct 1:1 relationship between a "bot instance" and a token,
even though that bot ID will change each time a recovery takes place.

Note that this does not currently set the join token for nodes even
though that would theoretically be possible. We could consider
supporting node locking in the future if there's demand.

* Set join token cert request field for non-renewable bot identities

* Fix ASN ID and pass through join token name in impersonated certs

* Tweak docstrings and add missing references for lib/decision

* Clarify docstrings

Clarifies various docstrings and makes sure they mention `token`
joined bots cannot be targeted.

* Fix failing tests

* MWI: Use specific lock targets when locking out bots (#56110)

* MWI: Use specific lock targets when locking out bots

Building on #56021, this takes advantage of the new granular lock
targets to lock bots during verification failures, namely:
- Generation counter mismatch: Locks a bot instance (token) or token
  name (bound keypair).
- Join state verification failure (bound keypair only)

Additionally, as the bound keypair joining process now generates
locks, join state verification has been moved to take place explicitly
*after* the main joining challenge has been completed. Without this,
unauthenticated clients could abuse the new locking behavior by simply
sending any invalid join state document.

* Use new lock targets for traditional generation counter lockouts

* Enforce new bot lock targets during cert generation

* Fix lint in `mutateStatusConsumeRecovery()`

* Add tests for new lock events

This adds new tests and updates existing tests to account for the new
locking strategies, and to make sure existing clients are actually
denied cluster access.

Additionally, as join state is now verified only after the regular
challenge ceremony, a number of tests were broken as they set up
the token in a technically impossible state, depending on the join
state being checked first. Tests now explicitly specify their token
keypair (bound or initial) to resolve this.

* Remove resolved TODOs

* Fix cut off comment

* MWI: Fix flaky tests for automatic bot lockouts (#56323)

* MWI: Fix flaky tests for automatic bot lockouts

This fixes a flaky test, `TestRegisterBotCertificateGenerationStolen`,
which assumed authenticated clients would immediately lose access if
locked. It also fixes another test introduced at the same time that
contains a similar check.

* Increase maximum time limit

* MWI: Remove bound keypair experiment flag (#56592)

This removes the environment variable gating use of the bound keypair
experiment.

* MWI: Fix bound keypair initial join secret field name (#56603)

* MWI: Fix bound keypair initial join secret field name

The `initial_join_secret` field was not given a proper YAML field
name and was rendering as `initialjoinsecret`. Additionally, we've
tried to standardize on referring to this field as "the registration
secret", so this renames the field to match new terminology.

This hopefully does not count as a breaking change as registration
secret functionality has not been made available in a release.

* Rename to `registration_secret`

* MWI: Fix typos in bound keypair ProvisionTokenV2 proto (#56653)

This fixes a number of spelling and grammar issues in the proto
comments for ProvisionTokenSpecV2BoundKeypair and
ProvisionTokenStatusV2BoundKeypair.

* MWI: Fix flaky test for bound keypair generation counter (#56732)

* MWI: Fix flaky test for bound keypair generation counter

This fixes another flaky test in
TestServer_RegisterUsingBoundKeypairMethod_GenerationCounter, caused
by locks occasionally not immediately taking effect.

* Apply suggestions from code review

* MWI: Add joining URIs for tbot (#56267)

* MWI: Add joining URIs for tbot

* Fix lints

* Set `omitempty` flag on the URI field

This excludes the URI field when empty, to avoid polluting generated
config files when not using URIs - which remains fully supported - and
to clear test failures since a large number of golden tests would
otherwise need to be regenerated.

* Add additional tests for joining URI config merging

* Add additional integration-style test for joining URIs

* Fix lint

* Consistently rename field to JoinURI and convert from arg to flag

* Remove interspersed flag as arg has been removed.

* Fix broken tests after rebase

* MWI: Verify locks against bound keypair tokens before mutating state (#56829)

* MWI: Verify locks against bound keypair tokens before mutating state

This adds an additional check for locks against a bound keypair token
before any server-side state can be mutated, e.g. before potentially
generating additional locks.

Locks were always checked before credentials were issued, so access
was reliably prevented. However, if bots get locked, they will retry
the connection in a loop. The locks are generated before they're
checked, which can lead to an infinite lock creation loop.

This PR adds an additional check for locks against the join token
before any server-side mutation takes place, but after we've at least
partially verified the client's identity (via a challenge or
registration secret) to avoid leaking new information about whether
or not a token is locked.

* Don't test for exact lock counts

Preventing duplicate locks is best effort and subject to the lock
checks actually returning an error when a lock exists in a timely
manner, so don't assume we won't have duplicates in the test.

* Try to call t.Helper() when possible in testExtractBotParamsFromCerts
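
The ordering described in #56829 above (verify the client, then check locks, then mutate state) can be sketched as follows. This is a simplified illustration, not the server's joining code: `JoinSteps`, `HandleJoin`, and both error values are hypothetical, and the three stages are modeled as plain callbacks.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	// ErrUnverified is returned without revealing anything about lock state.
	ErrUnverified = errors.New("client verification failed")
	// ErrLocked is returned only to clients that passed verification.
	ErrLocked = errors.New("join token is locked")
)

// JoinSteps bundles the three stages as hypothetical callbacks.
type JoinSteps struct {
	VerifyClient func() error                // challenge or registration secret
	IsLocked     func(tokenName string) bool // lock lookup for the join token
	MutateState  func() error                // may create locks, bump counters, etc.
}

// HandleJoin runs the stages in the order described above.
func HandleJoin(tokenName string, s JoinSteps) error {
	// 1. Partially verify the client first so that unauthenticated callers
	//    cannot learn whether a token is locked.
	if err := s.VerifyClient(); err != nil {
		return ErrUnverified
	}
	// 2. Check locks before any server-side mutation, so a locked bot
	//    retrying in a loop does not keep generating new locks.
	if s.IsLocked(tokenName) {
		return ErrLocked
	}
	// 3. Only now touch server-side state.
	return s.MutateState()
}

func main() {
	mutations := 0
	steps := JoinSteps{
		VerifyClient: func() error { return nil },
		IsLocked:     func(string) bool { return true },
		MutateState:  func() error { mutations++; return nil },
	}
	err := HandleJoin("my-token", steps)
	fmt.Println(err, mutations)
}
```

In this sketch a locked token short-circuits before `MutateState` runs, which is the property that breaks the lock creation loop.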
timothyb89 added a commit that referenced this pull request Aug 29, 2025
* MWI: Add joining URIs for tbot

* Fix lints

* Set `omitempty` flag on the URI field

This excludes the URI field when empty, to avoid polluting generated
config files when not using URIs - which remains fully supported - and
to clear test failures since a large number of golden tests would
otherwise need to be regenerated.

* Add additional tests for joining URI config merging

* Add additional integration-style test for joining URIs

* Fix lint

* Consistently rename field to JoinURI and convert from arg to flag

* Remove interspersed flag as arg has been removed.

* Fix broken tests after rebase
timothyb89 added a commit that referenced this pull request Sep 4, 2025
* MWI: Add joining URIs for tbot

* Fix lints

* Set `omitempty` flag on the URI field

This excludes the URI field when empty, to avoid polluting generated
config files when not using URIs - which remains fully supported - and
to clear test failures since a large number of golden tests would
otherwise need to be regenerated.

* Add additional tests for joining URI config merging

* Add additional integration-style test for joining URIs

* Fix lint

* Consistently rename field to JoinURI and convert from arg to flag

* Remove interspersed flag as arg has been removed.

* Fix broken tests after rebase
github-merge-queue bot pushed a commit that referenced this pull request Sep 12, 2025

Note that this does not currently set the join token for nodes even
though that would theoretically be possible. We could consider
supporting node locking in the future if there's demand.

* Set join token cert request field for non-renewable bot identities

* Fix ASN ID and pass through join token name in impersonated certs

* Tweak docstrings and add missing references for lib/decision

* Clarify docstrings

Clarifies various docstrings and makes sure they mention `token`
joined bots cannot be targeted.

* Fix failing tests

* MWI: Use specific lock targets when locking out bots (#56110)

* MWI: Use specific lock targets when locking out bots

Building on #56021, this takes advantage of the new granular lock
targets to lock bots during verification failures, namely:
- Generation counter mismatch: Locks a bot instance (token) or token
  name (bound keypair).
- Join state verification failure (bound keypair only)

Additionally, as the bound keypair joining process now generates
locks, join state verification has been moved to take place explicitly
*after* the main joining challenge has been completed. Without this,
unauthenticated clients could abuse the new locking behavior by simply
sending any invalid join state document.
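
The ordering concern can be sketched as follows. The types and
function names here are hypothetical; the point is only that a lock
is created for a join state failure after the challenge has passed,
so an unauthenticated caller cannot trigger one.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	errChallengeFailed  = errors.New("joining challenge failed")
	errJoinStateInvalid = errors.New("join state verification failed")
)

// joinAttempt is a hypothetical summary of what a client presented.
type joinAttempt struct {
	ChallengeOK bool
	JoinStateOK bool
}

// verifyJoin checks the challenge first. Join state is only inspected, and
// a lock only created, once the caller has proven possession of the key.
func verifyJoin(a joinAttempt, createLock func(reason string)) error {
	if !a.ChallengeOK {
		return errChallengeFailed // unauthenticated: never generates a lock
	}
	if !a.JoinStateOK {
		createLock(errJoinStateInvalid.Error())
		return errJoinStateInvalid
	}
	return nil
}

func main() {
	locks := 0
	createLock := func(reason string) { locks++ }

	err := verifyJoin(joinAttempt{ChallengeOK: false, JoinStateOK: false}, createLock)
	fmt.Println(err, locks)

	err = verifyJoin(joinAttempt{ChallengeOK: true, JoinStateOK: false}, createLock)
	fmt.Println(err, locks)
}
```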

* Use new lock targets for traditional generation counter lockouts

* Enforce new bot lock targets during cert generation

* Fix lint in `mutateStatusConsumeRecovery()`

* Add tests for new lock events

This adds new tests and updates existing tests to account for the new
locking strategies, and to make sure existing clients are actually
denied cluster access.

Additionally, as join state is now verified only after the regular
challenge ceremony, a number of tests were broken as they set up
the token in a technically impossible state, depending on the join
state being checked first. Tests now explicitly specify their token
keypair (bound or initial) to resolve this.

* Remove resolved TODOs

* Fix cut off comment

* MWI: Fix flaky tests for automatic bot lockouts (#56323)

* MWI: Fix flaky tests for automatic bot lockouts

This fixes a flaky test, `TestRegisterBotCertificateGenerationStolen`,
which assumed authenticated clients would immediately lose access if
locked. It also fixes another test introduced at the same time that
contains a similar check.

* Increase maximum time limit

* MWI: Remove bound keypair experiment flag (#56592)

This removes the environment variable gating use of the bound keypair
experiment.

* MWI: Fix bound keypair initial join secret field name (#56603)

* MWI: Fix bound keypair initial join secret field name

The `initial_join_secret` field was not given a proper YAML field
name and was rendering as `initialjoinsecret`. Additionally, we've
tried to standardize on referring to this field as "the registration
secret", so this renames the field to match new terminology.

This hopefully does not count as a breaking change as registration
secret functionality has not been made available in a release.

* Rename to `registration_secret`

* MWI: Fix typos in bound keypair ProvisionTokenV2 proto (#56653)

This fixes a number of spelling and grammar issues in the proto
comments for ProvisionTokenSpecV2BoundKeypair and
ProvisionTokenStatusV2BoundKeypair.

* MWI: Fix flaky test for bound keypair generation counter (#56732)

* MWI: Fix flaky test for bound keypair generation counter

This fixes another flaky test in
TestServer_RegisterUsingBoundKeypairMethod_GenerationCounter, caused
by locks occasionally not immediately taking effect.

* Apply suggestions from code review

* MWI: Add joining URIs for tbot (#56267)

* MWI: Add joining URIs for tbot

This adds support for joining URIs to tbot. Joining URIs are intended
to condense tbot's growing list of required server-side config options
or CLI parameters into a single string that can be provided to the
`tbot` client.

For example, consider these two equivalent CLI commands:

```
$ tbot start identity \
    --proxy-server example.teleport.sh:443 \
    --join-method bound_keypair \
    --token my-token \
    --registration-secret abc123 \
    --storage ./tbot-data \
    --destination ./tbot-user

$ tbot start identity \
    --join-uri tbot+proxy+bound-keypair://my-token:abc123@example.teleport.sh:443 \
    --storage ./tbot-data \
    --destination ./tbot-user
```

As shown, all parameters necessary for bots to actually connect to
and authenticate with the remote Teleport instance are included in a
single parameter. This parameter can be generated by existing tooling,
like the example command printed via `tctl bots add`, or the web UI.

End users will only need to paste a single "token", provide their own
client-side parameters (if any), and run. Similarly, we now have a new
minimally viable YAML config:

```yaml
version: v2
join_uri: tbot+proxy+bound-keypair://my-token:abc123@example.teleport.sh:443
storage:
  type: directory
  path: ./tbot-data
services:
  - type: identity
    destination:
      type: directory
      path: ./tbot-user
```

This implementation is designed to be only additive, and should not
interfere with existing config files or CLI strings. Parsed URI
parameters are merged on top of the traditional config fields during
the bot's pre-run check, and raise an error if any field conflicts.
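
To make the URI layout and the merge rule concrete, here is a minimal
sketch using Go's standard `net/url` package. The `joinParams` type,
`parseJoinURI`, `mergeField`, and the rule mapping `bound-keypair` to
`bound_keypair` are assumptions inferred from the example above, not
tbot's actual parser.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// joinParams is a hypothetical container for values carried by a joining URI.
type joinParams struct {
	AddrKind   string // address type from the scheme, e.g. "proxy"
	JoinMethod string
	Token      string
	Secret     string
	Addr       string
}

// parseJoinURI splits a URI shaped like
// tbot+<addr kind>+<join method>://<token>[:<secret>]@<host:port>.
func parseJoinURI(raw string) (*joinParams, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return nil, fmt.Errorf("parsing joining URI: %w", err)
	}
	parts := strings.Split(u.Scheme, "+")
	if len(parts) != 3 || parts[0] != "tbot" {
		return nil, errors.New("unsupported joining URI scheme")
	}
	if u.User == nil || u.Host == "" {
		return nil, errors.New("joining URI must include a token and an address")
	}
	secret, _ := u.User.Password()
	return &joinParams{
		AddrKind: parts[1],
		// Assumed mapping: dashes in the scheme become underscores.
		JoinMethod: strings.ReplaceAll(parts[2], "-", "_"),
		Token:      u.User.Username(),
		Secret:     secret,
		Addr:       u.Host,
	}, nil
}

// mergeField applies a URI value on top of an existing config value and
// reports a conflict when both are set to different values.
func mergeField(name string, existing *string, fromURI string) error {
	if fromURI == "" {
		return nil
	}
	if *existing != "" && *existing != fromURI {
		return fmt.Errorf("field %q conflicts with the joining URI", name)
	}
	*existing = fromURI
	return nil
}

func main() {
	p, err := parseJoinURI("tbot+proxy+bound-keypair://my-token:abc123@example.teleport.sh:443")
	if err != nil {
		panic(err)
	}
	fmt.Println(p.AddrKind, p.JoinMethod, p.Token, p.Addr)

	token := "other-token" // already set elsewhere in the config
	fmt.Println(mergeField("token", &token, p.Token))
}
```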

RFD: #52546

* Fix lints

* Set `omitempty` flag on the URI field

This excludes the URI field when empty, to avoid polluting generated
config files when not using URIs - which remains fully supported - and
to clear test failures since a large number of golden tests would
otherwise need to be regenerated.

* Add additional tests for joining URI config merging

* Add additional integration-style test for joining URIs

* Fix lint

* Consistently rename field to JoinURI and convert from arg to flag

* Remove interspersed flag as arg has been removed.

* Fix broken tests after rebase

* MWI: Verify locks against bound keypair tokens before mutating state (#56829)

* MWI: Verify locks against bound keypair tokens before mutating state

This adds an additional check for locks against a bound keypair token
before any server-side state can be mutated, e.g. before potentially
generating additional locks.

Locks were always checked before credentials were issued, so access
was reliably prevented. However, a locked bot retries the connection
in a loop, and because new locks were generated before existing locks
were checked, each retry could create another lock, leading to an
infinite lock creation loop.

This PR adds an additional check for locks against the join token
before any server-side mutation takes place, but after we've at least
partially verified the client's identity (via a challenge or
registration secret) to avoid leaking new information about whether
or not a token is locked.
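
A simplified model of the reordered flow, with hypothetical names
throughout: the existing-lock check sits after identity verification
but before the step that can create new locks, so a retrying bot
stops accumulating them. In this in-memory sketch the check is exact,
whereas the real check is described as best effort.

```go
package main

import (
	"errors"
	"fmt"
)

var errDenied = errors.New("access denied")

// tokenState is a hypothetical in-memory stand-in for server-side state.
type tokenState struct {
	Locked    bool
	LockCount int
}

// attemptJoin verifies identity, then checks for an existing lock, and only
// then runs the verification step that may create a new lock.
func attemptJoin(s *tokenState, identityVerified, generationOK bool) error {
	if !identityVerified {
		return errDenied // reveal nothing about lock status to unverified callers
	}
	if s.Locked {
		return errDenied // checked before any server-side mutation
	}
	if !generationOK {
		s.Locked = true
		s.LockCount++
		return errDenied
	}
	return nil
}

func main() {
	s := &tokenState{}
	// A locked bot retrying in a loop no longer creates a lock per attempt.
	for i := 0; i < 5; i++ {
		_ = attemptJoin(s, true, false)
	}
	fmt.Println(s.LockCount)
}
```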

* Don't test for exact lock counts

Preventing duplicate locks is best effort and subject to the lock
checks actually returning an error when a lock exists in a timely
manner, so don't assume we won't have duplicates in the test.

* Try to call t.Helper() when possible in testExtractBotParamsFromCerts

* Bound Keypair: Fix lock generation on sequence desync (#57687)

* Bound Keypair: Fix lock generation on sequence desync

This fixes an issue where locks may not be generated as expected when
join state sequences desync unless the original client is also
performing a recovery.

Currently, if the original client is renewing with a valid identity,
its bot instance ID is checked against the stored instance ID. If they
don't match, access is denied without generating a lock. However, if
client credentials are stolen and used to perform a recovery, this
implicitly generates a new bot instance ID. The original client,
presumably with still-valid certs containing the original ID, will
try to renew as usual, but will only be denied. Join state
verification is skipped, and no lock is created.

(Note that, given enough time, the client's credentials will
eventually expire. The next join attempt will then attempt a recovery,
fail to verify join state, and generate a lock as expected. This just
means locking takes ~1hr instead of ~20min, based on default values.)

The fix is straightforward enough: the bot instance ID check is moved
after join state verification. In practice this check is unlikely to
be useful as any action that could cause the bot instance IDs to
change should also cause join state verification to fail. The check
remains at the end of the renewal flow as a sanity check, but only
after the challenge ceremony and join state verification are
performed.
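
The reordering can be sketched as below, assuming the challenge
ceremony has already succeeded. The `renewal` type and `verifyRenewal`
function are hypothetical; the sketch only shows that a desynced
original client now hits the join state check, and therefore the lock,
before the instance ID comparison.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	errJoinState        = errors.New("join state verification failed")
	errInstanceMismatch = errors.New("bot instance ID mismatch")
)

// renewal is a hypothetical summary of a renewal attempt that has already
// passed the challenge ceremony.
type renewal struct {
	PresentedInstanceID string
	StoredInstanceID    string
	JoinStateOK         bool
}

// verifyRenewal checks join state first, generating a lock on failure, and
// keeps the instance ID comparison as a final sanity check.
func verifyRenewal(r renewal, createLock func()) error {
	if !r.JoinStateOK {
		createLock()
		return errJoinState
	}
	if r.PresentedInstanceID != r.StoredInstanceID {
		return errInstanceMismatch
	}
	return nil
}

func main() {
	locks := 0
	// Original client after a stolen-credential recovery: both its instance
	// ID and its join state are stale.
	err := verifyRenewal(renewal{
		PresentedInstanceID: "old-id",
		StoredInstanceID:    "new-id",
		JoinStateOK:         false,
	}, func() { locks++ })
	fmt.Println(err, locks)
}
```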

changelog: Bound Keypair Joining: Fix lock generation on sequence desync

* Add some test detail in a comment