[v18] Bound Keypair Joining Backport #56746
* MWI: Bound Keypair Joining: Keypair rotation
This adds keypair rotation for bound keypair joining. When a rotation flag is set in the token spec, joining clients will be required to generate a new keypair and complete an additional joining challenge against the new keypair.
The flag is a timestamp token to allow for some level of idempotency; to make setting this flag easier, a new `tctl` command is included: `tctl bound-keypair request-rotation [token]`. This sets the flag to the current timestamp, and joining clients will be required to perform a rotation on their next authentication attempt.
Closes #55084
* Properly initialize the tctl command
* Refactor ClientState to allow storing intermediate state during rotation
* Fix invalid comparison and mutation logic
* Log signature suite and use cryptosuites helper
* Remove outdated TODO
* Frontload MFA check to avoid prompting twice
* Fix tctl command logging
* Fix incomplete docstring
* Fix imports
* Fix typo in log message
* Add tests for server-side rotation
Adjusts the test harness a bit and adds a batch of test cases for keypair rotation. Also fixes a lint error.
* Add additional test case for reused keys
* Add ClientState unit test
* Remove unnecessary log
* Fix test lints
* Fix reference to wrong key field
Now that the key can change, fix a dangling reference to the initial key field. Also s/marshalled/marshaled
* Wrap KeyHistoryEntry in a containing struct
This should allow for some future extension if needed.
* MWI: Bound Keypair - Registration Secrets
This adds support for initial joining via registration secrets. These one-time-use secrets emulate traditional token joining and allow clients to perform their initial join.
With this, no options are required for bound keypair-type tokens. While admins can specify a joining secret if they wish, if none is provided, one will be generated on the server and can be found in `status.bound_keypair.registration_secret` on the token resource.
When joining, this secret can be shared with clients in addition to the (no longer sensitive) token name. This secret is verified and a keypair rotation is requested, prompting the client to generate a new keypair, provide the public key to the server, and complete a joining challenge. It then joins the cluster as usual.
* Remove unnecessary token validation checks
* Rename tbot flag to --registration-secret
* Fix reference to renamed flag
* Various fixes, mostly more unwanted checks
* Add test cases for registration secrets
* Fix broken test
Onboarding config is no longer required, so fix the now-broken test
* Allow empty .spec.bound_keypair field for bound keypair tokens
This allows .spec.bound_keypair to be empty or entirely unset, since we can build defaults at creation time.
* Add test for secret expiry enforcement
* Handle nonexistent client state when using a registration secret
* Fix test lints
* Hide exact registration secret rejection reason from client
Registration secret errors now return a single error message to the client and log a more specific message on the server.
* MWI: Enforce generation counter for bound keypair joining
This enables generation counter enforcement for bound keypair joining, and adds a new function, `shouldEnforceGenerationCounter`, to make enabling it for other join methods trivial.
Bound keypair joining introduces a similar mechanism for use between its own recovery attempts but does rely on the standard generation counter for its renewal-style certificates, so every join attempt is subject to a generation check. This wasn't enabled in the original set of bound keypair PRs, so it's enabled here.
RFD: #52546
* Add tests for generation counter enforcement, fix error handling bug
This adds a test case for traditional generation counter enforcement with bound keypair joining, and fixes an error handling bug around certificate generation. This bug was mostly harmless before and would've just returned nil certs at worst, but is now meaningfully fallible.
* Fix broken test
* Fix lint
* Remove references to registration secret in test for rebase onto master
* Empty commit for CI
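As a rough, simplified sketch of what a generation check does: the counter presented by the client's current certificate must match the value stored on the server, and a successful renewal advances it. The helper `enforcesGeneration` is a hypothetical stand-in for `shouldEnforceGenerationCounter`, whose real signature is not shown in this description, and the handling of non-enforcing join methods is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

var errGenerationMismatch = errors.New("generation counter mismatch")

// enforcesGeneration is a hypothetical stand-in: it returns true for the
// join methods this sketch assumes are subject to generation checks.
func enforcesGeneration(joinMethod string) bool {
	switch joinMethod {
	case "token", "bound_keypair":
		return true
	default:
		return false
	}
}

// nextGeneration validates the presented counter against the stored one and
// returns the value to embed in the newly issued certificate.
func nextGeneration(joinMethod string, stored, presented uint64) (uint64, error) {
	if enforcesGeneration(joinMethod) && presented != stored {
		// A mismatch suggests the credentials were copied and renewed elsewhere.
		return stored, errGenerationMismatch
	}
	return presented + 1, nil
}

func main() {
	next, err := nextGeneration("bound_keypair", 4, 4)
	fmt.Println(next, err)

	_, err = nextGeneration("bound_keypair", 5, 4) // stale certificate reused
	fmt.Println(err)
}
```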
* MWI: Add audit events for bound keypair joining
This adds 3 new audit events for bound keypair joining:
- `join_token.bound_keypair.recovery` - emitted when a join triggers a recovery (first join, or join with expired certs)
- `join_token.bound_keypair.rotation` - emitted when a keypair rotation takes place
- `join_token.bound_keypair.join_state_verification_failed` - emitted when the client provides an invalid join state document
* Fix UI lint
* Fix more UI lints
* Remove outdated TODO
* Fix tests broken by error message changes
* MWI: Add lock targets for join token name and bot instance ID
This adds two new lock targets meant to help lock specific bot instances without affecting all bots sharing a single user:
- Bot Instance ID: Targets a bot instance UUID, which has been assigned automatically to unique bot instances for some time
- Join token name: Targets the join token through which the bot joined
Bot instance ID locks are most useful for traditional token-joined bots, since tokens are single use and bots have no way to onboard again without human intervention if their old certs (and old bot instance) expire.
Join token locks are useful for bots using delegated join methods. They are particularly useful for bound keypair joining, where there is a direct 1:1 relationship between a "bot instance" and a token, even though that bot ID will change each time a recovery takes place.
Note that this does not currently set the join token for nodes even though that would theoretically be possible. We could consider supporting node locking in the future if there's demand.
* Set join token cert request field for non-renewable bot identities
* Fix ASN ID and pass through join token name in impersonated certs
* Tweak docstrings and add missing references for lib/decision
* Clarify docstrings
Clarifies various docstrings and makes sure they mention `token` joined bots cannot be targeted.
* Fix failing tests
* MWI: Use specific lock targets when locking out bots
Building on #56021, this takes advantage of the new granular lock targets to lock bots during verification failures, namely:
- Generation counter mismatch: Locks a bot instance (token) or token name (bound keypair).
- Join state verification failure (bound keypair only)
Additionally, as the bound keypair joining process now generates locks, join state verification has been moved to take place explicitly *after* the main joining challenge has been completed. Without this, unauthenticated clients could abuse the new locking behavior by simply sending any invalid join state document.
* Use new lock targets for traditional generation counter lockouts
* Enforce new bot lock targets during cert generation
* Fix lint in `mutateStatusConsumeRecovery()`
* Add tests for new lock events
This adds new tests and updates existing tests to account for the new locking strategies, and to make sure existing clients are actually denied cluster access.
Additionally, as join state is now verified only after the regular challenge ceremony, a number of tests were broken as they set up the token in a technically impossible state, depending on the join state being checked first. Tests now explicitly specify their token keypair (bound or initial) to resolve this.
* Remove resolved TODOs
* Fix cut off comment
* MWI: Fix flaky tests for automatic bot lockouts
This fixes a flaky test, `TestRegisterBotCertificateGenerationStolen`, which assumed authenticated clients would immediately lose access if locked. It also fixes another test introduced at the same time that contains a similar check.
* Increase maximum time limit
This removes the environment variable gating use of the bound keypair experiment.
* MWI: Fix bound keypair initial join secret field name
The `initial_join_secret` field was not given a proper YAML field name and was rendering as `initialjoinsecret`. Additionally, we've tried to standardize on referring to this field as "the registration secret", so this renames the field to match new terminology.
This hopefully does not count as a breaking change as registration secret functionality has not been made available in a release.
* Rename to `registration_secret`
This fixes a number of spelling and grammar issues in the proto comments for ProvisionTokenSpecV2BoundKeypair and ProvisionTokenStatusV2BoundKeypair.
* MWI: Fix flaky test for bound keypair generation counter
This fixes another flaky test in `TestServer_RegisterUsingBoundKeypairMethod_GenerationCounter`, caused by locks occasionally not immediately taking effect.
* Apply suggestions from code review
* MWI: Add joining URIs for tbot
This adds support for joining URIs to tbot. Joining URIs are intended
to condense tbot's growing list of required server-side config options
or CLI parameters into a single string that can be provided to the
`tbot` client.
For example, consider these two equivalent CLI commands:
```
$ tbot start identity \
    --proxy-server example.teleport.sh:443 \
    --join-method bound_keypair \
    --token my-token \
    --registration-secret abc123 \
    --storage ./tbot-data \
    --destination ./tbot-user
$ tbot start identity \
    tbot+proxy+bound-keypair://my-token:abc123@example.teleport.sh:443 \
    --storage ./tbot-data \
    --destination ./tbot-user
```
As shown, all parameters necessary for bots to actually connect to
and authenticate with the remote Teleport instance are included in a
single parameter. This parameter can be generated by existing tooling,
like the example command printed via `tctl bots add`, or the web UI.
End users will only need to paste a single "token", provide their own
client-side parameters (if any), and run. Similarly, we now have a new
minimally viable YAML config:
```yaml
version: v2
uri: tbot+proxy+bound-keypair://my-token:abc123@example.teleport.sh:443
storage:
  type: directory
  path: ./tbot-data
services:
  - type: identity
    destination:
      type: directory
      path: ./tbot-user
```
This implementation is designed to be only additive, and should not
interfere with existing config files or CLI strings. Parsed URI
parameters are merged on top of the traditional config fields during
the bot's pre-run check, and raise an error if any field conflicts.
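A minimal sketch of the parse-then-merge idea, using only Go's standard `net/url` package. The `joinConfig` type, the function names, the hyphen-to-underscore mapping for the join method, and the rule that a field conflicts only when it is already set to a different value are all assumptions made for illustration; the actual tbot implementation may differ.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// joinConfig holds the connection fields a joining URI can supply
// (hypothetical type, not tbot's real config struct).
type joinConfig struct {
	ProxyServer, JoinMethod, Token, RegistrationSecret string
}

// parseJoinURI splits tbot+proxy+<method>://<token>:<secret>@<host:port>.
func parseJoinURI(raw string) (joinConfig, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return joinConfig{}, err
	}
	parts := strings.SplitN(u.Scheme, "+", 3)
	if len(parts) != 3 || parts[0] != "tbot" || parts[1] != "proxy" {
		return joinConfig{}, fmt.Errorf("unsupported joining URI scheme %q", u.Scheme)
	}
	if u.User == nil || u.Host == "" {
		return joinConfig{}, errors.New("joining URI needs a token and a host")
	}
	secret, _ := u.User.Password() // empty when no secret is present
	return joinConfig{
		ProxyServer:        u.Host,
		JoinMethod:         strings.ReplaceAll(parts[2], "-", "_"),
		Token:              u.User.Username(),
		RegistrationSecret: secret,
	}, nil
}

// mergeField copies src into dst unless dst already holds a different value.
func mergeField(name string, dst *string, src string) error {
	if src == "" || *dst == src {
		return nil
	}
	if *dst != "" {
		return fmt.Errorf("field %s conflicts with the joining URI", name)
	}
	*dst = src
	return nil
}

// mergeJoinConfig layers URI-derived fields on top of the existing config.
func mergeJoinConfig(base, fromURI joinConfig) (joinConfig, error) {
	err := errors.Join(
		mergeField("proxy_server", &base.ProxyServer, fromURI.ProxyServer),
		mergeField("join_method", &base.JoinMethod, fromURI.JoinMethod),
		mergeField("token", &base.Token, fromURI.Token),
		mergeField("registration_secret", &base.RegistrationSecret, fromURI.RegistrationSecret),
	)
	if err != nil {
		return joinConfig{}, err
	}
	return base, nil
}

func main() {
	parsed, err := parseJoinURI("tbot+proxy+bound-keypair://my-token:abc123@example.teleport.sh:443")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", parsed)

	// A token already set to something else is reported as a conflict.
	_, err = mergeJoinConfig(joinConfig{Token: "other-token"}, parsed)
	fmt.Println(err)
}
```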
RFD: #52546
* Fix lints
* Set `omitempty` flag on the URI field
This excludes the URI field when empty, to avoid polluting generated
config files when not using URIs - which remains fully supported - and
to clear test failures since a large number of golden tests would
otherwise need to be regenerated.
* Add additional tests for joining URI config merging
* Add additional integration-style test for joining URIs
* Fix lint
* Consistently rename field to JoinURI and convert from arg to flag
* Remove interspersed flag as arg has been removed.
* Fix broken tests after rebase
Manual testing notes with steps I ran both on a local cluster and a cloud tenant. (May make a follow up PR to add these to the Machine ID testplan) First pass with registration secrets:
Next, try a new token with a preregistered key:
…56829)
* MWI: Verify locks against bound keypair tokens before mutating state
This adds an additional check for locks against a bound keypair token before any server-side state can be mutated, e.g. before potentially generating additional locks.
Locks were always checked before credentials were issued, so access was reliably prevented. However, if bots get locked, they will retry the connection in a loop. The locks are generated before they're checked, which can lead to an infinite lock creation loop.
This PR adds an additional check for locks against the join token before any server-side mutation takes place, but after we've at least partially verified the client's identity (via a challenge or registration secret) to avoid leaking new information about whether or not a token is locked.
* Don't test for exact lock counts
Preventing duplicate locks is best effort and subject to the lock checks actually returning an error when a lock exists in a timely manner, so don't assume we won't have duplicates in the test.
* Try to call t.Helper() when possible in testExtractBotParamsFromCerts
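The ordering described above can be sketched with a toy in-memory server. Everything here (the `server` type, `join`, the boolean inputs) is hypothetical and greatly simplified; in particular, this sketch is deterministic, whereas the real lock check is described as best effort.

```go
package main

import (
	"errors"
	"fmt"
)

var errDenied = errors.New("access denied")

// server tracks how many locks were created per token name (toy model).
type server struct {
	locks map[string]int
}

// join applies the checks in the order described: identity first, then
// existing locks, and only then the step that may create a new lock.
func (s *server) join(token string, challengeOK, joinStateOK bool) error {
	// 1. Unauthenticated clients stop here and learn nothing about locks.
	if !challengeOK {
		return errDenied
	}
	// 2. An already-locked token is rejected before any state is mutated.
	if s.locks[token] > 0 {
		return errDenied
	}
	// 3. Join state verification failure creates a lock.
	if !joinStateOK {
		s.locks[token]++
		return errDenied
	}
	return nil
}

func main() {
	s := &server{locks: map[string]int{}}

	// A locked-out bot retrying in a loop no longer piles up locks.
	for i := 0; i < 5; i++ {
		_ = s.join("my-token", true, false)
	}
	fmt.Println("locks after 5 retries:", s.locks["my-token"])
}
```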
@timothyb89 - this PR will require admin approval to merge due to its size. Consider breaking it up into a series of smaller changes.
…bound-keypair-omnibackport
This is a backport of Bound Keypair joining for branch/v18, containing several PRs:
This does not include:
A partial implementation of bound keypair joining made it into v18.0.0 when the branch was cut, so around half of the PRs are already merged (#52566, #54766, #54371, #54372, #54822, #54940).
Note that we're targeting this for release in v18.1.0 and it should not be merged until we're confident there will be no further v18.0.x minor releases. Backports of the documentation PRs (#56604, #56824) will follow separately.
changelog: Machine and Workload ID: Add new `bound_keypair` join method to better support bots in on-prem and other environments without a platform-specific join method