Make Teleport startup resilient to invalid roles (#9062)#9105

Merged: codingllama merged 1 commit into branch/v8 from codingllama/v8-bad-role-resilience on Nov 23, 2021
@codingllama
Contributor

Removing the old roles migration allows Teleport to start even in the face of
invalid roles. The system will still be largely unusable, but `tctl rm` is now
possible as a fallback.

Added logging makes it easier to determine the bad role.

Turns this scenario:

```shell
$ teleport start
> (...)
> ERROR: initialization failed
> could not parse 'where' rule: "!contains(ssh_session.participants, user.metadata.name)", error: ssh_session.participants is not defined
> (teleport exits)
```

into this:

```shell
$ teleport start
> (...)
> 2021-11-18T16:50:29-03:00 WARN [AUTH:1:CA] "Re-init the cache on error: role \"join_own_sessions_only\"\n\tcould not parse 'where' rule: \"!contains(ssh_session.participants, user.metadata.name)\", error: ssh_session.participants is not defined." cache/cache.go:725
> 2021-11-18T16:50:29-03:00 WARN [AUTH:1:CA] Cache "auth" first init failed, continuing re-init attempts in background. error:[
> ERROR REPORT:
> Original Error: *trace.BadParameterError could not parse 'where' rule: "!contains(ssh_session.participants, user.metadata.name)", error: ssh_session.participants is not defined
> Stack Trace:
> 	(...)
> User Message: role "join_own_sessions_only"
> 	could not parse 'where' rule: "!contains(ssh_session.participants, user.metadata.name)", error: ssh_session.participants is not defined] cache/cache.go:678
> 2021-11-18T16:50:35-03:00 WARN [AUTH:1:CA] "Re-init the cache on error: role \"join_own_sessions_only\"\n\tcould not parse 'where' rule: \"!contains(ssh_session.participants, user.metadata.name)\", error: ssh_session.participants is not defined." cache/cache.go:725
> (teleport running, tctl works)
```

See #9059 for the larger
context.

* Remove Teleport 4.3 role migration
* Remove unused parameters
* Add role name to GetRoles validation failures

@codingllama enabled auto-merge (squash) November 23, 2021 17:28
@codingllama merged commit 7ee2c11 into branch/v8 Nov 23, 2021
@codingllama deleted the codingllama/v8-bad-role-resilience branch November 23, 2021 18:25
@webvictim mentioned this pull request Mar 4, 2022