Run the readiness logic synchronously #62178
@espadolini See the table below for backport results.
* Run the readiness logic synchronously
* Document special event handling in BroadcastEvent
* Make NewSupervisor fallible
* Move logging out of the processState state machine
Would you be able to backport this to v17? Is it feasible? I just got a TestDebugService hit on v17.
I vote against backporting this. Each time we changed readiness we broke things, and this happened many times. v17 should be kept stable. This is a high risk change for a low reward.
We don't have a great track record in touching readiness without breaking it and v17 is supposed to be stable; unless the test proves to be especially flaky in v17 I'd prefer not to touch anything. 😬
The backport would also not be clean without including something like #61620 (partial #59667 and #59907), which adds to the risk of either getting a tweaked backport wrong or adding more changes than necessary to a stable release.
This PR moves the `lib/service.processState` machinery to the `LocalSupervisor` level, so that `(*LocalSupervisor).BroadcastEvent` can update the process state synchronously instead of relying on a monitoring goroutine. This avoids problems with readiness caused by events being broadcast before the monitoring goroutine is in place, either because of sequencing or because of a goroutine race.

changelog: fixed Teleport instances running the Auth Service sometimes not becoming ready during initialization
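
For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of the idea described above. It is not the code in this PR: `Event`, `Supervisor`, `Subscribe`, and `Ready` are hypothetical names invented for illustration, and the two-state readiness flag is a stand-in for the real `processState` state machine. The only point it shows is that the broadcast call itself applies the state transition before fanning the event out, so an event sent before any listener exists is still reflected in the readiness state.

```go
package main

import (
	"fmt"
	"sync"
)

// Event is a hypothetical stand-in for a supervisor event.
type Event struct {
	Name string
}

// Supervisor is a toy supervisor, not Teleport's LocalSupervisor.
// It owns the readiness state directly instead of delegating it
// to a separate monitoring goroutine.
type Supervisor struct {
	mu    sync.Mutex
	ready bool
	subs  []chan Event
}

// NewSupervisor returns a supervisor that is not ready yet.
func NewSupervisor() *Supervisor {
	return &Supervisor{}
}

// Subscribe registers a listener. The channel is buffered so that
// BroadcastEvent never blocks in this sketch.
func (s *Supervisor) Subscribe() <-chan Event {
	s.mu.Lock()
	defer s.mu.Unlock()
	ch := make(chan Event, 8)
	s.subs = append(s.subs, ch)
	return ch
}

// BroadcastEvent updates the readiness state synchronously, then
// forwards the event to whichever listeners exist at this moment.
func (s *Supervisor) BroadcastEvent(ev Event) {
	s.mu.Lock()
	defer s.mu.Unlock()

	// State transition happens here, in the caller's goroutine.
	switch ev.Name {
	case "ready":
		s.ready = true
	case "degraded":
		s.ready = false
	}

	for _, ch := range s.subs {
		select {
		case ch <- ev:
		default: // drop the event for a full listener (sketch only)
		}
	}
}

// Ready reports the current readiness state.
func (s *Supervisor) Ready() bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.ready
}

func main() {
	sup := NewSupervisor()
	// No listener has been registered yet; the state still updates.
	sup.BroadcastEvent(Event{Name: "ready"})
	fmt.Println(sup.Ready()) // prints "true"
}
```

In a design where a separate goroutine consumes events to compute readiness, the same early broadcast could be lost if that goroutine had not subscribed yet, which is the class of problem the description attributes to sequencing or goroutine races.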