-
Notifications
You must be signed in to change notification settings - Fork 833
Use BasicService in state_replication.go for replication loop. #3930
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having done this I mentioned to @pstibrany whether we should convert the whole Alertmanager
to be a Service
, but that would be a much bigger commit, and has more risk than this change which is reasonably isolated.
pkg/alertmanager/alertmanager.go
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seemed like the least-worst option for the time being but open to suggestions, alternatives being:
- Add Stop() method to State interface - Means we would need to a Stop method to Peer in Prometheus
- Storing the service separately as
stateService
or something, again.
Signed-off-by: Steve Simpson <[email protected]>
a3e86d2
to
2c65d7a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job! LGTM (modulo a couple of nits). I left few comments about the design evolution, but I would keep it for a follow-up if we agree on it.
pkg/alertmanager/alertmanager.go
Outdated
am.dispatcher.Stop() | ||
} | ||
|
||
if service, ok := am.state.(*state); ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's more clean if you cast to the Service
interface.
if service, ok := am.state.(*state); ok { | |
if service, ok := am.state.(services.Service); ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That being said, the next step of refactoring I see (but I would do it in a separate PR) is to require am.state
to implement the services.Service
interface. We can wrap cfg.Peer
into our own struct
which implements services.Service
(using services.NewIdleService()
) and add services.Service
to NilPeer
too. This way we keep all implementations consistent.
Thoughts @stevesg @pstibrany ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems reasonable. We should consider how to call Settle
for the non-sharded case too, ideally that would be consistent (the Service wrapper for the Peer wrapper does the settling in starting, as for the replication).
The alternative to the road we've started on, is to have a single "PeerService" which wraps any type of Peer, though that would then need to be extended with some sort of "run" function, so you probably end back at making them all services...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so you probably end back at making them all services...
I think having separate services would keep things cleaner at the end. Code duplication should be minimal.
} | ||
|
||
func (s *state) handleBroadcasts() { | ||
func (s *state) running(ctx context.Context) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me leave some comments here, but not related to running()
. Something to discuss about (but I would keep it separated from this PR):
- Why not doing the
Settle()
in the servicestarting
? - If we do There is no code and I am sad. #1,
Ready()
then would be waiting until service is running?
Thoughts @stevesg @pstibrany ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea of implementing WaitReady
in terms of the waiting for the service to be running, seems like the service starting/running model fits nicely onto settling/broadcasting. Also means we can remove the readyc
channel, which is nice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense to me. I was just going to ask why do we still need readyc
channel, good that you suggest removing it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Love this idea as well!
Signed-off-by: Steve Simpson <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM (modulo a couple of nits) as 1st refactoring step.
Signed-off-by: Steve Simpson <[email protected]>
am.Stop() | ||
|
||
if service, ok := am.state.(services.Service); ok { | ||
if err := service.AwaitTerminated(context.Background()); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn’t we accept a context to StopAndWait
method, and use that for waiting? (It has “wait” in the name, so it would make sense to me) In such case, we should also return error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As much as I love putting context everywhere (getting good at it), I think it's better to address this in a separate PR, as this is a pre-existing problem with the code (e.g. the other Stop()
functions all need to be updated, and the calling code).
} | ||
|
||
func (s *state) handleBroadcasts() { | ||
func (s *state) running(ctx context.Context) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense to me. I was just going to ask why do we still need readyc
channel, good that you suggest removing it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, Way better than the previous design!
} | ||
|
||
func (s *state) handleBroadcasts() { | ||
func (s *state) running(ctx context.Context) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Love this idea as well!
Signed-off-by: Steve Simpson <[email protected]>
a19b694
to
0d300f1
Compare
…xproject#3930) * Use BasicService in state_replication.go for replication loop. Signed-off-by: Steve Simpson <[email protected]> * Review comments. Signed-off-by: Steve Simpson <[email protected]> * Review comments. Signed-off-by: Steve Simpson <[email protected]> * Review comments. Signed-off-by: Steve Simpson <[email protected]>
What this PR does:
Use BasicService in state_replication.go for replication loop.
Original discussion:
#3839 (comment)
Checklist
Documentation addedCHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]