feat(api) Added Status API #167
…managed services by Sablier and their expirations
Hey @ab623, thanks for your contribution, I will be looking at it.
Thanks @acouvreur. Additionally, I just pushed a commit to fix #153. I did it on this branch because it requires the functionality in the initial commits. It works by checking all managed containers on startup: if a container is already running, it just applies the default timeout to it; if a container is already stopped, it is skipped. This way the containers can start first, Sablier can be started afterwards, and the containers it manages will be brought down after the default timeout.
So for the status page we have a few things to consider:
If it is started by name, we will only see the status when it's up, because as soon as it's down, the instance will be out of the KV store. For instances started with the group name, you can already use the group values to build the status page. What do you think about that?
My implementation already accounts for that and is independent of the calling method. It looks for the enable label and uses that to track all monitored containers. If a container is up, it exists in the KV store and its expiration data is pulled from there; if not, the service is down, but the container name is still shown in the output, just with an exited status.
Overall I agree with your changes, but I think we need to generify the usage to instance instead of container.
Also, I believe the startup feature is not as expected. We should review it again. And it should also be enabled through a configuration flag.
DesiredReplicas int `json:"desiredReplicas"`
Status string `json:"status"`
Message string `json:"message,omitempty"`
ExpiresAt time.Time `json:"expires_on"`
The property name is ExpiresAt but the json is expires_on
We should use one exclusively
Any preference on this?
}

func (s *Status) ServeStatus(c *gin.Context) {
containers, _ := s.SessionsManager.GetManagedContainers()
It's not always containers, it might be swarm services, kubernetes deployments, etc.
We should use instances instead.
Makes sense here. I only have docker experience. I will look into using instances.
log.Debugf("Could not get list of managed containers to apply default")
}

var startContainers []string
It's not necessarily containers, we should use instances
@@ -69,6 +75,30 @@ func (sm *SessionsManager) initWatchers() {
go sm.consumeInstanceStopped(instanceStopped)
}

func (sm *SessionsManager) initContainerManagementStart() {
output, err := sm.provider.GetManagedContainers()
Instead of GetManagedContainers, you should use sm.groups, which is already populated with the discovered instances.
Okay, using sm.groups does make sense and simplifies things. In that case we don't need to update the interface, and I can remove "GetManagedContainers".
I think I will need to flatten a copy of sm.groups, as it is a map with the group name as the key. I believe the status API should be flat, with the group as a property of each instance, rather than being grouped by group name (which the API consumer can do on the client side).
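The flattening could look roughly like the sketch below. It assumes the groups map has the shape `map[string][]string` (group name to instance names), which the thread implies but does not show; `FlatInstance` and `flattenGroups` are hypothetical names. The result is sorted because Go map iteration order is not deterministic.

```go
package main

import (
	"fmt"
	"sort"
)

// FlatInstance is a hypothetical flat record carrying the group as a property.
type FlatInstance struct {
	Name  string
	Group string
}

// flattenGroups turns a group -> instance names map into a flat slice,
// sorted by group and then by name so the output is stable. The input map
// is only read, never modified.
func flattenGroups(groups map[string][]string) []FlatInstance {
	flat := make([]FlatInstance, 0)
	for group, names := range groups {
		for _, name := range names {
			flat = append(flat, FlatInstance{Name: name, Group: group})
		}
	}
	sort.Slice(flat, func(i, j int) bool {
		if flat[i].Group != flat[j].Group {
			return flat[i].Group < flat[j].Group
		}
		return flat[i].Name < flat[j].Name
	})
	return flat
}

func main() {
	groups := map[string][]string{
		"web": {"whoami", "nginx"},
		"db":  {"postgres"},
	}
	for _, inst := range flattenGroups(groups) {
		fmt.Printf("%s (group %s)\n", inst.Name, inst.Group)
	}
}
```

A client that wants the grouped view back can rebuild it by bucketing on the `Group` field.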
}
}

sm.RequestSession(startContainers, sm.config.Sessions.DefaultDuration)
I'm not really sure about this. I believe the expected behavior is that we stop the running instances we find that are not in the KV.
I partially disagree with this. With that behavior, if I have already-running containers and I restart Sablier, it would bring my instances down instantly. I prefer having it this way, but I also see your argument for instantly bringing them down.
Maybe we can create a setting like STARTUP_DEFAULT_EXPIRATION_DURATION, which can be set to "0m" for instant or to whatever the user wants. If the setting is blank, it defaults to SESSIONS_DEFAULT_DURATION.
Would that work for you?
@@ -273,6 +303,38 @@ func (s *SessionsManager) RequestReadySessionGroup(ctx context.Context, group st
return s.RequestReadySession(ctx, names, duration, timeout)
}

func (s *SessionsManager) GetManagedContainers() ([]string, error) {
Not necessarily containers, use instance instead
return output, nil
}

func (s *SessionsManager) GetContainerStatus(name string) *InstanceState {
Not necessarily containers, use instance instead
if !ok {
return EntityWithTimeout{}, ok
}
if e.expired() {
I don't think this method should do any check on the timeouts and should not delete from the KV.
I think the method should be GetExpiration for a key.
I don't fully agree with this. If I wanted to get a value and its expiration, under your suggestion I would need to call two separate functions. That could cause an issue: I retrieve the value, but the key expires in the time between calling Get and GetExpiration, so the second call fails.
Keeping it as one call that returns the value and expiration as a single struct makes more sense in my opinion. Why do you think we shouldn't be checking the timeout or deleting from the KV like Get does?
Thanks for the feedback. Glad my first foray into the Go world and this project was somewhat successful. I will refactor this to use instances and not containers as per your suggestions, and review the startup feature. I have left a few comments on both. It would be good to get your thoughts on them.
Take your time for the refactor by the way, if you need some help feel free to ask me for it. Thanks again for your contribution :)
Thanks. I'm travelling for a few weeks, so I won't have time to work on this, but I will pick it up when I'm back. Appreciate the support. I left a few comments on the review; can you take a look and let me know your thoughts?
Hey @acouvreur. I'm back from some holidays. Should I continue to work on this, or wait until the rewrite? Or are you building this in as part of the rewrite 😄
Hey! Welcome back! I'll add this to the rewrite! No need to try to resolve merge conflicts. Feel free to try and play with the rewrite branch, help is very appreciated!
Alright. Good to know! I'll close this PR. I'll give the refactor branch a go also.
I have recently started learning Go and am trying to learn how to contribute back to open source. I thought that Sablier would be a good project to get involved in.
I have been putting in the early stages of a status page, which is intended to show the status of the services Sablier manages, along with useful information about those services: mainly the status, expiration time, and remaining time.
I had to make quite a few updates across the solution. I have documented the rationale below.
Output:
Calling /api/status:
Potential non-blocking issues:
I haven't used Go before, nor any of the libraries such as Gin, so some good feedback and advice would be helpful. I have tried to stick close to the format in the project.