Missing or incomplete allocation event and "really ready" state #404

Closed
fdasoghe opened this issue Nov 5, 2018 · 9 comments
Labels
question I have a question!

Comments

@fdasoghe

fdasoghe commented Nov 5, 2018

We're working on integration between Unreal Engine and Agones.
Our use case is a simple matchmaking scenario, where users can choose some details about the newly created lobby. So the game server needs some time not only to load the map (which it does before calling sdk.ready()) but also to customize it, modifying the scene and maybe loading some other assets, before it can actually accept player connections.

These details are passed from the game manager to the game server via custom annotations on the FleetAllocation creation.
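
As a rough illustration of that hand-off, the sketch below pulls lobby settings out of a GameServer object's annotations. The `lobby.example.com/` prefix and the `map` setting are hypothetical names, and the dict only mirrors the `metadata.annotations` layout of a Kubernetes object; how the game server obtains that object is not shown here.

```python
# Hypothetical annotation prefix chosen by the game manager; not an Agones convention.
LOBBY_PREFIX = "lobby.example.com/"


def lobby_settings(gameserver: dict, prefix: str = LOBBY_PREFIX) -> dict:
    """Return the annotations found under `prefix`, with the prefix stripped."""
    annotations = gameserver.get("metadata", {}).get("annotations") or {}
    return {
        key[len(prefix):]: value
        for key, value in annotations.items()
        if key.startswith(prefix)
    }
```

A namespaced prefix keeps the lobby details apart from annotations written by other tools, such as the `kubectl.kubernetes.io/...` one.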

First question: how can the game server know when it is being allocated? Is it possible only through sdk.watch()?
Second question: how can we guarantee that the game server is actually ("really") ready to accept connections? It needs some additional time to prepare the map with the required details from the allocation. From what we've seen so far, Agones has no direct support for this phase. Is that correct?

Thanks in advance for any hint/feedback.

Best Regards,
Fabio Da Soghe

@markmandel
Collaborator

markmandel commented Nov 5, 2018

> how can the game server know when it is being allocated? Is it possible only through sdk.watch()?

Yes exactly - that's one of the uses for sdk.watch! Keep an eye out for the state changing to 'Allocated' and react accordingly.
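
A minimal sketch of "react accordingly", assuming each update from the watch arrives as a dict with a `status.state` field. The `AllocationWatcher` class and its `handle_update` method are hypothetical names rather than part of the Agones SDK, and wiring it to the real watch callback is left out.

```python
from typing import Callable, Optional


class AllocationWatcher:
    """Calls `on_allocated` each time the state changes to 'Allocated'."""

    def __init__(self, on_allocated: Callable[[dict], None]) -> None:
        self._on_allocated = on_allocated
        self._last_state: Optional[str] = None

    def handle_update(self, gameserver: dict) -> None:
        # Meant to be fed every GameServer update received from the watch.
        state = gameserver.get("status", {}).get("state")
        if state == "Allocated" and self._last_state != "Allocated":
            self._on_allocated(gameserver)
        self._last_state = state
```

Tracking the previous state matters because a watch can deliver several updates while the server stays Allocated (for example when a label changes), and the session setup should run only once.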

> how can we guarantee that the game server is actually ("really") ready to accept connections?

We recommend setting a label/annotation through the SDK, which can then be viewed/watched externally through the Kubernetes API. This lets you set whatever states are relevant for your game, without us having to build a "one solution fits all" that may not fit everyone's needs.

Does that answer your question?

@markmandel markmandel added the question I have a question! label Nov 5, 2018
@fdasoghe
Author

fdasoghe commented Nov 5, 2018

First of all, thank you for your answers.
Knowing that there isn't something that I'm missing is already important.

I do understand that pursuing "one solution fits all" is very dangerous, even more so in such a broad and general context. But, as an implementer and Agones client, I feel there's something missing (I admit I'm still not sure whether it's specific to my use case).

I don't necessarily mean that the GameServer should have yet another state before "allocated". But I think the "ready" state handles the startup of a new GameServer well, whereas the "allocated" state doesn't do the same for real client connections. The point is that in virtually every real-world scenario the GameServer needs some data about the incoming game session, and it must process that data somehow before it can actually accept connections (but the data is not available earlier).

To sum it up: Agones manages the ready state and provides the GameManager with a fresh GameServer via the FleetAllocation. But it gives no support for the GameServer allocation process: it returns from the FleetAllocation creation with the GS details (ip:port) and all the rest is up to the GameManager/GameServer. For example, to apply what you've recommended, the GameManager should watch the GameServer for a proper label/annotation and return the final ip:port to the client for the actual game connection only when the GS reaches the right sub-state.
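
The GameManager side of that flow could look roughly like the sketch below. It assumes the events are dicts shaped like Kubernetes watch events (a `type` plus the full `object`), and that the game server sets a hypothetical `example.com/session-state` annotation to `started` once it can accept players; neither name comes from Agones.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical annotation the game server sets once it can accept players.
STATE_KEY = "example.com/session-state"
STARTED = "started"


def endpoint_when_started(events: Iterable[dict]) -> Optional[Tuple[str, int]]:
    """Scan watch events; return (address, port) from the first one marked started."""
    for event in events:
        gs = event.get("object", {})
        annotations = gs.get("metadata", {}).get("annotations") or {}
        if annotations.get(STATE_KEY) != STARTED:
            continue  # not in the desired sub-state yet, keep waiting
        status = gs.get("status", {})
        ports = status.get("ports") or []
        if status.get("address") and ports:
            return status["address"], ports[0]["port"]
    return None  # stream ended before the game server reported it had started
```

In practice the caller would also want its own deadline, so that a game server that never reaches the sub-state does not leave a player waiting indefinitely.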

Having said this, I have to add that it would be very straightforward to have another SDK endpoint to call (let's name it started()) through which the GS confirms that the allocation process is done. This would not let me avoid watching for the allocation event (that appears unavoidable to me) but it would offer a complete solution for the GameManager: Agones would wait for this second call before returning the GS details to it, guaranteeing the GS is "really ready" to accept client connections.

For cases where this is not necessary and the GameServer is always able to accept connections once it is in the ready state, two solutions come to mind:

  1. the GameServer calls sdk.ready() and sdk.started() right afterwards
  2. the GameServer calls sdk.ready(true), where the parameter means "alsoStarted"

When Agones receives a new FleetAllocation, it looks for a started GS and, if there is one, provides it (as currently happens). No need to wait for anything else.
If there are no started GSs, it looks for a ready GS, allocates it, then waits for the started() call. After the started() call, it returns the GS to the GameManager, as currently happens.
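
To make the proposal concrete, here is a small sketch of that selection rule. It models the suggestion only, not how Agones behaves: the "Started" state and both function names are hypothetical, and each game server is reduced to a dict with `name` and `state` keys.

```python
from typing import List, Optional


def pick_for_allocation(gameservers: List[dict]) -> Optional[dict]:
    """Prefer a 'Started' server; otherwise fall back to a 'Ready' one."""
    for wanted in ("Started", "Ready"):
        for gs in gameservers:
            if gs["state"] == wanted:
                return gs
    return None  # nothing available to allocate


def needs_started_call(gs: dict) -> bool:
    # Under the proposal, a 'Ready' pick is held back until started() arrives.
    return gs["state"] != "Started"
```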

The label/annotation setting and watching can happen as usual. I had always thought they were meant for an already running game session: what I discuss here is part of the startup process.

I'd like to hear your thoughts on this.

PS: the 2nd solution about the reworked sdk.ready() method could be ready(boolean requiresStartup=false), so the current implementations could keep working correctly without modifications.

@markmandel
Collaborator

Thanks for taking the time to write out your thoughts - to provide some context, here is the previous design discussion, that also covers many of these topics:
#279

The TL;DR on your suggestion of blocking on allocation until loading (or other operations) completes is that the backing tech has a hard 30s timeout that we can't change (at least at this stage), and a failure there could cause really bad things to happen to GameServers.

If we can work out these issues, I'm definitely not against this idea (there are some other concurrency and performance considerations - it's not 100% simple, but they are solvable), but we've yet to determine a design that works.

Maybe there is a way we can make it easier to wait/watch for specific labels/annotations on a GameServer - and come at it a different way.

@markmandel
Collaborator

It may be worth noting (and this may be something we need to document better -- I was thinking about this this morning) that you can watch a specific GameServer for changes through the K8s API without (I think) too much difficulty:

For example, watching for changes on a GameServer named simple-udp:

root@d7983a0bc99a:/go/src/agones.dev/agones# curl http://localhost:8001/apis/stable.agones.dev/v1alpha1/watch/namespaces/default/gameservers/simple-udp
{"type":"ADDED","object":{"apiVersion":"stable.agones.dev/v1alpha1","kind":"GameServer","metadata":{"annotations":{"kubectl.kubernetes.io/last-applied-configuration":"{\"apiVersion\":\"stable.agones.dev/v1alpha1\",\"kind\":\"GameServer\",\"metadata\":{\"annotations\":{},\"name\":\"simple-udp\",\"namespace\":\"default\"},\"spec\":{\"ports\":[{\"containerPort\":7654,\"name\":\"default\",\"portPolicy\":\"dynamic\"}],\"template\":{\"spec\":{\"containers\":[{\"image\":\"gcr.io/agones-images/udp-server:0.4\",\"name\":\"simple-udp\"}]}}}}\n"},"clusterName":"","creationTimestamp":"2018-11-06T00:35:55Z","finalizers":["stable.agones.dev"],"generation":1,"name":"simple-udp","namespace":"default","resourceVersion":"3353705","selfLink":"/apis/stable.agones.dev/v1alpha1/namespaces/default/gameservers/simple-udp","uid":"ec13784a-e15b-11e8-948e-42010a8a00e6"},"spec":{"container":"simple-udp","health":{"failureThreshold":3,"initialDelaySeconds":5,"periodSeconds":5},"ports":[{"containerPort":7654,"hostPort":7755,"name":"default","portPolicy":"dynamic","protocol":"UDP"}],"scheduling":"Packed","template":{"metadata":{"creationTimestamp":null},"spec":{"containers":[{"image":"gcr.io/agones-images/udp-server:0.4","name":"simple-udp","resources":{}}]}}},"status":{"address":"35.203.182.203","nodeName":"gke-test-cluster-default-2f036472-9wrr","ports":[{"name":"default","port":7755}],"state":"Ready"}}}
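
The same watch can be consumed from code. The sketch below assumes the endpoint streams one JSON event per line, as the curl output above suggests, and that the API is reachable at the same unauthenticated URL curl used; `watch_states` and `watch_gameserver` are hypothetical helper names.

```python
import json
import urllib.request
from typing import Iterable, Iterator, Tuple


def watch_states(lines: Iterable[bytes]) -> Iterator[Tuple[str, str]]:
    """Yield (event type, GameServer state) for each JSON line of a watch stream."""
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # skip blank lines
        event = json.loads(raw)
        state = event["object"].get("status", {}).get("state", "")
        yield event["type"], state


def watch_gameserver(url: str) -> Iterator[Tuple[str, str]]:
    # The HTTP response is iterated line by line as events arrive.
    with urllib.request.urlopen(url) as response:
        yield from watch_states(response)
```

Calling `watch_gameserver` with the URL from the curl command would yield pairs such as `("ADDED", "Ready")` for the event shown above, and further pairs as the GameServer changes.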

@fdasoghe
Author

fdasoghe commented Nov 6, 2018

I hadn't seen that issue; it is indeed very much on topic. The 30-second timeout is a real problem here, and I understand the design choices that were made.

Thank you again for the explanation, it was very instructive.

Keep up the good work!
Fabio

@markmandel
Collaborator

Are you happy with me closing this issue?

@fdasoghe
Author

fdasoghe commented Nov 8, 2018

Oh, yes, sorry, I was unsure how to close it since it was marked as a question. Yes, please, that's fine with me (and I've implemented the watch method, it works like a charm ^_^).

@markmandel
Collaborator

markmandel commented Nov 8, 2018

> and I've implemented the watch method, it works like a charm ^_^

☝️ awesome!

Closing!

As an aside - if you feel there are some places our docs could be better to talk about this - maybe an enhancement to a guide or whatever, PRs are definitely welcome!

@fdasoghe
Author

fdasoghe commented Nov 9, 2018

> As an aside - if you feel there are some places our docs could be better to talk about this - maybe an enhancement to a guide or whatever, PRs are definitely welcome!

I'll think about it, good idea.
