Create canonical auction ids in api / How to handle pod auto scaling? #175
Comments
Another solution could be to have all externally reachable endpoints run by auto scaling pods while an internal worker pod performs the inherently unparallelizable tasks. For example, the worker would update the auction id and store the current active auction etc. in the database, from which the externally reachable pods then read it. Likewise, it would store solution competition information in the database so the externally reachable pods can serve it. This has the advantage that no matter how bad a DDoS is, we will still be able to run these tasks because they are separate from all external requests.

Thinking more about this, what I'm describing is basically the driver but with DB access. Previously we thought of the driver as something anyone could run themselves (coming from gp-v1), but here it has evolved into more of a backend component since it is responsible for querying the external solvers and submitting the transactions. (This nicely fits with one idea we had for implementing the visible solution competition: have the driver upload its current competition state to the api, with the "uploading" being shared DB access.) I would call this new type of driver the api-driver.

To easily run the system locally, the api binary can run the api-driver too (think of it as a Rust async fn), but on Kubernetes they would be separate deployments where the api scales and the api-driver is always a single pod that is not externally reachable. (This is a very promising idea to me and was a big realization just now, so I hope I'm explaining it right.)
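A minimal sketch of what "the api binary can run the api-driver too" could look like, assuming a tokio runtime; `serve_api`, `run_api_driver`, and the `--enable-api-driver` flag are made-up names for illustration, not existing code:

```rust
use std::time::Duration;

async fn serve_api() {
    // In the real binary this would bind the HTTP server and serve the
    // orderbook / price estimation routes; here it just runs forever.
    std::future::pending::<()>().await;
}

async fn run_api_driver() {
    // Single-instance work: create the canonical auction, run the solution
    // competition, write the results to the database.
    loop {
        // drive one auction ...
        tokio::time::sleep(Duration::from_secs(10)).await;
    }
}

#[tokio::main]
async fn main() {
    let run_driver = std::env::args().any(|arg| arg == "--enable-api-driver");

    // Locally one process runs both. On Kubernetes the api deployment runs
    // without the flag (and auto scales) while a separate single-replica
    // deployment, not externally reachable, runs with it.
    if run_driver {
        tokio::spawn(run_api_driver());
    }
    serve_api().await;
}
```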
The solution proposed in the comment above is a neat idea. We could decouple the backend "driver" services needed for running auctions, the settlement competition, and database maintenance from the pods that use the database only for adding orders and reading & serving data. This would mean we could keep operating for users that have already submitted orders (although, under DDoS, it might still be possible to prevent users from placing orders by DDoS-ing the price estimation and order placement endpoints that the FE requires). Personally, I am in favour of this change.
One question about the "comment solution" - how do you envision external solvers that run off-premise participating? Would they query a …
I'm not sure. With the current model the api-driver would be the one polling them, so that keeps working as is.
One problem I'm thinking about now is how DB migrations will work. Currently we have the init container set up and there is only one deployment that uses the DB. If there are multiple deployments then we could give both of them the init container, but if one runs first it could break the other deployment. It is not a huge problem as all the containers should be auto deployed at roughly the same time, but it still doesn't feel nice.
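If both deployments carry the init container, one way to at least keep their migration runs from racing each other would be to serialize the migration step behind a Postgres advisory lock, so whichever init container starts first applies the migrations and the second one only sees already-applied migrations. A sketch, assuming the migrations are driven from Rust with sqlx and using an arbitrary lock key:

```rust
use sqlx::{Connection, PgConnection};

async fn migrate(database_url: &str) -> Result<(), Box<dyn std::error::Error>> {
    let mut conn = PgConnection::connect(database_url).await?;

    // Only one session can hold this lock; every other init container blocks
    // here until the first one has finished the migrations.
    sqlx::query("SELECT pg_advisory_lock($1)")
        .bind(0x5afe_i64)
        .execute(&mut conn)
        .await?;

    // Hypothetical migrations directory; migrations that were already applied
    // by another init container are a no-op.
    sqlx::migrate!("./migrations").run(&mut conn).await?;

    sqlx::query("SELECT pg_advisory_unlock($1)")
        .bind(0x5afe_i64)
        .execute(&mut conn)
        .await?;
    Ok(())
}
```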
I guess one solution could be to only give the "internal worker pod" (as you called it) write access to the DB and make everybody else push their updates through its API?
If everything had to go through that pod it would defeat the point of scaling.
How many writes to the database do you expect from the new component? I thought it would be mostly read-heavy.
This has been implemented.
Currently the driver implicitly creates batches by fetching the auction from the api at some interval. The auction in the api is updated on demand. We want to move this into the api so that we can eventually have the whole solution competition there too (#127).
The canonical auction id and the competition are both operations that don't make sense to run in multiple pods like we would with our current auto scaling configuration. We need to figure out how we want this to work. My solution, which I already discussed with Nic a bit, is to have some routes that are auto scalable and some that aren't. The only route we have now that wouldn't auto scale is the auction route, and in the future some routes related to the solution competition like "get current competition" and "get winner".
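For the canonical auction id itself, letting the database hand out the id keeps it unique and monotonically increasing no matter which pod created it. A sketch, assuming sqlx and a hypothetical `auctions` table with a `bigserial` id and a text column for the auction body; only the non-scaling part of the api would call this:

```rust
use sqlx::PgPool;

// Store a new auction and return the database-assigned canonical id.
async fn create_auction(pool: &PgPool, auction_json: &str) -> sqlx::Result<i64> {
    let (id,): (i64,) =
        sqlx::query_as("INSERT INTO auctions (json) VALUES ($1) RETURNING id")
            .bind(auction_json)
            .fetch_one(pool)
            .await?;
    Ok(id)
}
```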
How should this work on a Kubernetes level? As a temporary solution we can set the max replicas to 1. I think this is fine because even a single pod scales quite well, since most of our work consists of forwarding requests to other servers. Long term it would be nice to keep the scaling for the other routes.
We can probably achieve this with a Kubernetes / nginx config that picks the target deployment based on the route, so that get_price_estimate goes to a scaling deployment while get_auction goes to a non-scaling deployment of the same api container that we currently use. All the pods would technically be running all apis, but external requests would only reach one of the deployments based on the path.
In addition (alternatively?) we could create different containers or command line switches that configure an api pod with which operations it should handle (scalable, non scalable, both). I don't think there is a need for this now, but it might be useful in the future to disable something like the auto updating native price cache if it is only needed for some parts of the api.
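A sketch of what such command line switches could look like, assuming the clap derive API; the flag names are invented for illustration:

```rust
use clap::Parser;

#[derive(Parser, Debug)]
struct Arguments {
    /// Serve the auto scalable routes (orders, price estimation, ...).
    #[arg(long)]
    scalable_routes: bool,

    /// Serve the single-instance routes (auction, solution competition) and
    /// run the background tasks that go with them.
    #[arg(long)]
    single_instance_routes: bool,
}

fn main() {
    let args = Arguments::parse();
    // The scaling deployment would run with --scalable-routes, the
    // single-replica deployment with --single-instance-routes, and a local
    // developer setup with both.
    println!("{args:?}");
}
```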