Slow Starting Kernels Proposal #592
Comments
For now the response to POST /api/sessions, see https://petstore.swagger.io/?url=https://raw.githubusercontent.com/jupyter/jupyter_server/master/jupyter_server/services/api/api.yaml#/sessions/post_api_sessions, is e.g.
Is the intent to add an additional
Thanks for the suggestion about state machines, but I agree that we should make this change as small as possible for consumers. Our thought yesterday was that the kernel should still be thought of as "starting" from the point of view of the REST client; we're extending the "starting" phase to include starting the process.
Following up from a suggestion @vidartf had during the meeting, there is in fact a
We could make this an opt-in behavior at the level of When the opt-in behavior is selected, we do not wait for the future in
I opened jupyter/jupyter_client#712 to explore the ideas from the previous comment.
This is a great idea! We worked with @Carreau to tackle slow kernels from another angle. For remote kernels, scheduling is a somewhat large fixed cost (which this proposal seems like it'll bring down). Restarting a remote kernel need not be slow, since a user typically means "restart my kernel" and not "reschedule me". With https://github.com/Carreau/inplace_restarter, you can run a restart magic to just restart the kernel, which ends up being very fast for this use case. If others like this idea, maybe this can be included as one way to help tackle the slow starting kernel problem for remote kernels. I understand if you feel this is far enough from the rest of this issue to warrant a separate discussion.
Interesting! Yes, I think that warrants its own discussion. There's also @echarles's recent efforts in https://github.com/datalayer/jupyterpool.
The jupyterpool effort came from my frustration as a user at waiting 30s (sometimes more) to get an up-and-running Spark on Hadoop (big data) kernel. BTW, things are much better nowadays in that specific area with faster Spark kernels, but if you extrapolate a bit, you can say: hey, I want a kernel preloaded with that 30TB dataset in a ready-to-use dataframe within a second. Having ready-to-be-used Jupyter kernels to which a user/notebook can bind is something I am working on, and is part of making the server more microservice-like, where the security, the code content, the kernel, the datasets... are separate concerns. Having such a pool of kernels can be simple for Python kernels, but raises interesting questions in terms of user impersonation when you want to bind user foo to a running kernel and assign that kernel the permissions of user foo (thinking of e.g. a pod running on a Kubernetes cluster). To wrap up, I think this specific issue is a great quick win to build a better user interaction (say you show a message to the user like "Your kernel is starting, we'll keep you updated"), but it is just the very first step on a long road that we need to discuss and address in many other issues and PRs.
Problem
Jupyter Notebook was originally built with the assumption that kernels would start quickly. This turns out
to not be true for some local kernels and most remote kernels.
Proposed Solution
We previously proposed changing the REST API to reflect kernels/sessions that were "pending". The downside to a REST API change is that the server would need to advertise capability through a versioned API or some other status, and clients would need to be updated to accommodate the changes.
An alternative method is to leave the current REST APIs intact and instead introduce the concept of a "pending" kernel that
acts like a regular kernel from the client's perspective.
A POST to /api/sessions or /api/kernels would create a "pending" kernel and return immediately before starting the kernel.
It remains to be seen during implementation what changes need to be made to handlers and managers, but at the very least we will use a scheduled callback to actually start the kernels when we are handling the POST.
The MappingKernelManager will also need to be updated to handle pending kernels internally in its public methods.
We should use the kernel manager to get the kernel id.
We need to think about how a kernel's failure to start is handled for the user. Previously, it could be given to the user in the response to a POST.
We might even be able to add the pending logic to the handlers without needing to affect the managers (e.g. by calling save_state on the managers directly).
We might also want to address slow-stopping kernels as part of these changes.
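To make the "pending" idea above concrete, here is a minimal sketch using only asyncio. It is not the jupyter_server or jupyter_client API: PendingKernelManager, _launch, execution_state, wait_ready and the start_delay/fail parameters are hypothetical names invented for illustration, and the "dead" state for a failed start is just one possible choice. It shows a start call that hands back a kernel id right away, a scheduled task that does the slow launch, a client-visible state that stays "starting" while the kernel is pending, and a startup failure that surfaces later rather than in the POST response.

```python
import asyncio
import uuid


class PendingKernelManager:
    """Toy stand-in for a multi-kernel manager (hypothetical, for illustration)."""

    def __init__(self, start_delay=0.05, fail=False):
        self._start_delay = start_delay  # simulated slow startup, in seconds
        self._fail = fail                # force a startup failure for the demo
        self._ready = {}                 # kernel_id -> asyncio.Task running the launch

    async def _launch(self, kernel_id):
        # Placeholder for the slow part: spawning a process, scheduling a pod, etc.
        await asyncio.sleep(self._start_delay)
        if self._fail:
            raise RuntimeError(f"kernel {kernel_id} failed to start")

    def start_kernel(self):
        """Return a kernel id immediately; the launch runs as a scheduled task."""
        kernel_id = str(uuid.uuid4())
        loop = asyncio.get_running_loop()
        self._ready[kernel_id] = loop.create_task(self._launch(kernel_id))
        return kernel_id

    def execution_state(self, kernel_id):
        """Report 'starting' while pending, so existing clients see nothing new."""
        task = self._ready[kernel_id]
        if not task.done():
            return "starting"
        return "dead" if task.exception() else "idle"

    async def wait_ready(self, kernel_id):
        """Await startup; re-raises the launch error, if there was one."""
        await self._ready[kernel_id]


async def demo(fail=False):
    manager = PendingKernelManager(fail=fail)
    kernel_id = manager.start_kernel()  # what a POST handler would call
    before = manager.execution_state(kernel_id)
    try:
        await manager.wait_ready(kernel_id)
    except RuntimeError:
        pass  # the failure is reported after the POST has already returned
    after = manager.execution_state(kernel_id)
    return before, after
```

Running asyncio.run(demo()) returns the state seen right after the start call and the state once the launch has finished. The public methods that touch a kernel (here only execution_state and wait_ready) are the ones that need to know about the pending task, which is the part the MappingKernelManager would have to handle internally.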