-
Notifications
You must be signed in to change notification settings - Fork 5.5k
Add cull_busy, cull_connected options #2498
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Additional logic to support culling or skipping kernels that are busy or that have connections.
|
What a nice self cleaning notebook server. |
|
✨ 📓 ✨ |
|
@parente - thanks for adding this. I'm pretty sure I understand the use case for the cull-busy flag (to allow a single cell's execution to exceed the idle period - correct?) but I guess I'm not following the cull-connected case. I figured the kernel was always in a connected state. Seems like this case would be covered by the prior culling feature - in conjunction with cull-busy. Could you please clarify the use cases for these flags? Just trying to minimize knobs here. Thanks! |
|
The connection count tracked by the API is actually how many clients are connected to the kernel via the notebook API. My thinking is that in some situations, you may not want to cull a kernel that has a browser tab open pointing at the associated notebook, even if the kernel is idle (e.g., kiosk mode). In others, you may want to cull regardless of whether there are connected clients or not (e.g., users at work forget they have tabs open). At work, I plan to configure our JupyterHub instances to |
|
Thank you for the great explanation. In light of this, I'm wondering if |
|
I was thinking the same thing about the new |
|
Thanks for the review @rgbkrk and @kevin-bates. I'm not up to speed on how we're merging for 5.1 (or not) in the notebook repo. Label it 5.1 and wait for a merge? |
|
@parente yes, that's pretty much it. And if it goes a few days without further review, remind people about it. If you're confident enough in a change and you have merge permissions, you can also ask for more comments and announce that you'll merge it yourself in a couple of days unless people ask you to wait longer. |
|
|
||
| cull_connected = Bool(True, config=True, | ||
| help="""Whether to consider culling kernels which have one or more connections. | ||
| Only effective if cull_interval is not 0.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean cull_idle_timeout in both of these messages? Looking at the code, setting cull_interval to 0 produces a warning and resets it to 300.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch - yes, the descriptions of each should reference cull_idle_timeout.
|
Thanks for the process heads-up. I've fixed the command line help. I'll leave the PR open for a couple days like you suggest and then merge it. |
minrk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, thanks! I'd have a slight inclination to default both of the flags to False, so that the default is only that truly idle kernels are culled (no connections and no executions). And each more aggressive knob is opt-in.
| help="""The interval (in seconds) on which to check for idle kernels exceeding the cull timeout value.""" | ||
| ) | ||
|
|
||
| cull_connected = Bool(True, config=True, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe default to the most conservative False for both of these?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@minrk I understand taking a conservative approach but I believe that a majority of administrators that enable culling in the first place (the feature is off by default) would want connected kernels culled - provided they've exceeded the idle duration and are not busy for that period. That said, I suppose administrators will review the four culling parameters when enabling, but still feel cull_connected = True is the appropriate default - IMHO.
|
Not executing code but with active users still looking at it doesn't seem idle to me. Killing a kernel for a currently open and active page seems pretty aggressive for a default. |
|
Given that the idle period will typically be on the order of multiple hours I seriously doubt users will still be looking at the page having not done anything during that period. However, since we don't default the idle time (only recommend) and given that admins may choose a timeout value in line with a human's window of patience, your example makes sense. |
Additional logic to support culling or skipping kernels that are busy or that have connections.
Follow-on to #2215.
/cc @kevin-bates