-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Updated spawner logging to improve debugging and traceability. #8
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good @arthurian 👍
def log_user(self, message='', level=logging.INFO): | ||
user = self.user.name if self.user else None | ||
log_message = "[user:%s] %s" % (user, message) | ||
self.log.log(level, log_message) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What an excellent, simple helper.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very handy, indeed!
return None #its up! | ||
else: | ||
self.log.debug("Poll, notebook is not running for user %s" % self.user.name) | ||
self.log_user("poll: notebook is NOT running") | ||
return "server up, no instance running for user %s" % self.user.name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @arthurian, I just noticed a possible typo here. It should probably say "Server up, no notebook running for user ..." instead of "Server up, no instance running for user...". This was there from before, not introduced by you. Just pointing it out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think you're right about that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Works great, @arthurian. I have successfully tested it on the miniconda sandbox cluster.
Changes include increasing the poll interval and increasing the number of attempts the is_notebook_running() function will be called with from within the poll() function. This is meant to reduce the chances of the poll() function wrongly determining that the jupyterhub-singleuser process is not running on a worker instance when it actually is.
This PR updates
spawner.py
. The changes are related to logging and checking whether a notebook is running.When issues occur, it can be difficult to determine the cause since logging is a bit ad-hoc and has been added over time. The goal is to streamline the logging in such a way that it is easier to trace/debug spawner activity on a per user basis and make it possible to correlate the data with other metrics.
Streamlining logging
self.log_user()
to the spawner, which automatically logs the user ID. By default, this will log anINFO
message unless otherwise specified.poll()
function.Checking if a worker notebook is running on the server:
This change is slightly speculative, but given that the manager must SSH into every worker instance to verify that jupyter is running, the
is_notebook_running()
check should be as efficient as possible.Currently, we run
ps -ef | grep jupyterhub-singleuser
which spawns 2 processes joined by a pipeline. That should work fine 99.999% of the time, but since we're seeing high loads when users are training their models, it might be worth switching tonice -5 pgrep -a -f jupyterhub-singleuser
. This should be functionally equivalent, except that it executes one process with a slightly higher priority.@joshuagetega @dodget