Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for JSON formatted logs #5799

Merged
merged 13 commits into from
Oct 14, 2020

Commits on Oct 7, 2020

  1. Add support for JSON formatted logs

    This adds a new extra package dependency on `json-logging`
    and an environment variable, which when set to "true" regardless
    of case, will try to use the json-logging non-web app log formatter.
    
    If the `json-logging` package is not installed but the environment
    variable is True, something like this will be logged but it will not
    crash the application:
    
    ```
    $ ENABLE_JSON_LOGGING=true jupyter notebook
    Unable to use json logging due to missing packages. Run "pip install notebook[json-logging]" to fix.
    Traceback (most recent call last):
      File "/home/osboxes/jupyter/notebook/notebook/notebookapp.py", line 144, in <module>
        import json_logging
    ModuleNotFoundError: No module named 'json_logging'
    ```
    
    Initially I tried to add a new Bool config option to toggle this but
    the problem is (from my limited understanding of traitlets and tornado)
    is that `_log_formatter_cls` needs to be set early and trying to se the
    log formatter later in `init_logging` is too late - or at least I couldn't
    figure out a way to reset the log formatter dynamically (I tried calling
    `_log_format_changed` [1] but that didn't work).
    
    With this working you get logs like this:
    
    ```
    {"written_at": "2020-10-07T21:10:51.606Z", "written_ts": 1602105051606265000, "msg": "404 GET /nbextensions/widgets/notebook/js/extension.js (127.0.0.1) 9.26ms referer=http://localhost:8888/notebooks/Untitled.ipynb", "type": "log", "logger": "NotebookApp", "thread": "MainThread", "level": "WARNING", "module": "log", "line_no": 49}
    {"written_at": "2020-10-07T21:10:54.443Z", "written_ts": 1602105054443309000, "msg": "Starting buffering for f260ddd7-938c-42d0-ac3b-455bea76694f:49f30b53fc4b4ec6a8f2fb748a171613", "type": "log", "logger": "NotebookApp", "thread": "MainThread", "level": "INFO", "module": "kernelmanager", "line_no": 222}
    {"written_at": "2020-10-07T21:10:54.446Z", "written_ts": 1602105054446264000, "msg": "Kernel shutdown: f260ddd7-938c-42d0-ac3b-455bea76694f", "type": "log", "logger": "NotebookApp", "thread": "MainThread", "level": "INFO", "module": "multikernelmanager", "line_no": 201}
    ```
    
    An obvious improvement here would to be able to split the `msg` fields apart
    so that we can log things like response status code, request method, request
    URL, response_time_ms etc. That should be possible with the `JSONLogWebFormatter`
    from `json-logging` but when I tried using that I was getting errors from
    the library about a correlation id (which is based on a request header we
    don't use). The `json-logging` library supports several web frameworks like
    Flask but unfortunately does not have built in support for Tornado, but it does
    support custom formatters so that might be a follow up option to improve on this.
    
    [1] https://github.com/ipython/traitlets/blob/4.3.3/traitlets/config/application.py#L195
    
    Closes jupyter#5798
    mriedem committed Oct 7, 2020
    Configuration menu
    Copy the full SHA
    1929fb5 View commit details
    Browse the repository at this point in the history
  2. Check for json log formatting in log_request

    This gets us closer to logging parts of the request
    as separate json fields but not all the way there.
    
    For example:
    
    ```json
    {"written_at": "2020-10-07T21:52:16.975Z", "written_ts": 1602107536975497000, "msg": "{'status': 404, 'method': 'GET', 'ip': '127.0.0.1', 'uri': '/nbextensions/widgets/notebook/js/extension.js', 'request_time': 1.9807815551757812, 'referer': 'http://localhost:8888/notebooks/test.ipynb'}", "type": "log", "logger": "NotebookApp", "thread": "MainThread", "level": "WARNING", "module": "log", "line_no": 54}
    ```
    mriedem committed Oct 7, 2020
    Configuration menu
    Copy the full SHA
    b617d91 View commit details
    Browse the repository at this point in the history

Commits on Oct 9, 2020

  1. Add log_json config option with default and validation handling

    This adds the `NotebookApp.log_json` CLI option which defaults to
    False. The default value comes from the `JUPYTER_ENABLE_JSON_LOGGING`
    environment variable. A validation hook is added such that if json
    logging is enabled but the `json-logging` package is not installed,
    an error is logged and the effective `log_json` value is False.
    
    If enabled and the `json-logging` package is installed, the `init_logging`
    method will initialize the non-web [1] json logging formatter.
    
    [1] https://github.com/bobbui/json-logging-python#21-non-web-application-log
    mriedem committed Oct 9, 2020
    Configuration menu
    Copy the full SHA
    ca95490 View commit details
    Browse the repository at this point in the history
  2. Pass NotebookApp logger to log_request

    In order to use the same logger from the notebook app we
    need to pass it through to the log_request function. To
    do that we need to use a partial function so the `log` kwarg
    is available. If not provided the default logger is the same
    as before, the tornado web access logger.
    
    This is necessary for the `log_json` settings to be used for
    both the notebook app logs and the web app request logging.
    
    There is still a todo needed here so that the individual
    request fields get formatted separately rather than dumped
    into the `msg` field.
    mriedem committed Oct 9, 2020
    Configuration menu
    Copy the full SHA
    6a74393 View commit details
    Browse the repository at this point in the history
  3. Cleanup for log_json review

    mriedem committed Oct 9, 2020
    Configuration menu
    Copy the full SHA
    3e32ce5 View commit details
    Browse the repository at this point in the history
  4. Log request properties separately when log_json=True

    This updates the log_request method to get the request
    properties logged separately if `log_json=True`. This
    ends up looking like:
    
    ```
    {"written_at": "2020-10-09T20:59:52.044Z", "written_ts": 1602277192044732000, "msg": "", "type": "log", "logger": "NotebookApp", "thread": "MainThread", "level": "WARNING", "module": "log", "line_no": 50, "status": 404, "method": "GET", "ip": "127.0.0.1", "uri": "/api/kernels/a6dd447b-282a-4e1c-ab65-eb340ad12965/channels?session_id=472907c27ed94e2cbfbae86501f7d159", "request_time": 8.63, "referer": "None"}
    ```
    mriedem committed Oct 9, 2020
    Configuration menu
    Copy the full SHA
    47e06b0 View commit details
    Browse the repository at this point in the history

Commits on Oct 12, 2020

  1. Address review comments

    mriedem committed Oct 12, 2020
    Configuration menu
    Copy the full SHA
    8b06db3 View commit details
    Browse the repository at this point in the history
  2. Fix logging in _validate_log_json

    Just use the app logger since that has the default LogFormatter
    logging until we change it to JSON.
    mriedem committed Oct 12, 2020
    Configuration menu
    Copy the full SHA
    fac2285 View commit details
    Browse the repository at this point in the history
  3. Add some basic tests for log_json config

    Make sure the default value is handled properly based
    on environment variables. Also checks the validation
    code based on whether or not json_logging is imported.
    mriedem committed Oct 12, 2020
    Configuration menu
    Copy the full SHA
    093aee2 View commit details
    Browse the repository at this point in the history
  4. Add log_json=True test for log_request

    Basic test just to make sure log_request handles log_json=True
    properly.
    mriedem committed Oct 12, 2020
    Configuration menu
    Copy the full SHA
    08f8ccd View commit details
    Browse the repository at this point in the history
  5. Configuration menu
    Copy the full SHA
    4ee4bb2 View commit details
    Browse the repository at this point in the history

Commits on Oct 13, 2020

  1. Fix NotebookAppJSONLoggingTests.test_log_json_enabled

    The intent of the test is to make sure that log_json defaults
    to True when JUPYTER_ENABLE_JSON_LOGGING is set to "true" so
    this makes the test more explicit and to pass when json-logging
    is not installed so that does not interfere due to validation.
    mriedem committed Oct 13, 2020
    Configuration menu
    Copy the full SHA
    6a883e0 View commit details
    Browse the repository at this point in the history
  2. Configuration menu
    Copy the full SHA
    67ad5a1 View commit details
    Browse the repository at this point in the history