Less logging, add a CANCEL log level #243

Merged
merged 6 commits into master from less_logging on Oct 14, 2021
Conversation

goodboy
Owner

@goodboy goodboy commented Oct 6, 2021

Main thing is adding a CANCEL log level and placing it between the traditional "INFO" and our "RUNTIME".
Most of the "warnings" were basically just cancel status messages anyway, so I think this is a nice way to clean up the noise as well as offer some more (filterable) detail for those who want to peer into the mechanics of SC over "actors" 😂

The main arg for this is that if you're building a tractor app, most of your "issues" will probably relate to cancellation and/or error propagation - so we might as well offer that layer of logging as one on its own.
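
For reference, a minimal sketch of registering such a level with the stdlib (the value 16 matches the commit message further down; the logger name and the RUNTIME placement are illustrative):

import logging

CANCEL = 16  # above our "runtime" level, below INFO (20)
logging.addLevelName(CANCEL, 'CANCEL')
logging.basicConfig(level=CANCEL)

log = logging.getLogger('tractor')  # illustrative logger name
log.log(CANCEL, 'actor nursery tearing down')  # emits at the new level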

I was also thinking maybe the level should be called CANCELLING instead? I dunno.

Please yell lurkerz.

@goodboy goodboy added the question, experiment, debugging, and cancellation labels on Oct 6, 2021
@goodboy goodboy changed the title from Less logging to Less logging, add a CANCEL log level on Oct 6, 2021
@goodboy
Owner Author

goodboy commented Oct 6, 2021

Bah.. still seeing this once in a while:

Traceback (most recent call last):
  File "/usr/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/goodboy/repos/tractor/tractor/_child.py", line 40, in <module>
    _trio_main(
  File "/home/goodboy/repos/tractor/tractor/_entry.py", line 99, in _trio_main
    trio.run(trio_main)
  File "/home/goodboy/repos/piker/3.9/lib/python3.9/site-packages/trio/_core/_run.py", line 1932, in run
    raise runner.main_task_outcome.error
  File "/home/goodboy/repos/tractor/tractor/_actor.py", line 938, in _async_main
    )
  File "/home/goodboy/repos/piker/3.9/lib/python3.9/site-packages/trio/_core/_run.py", line 815, in __aexit__
    raise combined_error_from_nursery
trio.MultiError: KeyboardInterrupt(), RuntimeError('Nursery is closed to new arrivals')

Details of embedded exception 1:

  Traceback (most recent call last):
    File "/home/goodboy/repos/tractor/tractor/_actor.py", line 936, in _async_main
      self._parent_chan,
    File "/home/goodboy/repos/piker/3.9/lib/python3.9/site-packages/trio/_core/_run.py", line 815, in __aexit__
      raise combined_error_from_nursery
    File "/home/goodboy/repos/piker/3.9/lib/python3.9/site-packages/trio/_core/_run.py", line 1183, in raise_cancel
      raise KeyboardInterrupt
  KeyboardInterrupt

Details of embedded exception 2:

  Traceback (most recent call last):
    File "/home/goodboy/repos/tractor/tractor/_actor.py", line 698, in _process_messages
      cs = await self._service_n.start(
    File "/home/goodboy/repos/piker/3.9/lib/python3.9/site-packages/trio/_core/_run.py", line 1032, in start
      raise RuntimeError("Nursery is closed to new arrivals")
  RuntimeError: Nursery is closed to new arrivals

So it looks like we may have to handle the MultiError case as well in our cancel logging around here. Speaking of which, the condition order could probably be reversed there.
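
A hedged sketch of what that handling might look like (the helper name is made up; this assumes trio's pre-ExceptionGroup MultiError API and its .exceptions attribute):

import trio

def is_multi_cancelled(exc: BaseException) -> bool:
    # hypothetical helper: treat a MultiError as "just cancellation noise"
    # only if every leaf exception is itself a cancellation
    if isinstance(exc, trio.MultiError):
        return all(is_multi_cancelled(sub) for sub in exc.exceptions)
    return isinstance(exc, trio.Cancelled)

The traceback above would still log as an error under this scheme, since its MultiError contains a KeyboardInterrupt and a RuntimeError rather than only Cancelled leaves.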

Base automatically changed from optional_msgspec_support to master October 6, 2021 21:01
The stdlib's `logging.LoggingAdapter` doesn't currently pass through
`stacklevel: int` down to its wrapped logger instance. Hack it here
and get our msgs looking like they would if using a built-in level.

Now that we're on our way to a (somewhat) serious beta release I think
it's about time to start de-noising the logging emissions. Since we're
trying out this approach of "stack layer oriented" log levels, I figured
this is a good time to move most of the "warnings" to what they should
be: cancellation monitoring status messages. The level is set to 16,
which is just above our "runtime" level but just below the traditional
"info" level. I think this will be a decent approach since, usually, if
you're confused about why your `tractor` app is behaving unlike you
expect, it's 90% of the time going to have to do with cancellation or
error propagation. With this setup a user can specify the 'cancel' level
and see all the msgs pertaining to both actor and task-in-actor
cancellation mechanics.
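
A rough sketch of the adapter hack described in the commit message above (the method name and exact frame arithmetic are assumptions; the point is just bumping stacklevel past the wrapper frame so records report the caller's file/line):

import logging

CANCEL = 16
logging.addLevelName(CANCEL, 'CANCEL')

class StackLevelAdapter(logging.LoggerAdapter):
    # hypothetical sketch: bump ``stacklevel`` so %(filename)s/%(lineno)d
    # resolve to the caller instead of this wrapper frame
    def cancel(self, msg, *args, **kwargs):
        return self.log(CANCEL, msg, *args, **kwargs)

    def log(self, level, msg, *args, **kwargs):
        if self.isEnabledFor(level):
            msg, kwargs = self.process(msg, kwargs)
            # the exact bump depends on how many wrapper layers sit
            # between the user call and the stdlib logger
            kwargs['stacklevel'] = kwargs.get('stacklevel', 1) + 1
            self.logger.log(level, msg, *args, **kwargs)

log = StackLevelAdapter(logging.getLogger('tractor'), {})
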
@parity3

parity3 commented Oct 7, 2021

Have you considered log.info('cancel-related issue', extra={'cancelling': True})? Of course you can make decorators/wrappers to handle this in one spot (cancelling(log).info('cancel-related issue')).

Then some filters can pick that up and handle it appropriately. I guess I'm saying it's not necessarily always in line with severity/priority and may need its own dimension?
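
A quick sketch of the filter side of that idea (class and logger names are made up):

import logging

class CancelTagFilter(logging.Filter):
    # hypothetical: only pass records tagged via extra={'cancelling': True}
    def filter(self, record):
        return getattr(record, 'cancelling', False)

handler = logging.StreamHandler()
handler.addFilter(CancelTagFilter())

log = logging.getLogger('tractor')
log.setLevel(logging.INFO)
log.addHandler(handler)
log.info('scope torn down', extra={'cancelling': True})  # shown
log.info('plain status msg')                             # filtered out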

@goodboy
Owner Author

goodboy commented Oct 7, 2021

Have you considered

Had no clue you could even do this (honestly I try to avoid logging pretty much 😂) but it seems like it could work.
My only question then is: what level do we stick all the messages changed in this patch under? INFO seems too high up.

In general, I'm not super stoked thus far on the "built-in" levels and feel they don't really convey much of anything in name or category. The idea of categorizing "levels" without it necessarily being an ordered hierarchy is also interesting to me; that's why I was thinking about a messages-per-stack-layer kinda thing.

Maybe the whole idea is deluded though?

I've been meaning to give structlog a serious look as well: https://github.com/hynek/structlog.

@parity3

parity3 commented Oct 7, 2021

I don't have any experience with structlog or loguru but I have written a custom formatter which outputs structured logs (via blosc/msgpack) to append-only log files, and a corresponding daemon that picks them up in aggregate and pushes them into loki.

I felt at the time that the fancy logging libs out there were either too opinionated or diverged unnecessarily from Python logging. I want as much code as possible not to know that it needs some other package to log; that would make no sense. I still don't think these libraries get that. They can't resist trying to make things "easier" than the standard lib, thereby breaking an API everyone already knows. I would have been fine with monkey-patching, but instead they went a different way and made yet another API no one should need to learn.

As for tractor, I would guess that not having any logger deps is a win for users. You can still add those extras to the log statements and document them, and then people can write their own adapters which pipe the logs into a handler/library of their choice.

I have patterns around my code that add context to a shared handler during a time-based scope. It's a bit draconian and won't really be useful in async, but if you need to trace something without control over other loggers, this will do the job:

# disregard the crappy type hint comments
import contextlib
import logging
import typing


class AttributeFilter(logging.Filter):
    def __init__(self, **attrs):
        # type: (**typing.Any) -> None
        super().__init__()
        self.attrs = attrs

    def filter(self, record):
        # stamp every record passing through with the current attrs
        record.__dict__.update(**self.attrs)
        return 1

    @classmethod
    def get_global_attribute_filter(cls):
        # NOOP if already configured
        logging.basicConfig(
            level=logging.INFO,
            format='%(asctime)s-%(levelno)s-%(lineno)d-%(message)s',
        )
        for h in logging.root.handlers:
            for f in h.filters:
                if isinstance(f, cls):
                    return f
        if not logging.root.handlers:
            raise NotImplementedError('do not know how to install with no root handlers')
        # install one shared instance on all root handlers
        f = cls()
        for h in logging.root.handlers:
            h.addFilter(f)
        return f

    @classmethod
    @contextlib.contextmanager
    def with_global_attributes(cls, **kwargs):
        # type: (**str) -> None
        f = cls.get_global_attribute_filter()
        with f.attrs_context(**kwargs):
            yield

    @contextlib.contextmanager
    def attrs_context(self, **kwargs):
        # type: (**str) -> None
        # push the new attrs for the duration of the block, then restore
        prev = self.attrs
        self.attrs = dict(prev, **kwargs)
        try:
            yield
        finally:
            self.attrs = prev

Then you'd surround your traceable code in a with-block:

with AttributeFilter.with_global_attributes(doomsday_flag="sky-is-falling"):
    ...  # everything logged in here carries the attribute

@salotz
Collaborator

salotz commented Oct 7, 2021

I used Eliot in my library and now I have an issue to rip it out. The structured logs really screw up stack traces and stepping around in debugging.

I'm very interested in your ideas @parity3. I would love a hybrid of text logging and event-log style interleaved, then piped to structured loggers, while still having the plain-Jane ones too.

In the end I just get frustrated with python for this kind of shit since some nice macros and general feature flags could make it all nice and clean.

@goodboy
Owner Author

goodboy commented Oct 7, 2021

@parity3 re: log.info('cancel-related issue', extra={'cancelling': True}) my only thing with this is that the messages changed here actually are strictly related to cancellation (on an actor level for the most part). So it's not exactly some arbitrary "info"; these are msgs that detail the cancellation (logical) layer of the runtime.

However, I really love the idea of allowing a hook where user code can use tractor's log system and we automatically show in the logs where cancellation is active at a trio task level: e.g. we toss in a new field like CANCELLING that shows users where a task is hitting log calls while in the cancelled state.
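
Something like that could hang off a logging.Filter (sketch only; the class name is made up, and it leans on trio.current_effective_deadline() reporting -inf once the surrounding scope is cancelled):

import math
import logging
import trio

class CancelStateFilter(logging.Filter):
    # hypothetical: stamp each record with whether the emitting trio task
    # currently sits inside a cancelled scope
    def filter(self, record):
        try:
            cancelled = trio.current_effective_deadline() == -math.inf
        except RuntimeError:
            cancelled = False  # not running inside a trio task
        record.cancel_state = 'CANCELLING' if cancelled else '-'
        return True

A %(cancel_state)s field in the format string would then surface it per record.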

The structured logs really screw up stack traces and stepping around in debugging.

@salotz I'm in no rush to change to something others have gripes with.

My only longer-term thought was to be able to have runtime monitoring via the log system, which seems to be what stuff like structured logging is meant for.

I don't want to get too much on a tangent in this PR since the main point was to reduce .warning() noise 😂 and provide insight to the cancellation sequence.

We can make a new issue to discuss alt logging systems / hybrids etc.

@parity3

parity3 commented Oct 8, 2021

Ok trying to get back on point here.

Forgot to mention: if you don't mind using a different Logger instance for cancelling, you can use a log name like tasks.cancelling.teardown_stage1 or similar. Library users can place filters/handlers anywhere in that hierarchy to squelch/redirect whatever they want, and it's a bit more friendly-looking/standardized than custom LogRecord attributes / extra=. So, 3 options to differentiate (a sketch of the last one follows the list):

  • Flat exclusive type (ie custom loglevel)
  • Tags-style (LogRecord attributes)
  • Log category hierarchy (log.name)
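
The dotted names below are parity3's example; everything else is illustrative:

import logging

# route cancellation msgs through a dedicated subtree of logger names
cancel_log = logging.getLogger('tasks.cancelling.teardown_stage1')

# a library user can then squelch the whole subtree in one line...
logging.getLogger('tasks.cancelling').setLevel(logging.CRITICAL)

# ...or instead redirect it to its own handler:
# logging.getLogger('tasks.cancelling').addHandler(logging.FileHandler('cancel.log'))

cancel_log.info('stage 1 teardown done')  # suppressed by the parent's level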

@goodboy
Owner Author

goodboy commented Oct 8, 2021

Ok trying to get back on point here.

😂

Still not a review of the code, but I guess I'll just go with the CI on this one 🏄🏼

Cool ideas, but I think we need some other issues to discuss.

Flat exclusive type (ie custom loglevel)
Tags-style (LogRecord attributes)
Log category hierarchy (log.name)

will def stick this list in a new issue.

@goodboy
Owner Author

goodboy commented Oct 14, 2021

No idea what's up with that one hang on msgspec, but I can't reproduce it locally so it's probably just a fluke.

When #245 lands it should all go away anyway.

@goodboy goodboy merged commit dfeebd6 into master Oct 14, 2021
@goodboy goodboy deleted the less_logging branch October 14, 2021 17:37