Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

reporters: report jobs separately #721

Merged
merged 1 commit into from
Dec 18, 2022
Merged

Conversation

ryneeverett
Copy link
Contributor

Adapted from #305.

I made some adjustments per the review comment:

2. The `separate` key is nowhere documented or put into the example config, also please rethink where we place the separate key -- should it be per-reporter or a global config?

Is this what you had in mind?

Copy link
Owner

@thp thp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks clean and simple. Added some nitpick comments.

Please also add to /CHANGELOG.md in the ## UNRELEASED section in ### Added an item like this:

- New `separate` configuration option for reporters to split reports into one-per-job (contributed by Ryne Everett)

The more I think about it, the more I feel like it should be like details and footer (configurable in the same spots).

docs/source/configuration.rst Outdated Show resolved Hide resolved
lib/urlwatch/reporters.py Outdated Show resolved Hide resolved
docs/source/configuration.rst Outdated Show resolved Hide resolved
@ryneeverett
Copy link
Contributor Author

Moving the option into the format-specific configuration sections makes sense to me given the current configuration hierarchy. Wouldn't it also make sense for it to go into html?

@thp
Copy link
Owner

thp commented Oct 3, 2022

Moving the option into the format-specific configuration sections makes sense to me given the current configuration hierarchy. Wouldn't it also make sense for it to go into html?

Yes, of course, html also makes sense.

@scottmac
Copy link
Contributor

scottmac commented Nov 8, 2022

I'm interested in this change, @ryneeverett are you ready for another review?

@ryneeverett ryneeverett force-pushed the separate-job-reports branch 2 times, most recently from 218ca3f to 27ea055 Compare November 20, 2022 02:38
@ryneeverett ryneeverett requested a review from thp November 20, 2022 02:39
Copy link
Owner

@thp thp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The __base_kind__ is a good idea, we should use that in other places too by implementing a get_base_config() (see comments) plus making sure we don't fail if a custom ReporterBase subclass doesn't have __base_kind__.

@@ -134,7 +134,11 @@ def submit_all(cls, report, job_states, duration):
if cfg['enabled']:
any_enabled = True
logger.info('Submitting with %s (%r)', name, subclass)
subclass(report, cfg, job_states, duration).submit()
if report.config['report'][subclass.__base_kind__].get('separate', False):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably won't work for fully custom reporters (out-of-tree) that just subclass from ReporterBase. Not sure if such scripts exist, but we don't currently disallow such reporters. For this reason, one option could be to use:

    base_kind = subclass.get('__base_kind__', None)
    if base_kind in report.config['report']:
        ...

I do like the __base_kind__ classification, maybe you can also use it in the rest of the code? E.g. the HTML reporter has this currently at the beginning of def _parts(self)::

cfg = self.report.config['report']['html']

This could be:

cfg = self.report.config['report'][self.__base_kind__]

..and since this is probably used in other places as well, we could have (in ReporterBase):

@classmethod
def get_base_config(cls, report):
    if not hasattr(cls, '__base_kind__'):
        return {}
    return report.config['report'].get(cls.__base_kind__, {})

Then the first line in HtmlReporter._parts() becomes:

cfg = self.get_base_config(self.report)

And the line this comment is attached to becomes:

if subclass.get_base_config(report).get('separate', False):

(all of this is untested, but assumed working)

Please also check if there are other occurrences where the string of __base_kind__ is used currently an could be replaced with either self.get_base_config() or self.__base_kind__ (doing a hasattr() check first if the check is in a base class).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not necessarily opposed to going this route but this is the point where I would ask -- if we're going to allow __base_kind__ to permeate the code, would it be better to enforce it's existence by throwing an error? (Perhaps with an abstract base class?) If we want to maintain backwards compatibility with unknown third party reporters, perhaps at least a warning that it ought to be defined and may be an error in a future release? Even if returning an empty dictionary works fine now, I don't think there's any way to guarantee it will continue working for all usages in the future and it strikes me as likely to lead to confusing behavior.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "cleanest" way would probably just to look at the MRO (method resolution order) for a class, this way finding the "base" kind automatically, and making it possible to get a "merged" dict of all config options (subclass configs overriding non-subclass configs).

The method resolution order can be found as __mro__ on the class object.

Like this:

% PYTHONPATH=lib python3
Python 3.10.8 (main, Oct 13 2022, 09:48:40) [Clang 14.0.0 (clang-1400.0.29.102)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from urlwatch.reporters import ShellReporter
>>> ShellReporter.__mro__
(<class 'urlwatch.reporters.ShellReporter'>, <class 'urlwatch.reporters.TextReporter'>, <class 'urlwatch.reporters.ReporterBase'>, <class 'object'>)
>>> 

Then when using just __kind__ on the base class:

diff --git a/lib/urlwatch/reporters.py b/lib/urlwatch/reporters.py
index 39150bc..46456f7 100644
--- a/lib/urlwatch/reporters.py
+++ b/lib/urlwatch/reporters.py
@@ -262,6 +262,8 @@ class HtmlReporter(ReporterBase):
 
 
 class TextReporter(ReporterBase):
+    __kind__ = 'text'
+
     def submit(self):
         cfg = self.report.config['report']['text']
         line_length = cfg['line_length']

You can get all "kinds" like this:

>>> from urlwatch.reporters import ShellReporter
>>> [getattr(cls, '__kind__', None) for cls in ShellReporter.__mro__]
['shell', 'text', None, None]

(note that the first None there is from ReporterBase, and the second None is from object)

Getting an "inherited config" could look like this:

>>> config = {}
>>> for key in reversed([getattr(cls, '__kind__', None) for cls in ShellReporter.__mro__]):
...     if key is not None:
...         config.update(self.report.config['report'][key])
... 
# here, you should be able to use "config"

I know this is a mouthful, but that's probably the nicest way to implement this.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(in theory we could also give ReporterBase a __kind__ of some kind (sic), e.g. all and then just make sure that the default config has all: separate: false by default, and then it would even magically work to:

  1. Set separate: true for all, but not set it for anything else -> separate reports for everything
  2. Set separate: true for all, and separate: false for text -> separate reports for all but text-based ones
  3. Set separate: false for all, separate: truefortext, separate: falseforshell` -> separate reports only for text-based ones, except shell

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the all configuration idea is a good one but I think improving the "Reporter" section of the Configuration docs is a higher priority at the moment. The docs hint that there is a configuration hierarchy but you have to read reporters.py to find out what it is. I say "hint" because as an end user approaching the docs it isn't clear that the reporters are even python classes. Seems like at the minimum we could provide a hyperlink to reporters.py on github but better would be to just outline the hierarchy in the docs. Ideally the outline would be generated by an RST macro but that might not be worth the effort.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another thing I think would be really helpful in general is configuration validation. For example, I just found out the hard way that reporters' "enabled" key is required and if you omit it you get a KeyError traceback. Pydantic is my preferred solution for such problems because it encourages you to extract out any implementation details of the configuration and also makes documentation easier because all the information is in one place.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free to do the validation thing as a separate PR. This PR is nearly ready (see the comment I just posted), once that is fixed, I think this can be merged -- what do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah, I wouldn't want to increase the scope of this PR.

lib/urlwatch/reporters.py Outdated Show resolved Hide resolved
lib/urlwatch/reporters.py Outdated Show resolved Hide resolved
@thp
Copy link
Owner

thp commented Dec 12, 2022

Please rebase against master so that the CI tests can run though (see #733).

@ryneeverett ryneeverett marked this pull request as draft December 14, 2022 18:26
@ryneeverett ryneeverett marked this pull request as ready for review December 14, 2022 18:46
@thp thp merged commit 5385365 into thp:master Dec 18, 2022
@thp
Copy link
Owner

thp commented Dec 18, 2022

Merged, thanks!

marunjar added a commit to marunjar/urlwatch that referenced this pull request Oct 24, 2023
This PR should make the `--test-reporter` option respect `separate` flag too..
It's just applying the changes already made with thp#721 from `submit_all` method to the `submit_one` method as well.

fixes thp#771

I hope this is ok so far, because my python is not the best 😉
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants