Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bci_repo_publish.py: Return 1 if the last published build is over a week old #2998

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

Vogtinator
Copy link
Member

This way it's easy to spot whether a manual look is required.

I'm not entirely sure whether this is a good idea. Ideally someone looks directly when builds or tests fail and fixes this before a week expires, but as additional safe guard it's probably worth it.

@codecov-commenter
Copy link

codecov-commenter commented Jul 26, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Please upload report for BASE (master@2a0b30a). Learn more about missing BASE report.
Report is 271 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff            @@
##             master    #2998   +/-   ##
=========================================
  Coverage          ?   28.45%           
=========================================
  Files             ?       85           
  Lines             ?    14716           
  Branches          ?        0           
=========================================
  Hits              ?     4188           
  Misses            ?    10528           
  Partials          ?        0           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

…eek old

This way it's easy to spot whether a manual look is required.
@dirkmueller
Copy link
Member

I think we should simply stop publishing when there are failed builds? I'm not sure what hte cutoff of 7 days is solving?

@Vogtinator
Copy link
Member Author

I'm not sure what hte cutoff of 7 days is solving?

Currently the bot fails only if some API call actually fails, but not if a build failed or openQA test failed, as this is just expected behaviour. This means that if the repo is unresolvable for a month, the only sign is that there haven't been any openQA runs recently. Or if openQA fails for a month because nobody had a look, the jobs are still green.

With this change it's visible in botmaster as a pipeline failure if there hasn't been a new build published for a week.

@@ -147,6 +156,7 @@ def run(self, version, token=None):
# Trigger publishing
if token is None:
self.logger.warning('Would publish now, but no token specified')
self.last_publish_mtime = packages[0]['built_mtime']
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we setting last_publish_mtime in this error case?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not really an error case, more of a dry run case

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, but there are plenty of other returns before hand that are not setting last_publish_mtime. so this is a bit inconsistent.

self.logger.info(f'{build_prj}/images not successfully built')
pkg = packages[0]
self.last_publish_mtime = \
self.mtime_of_product(pkg['publish_prj'], pkg['name'], 'images', 'local')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and here? we're not actually publishing, we're aborting because it wasn't successfully built?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, which means we have to fetch the last published mtime from OBS to know how long ago that was

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as a nit pick, people usually avoid escaping newlines in python. you can write this in a more pythonic way:

     self.last_publish_mtime = self.mtime_of_product(
           pkg['publish_prj'], pkg['name'], 'images', 'local')

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO harder to read and flake8 didn't complain 🤷

@dirkmueller
Copy link
Member

so if I understand this correctly. the only goal of the whole logic is to make the checkbox "red" rather than "green" in the botmaster listing/graph.

how about we change the method to return True/False on whether it actually did a publish or not and then simply convert that to an exit code of 0 / 1?

then it would be red if it had something to publish but didn't publish (because tests failed or no matching run found or publish failed), and otherwise green?

does the UI support tripple state? then we could encode the "still waiting" states with another color..

@Vogtinator
Copy link
Member Author

so if I understand this correctly. the only goal of the whole logic is to make the checkbox "red" rather than "green" in the botmaster listing/graph.

Yes.

how about we change the method to return True/False on whether it actually did a publish or not and then simply convert that to an exit code of 0 / 1?
then it would be red if it had something to publish but didn't publish (because tests failed or no matching run found or publish failed), and otherwise green?

This wouldn't be able to catch if there was no new build for a week, e.g. due to unresolvable or blocked.

does the UI support tripple state? then we could encode the "still waiting" states with another color..

There's only orange for "pipeline is running"

@dirkmueller
Copy link
Member

then it would be red if it had something to publish but didn't publish (because tests failed or no matching run found or publish failed), and otherwise green?
This wouldn't be able to catch if there was no new build for a week, e.g. due to unresolvable or blocked.

well, unresolvable or build failed would lead to an exit 1, as well as a missing openqa run. unfinished openqa run or unclean build status would not.

does the UI support tripple state? then we could encode the "still waiting" states with another color..
There's only orange for "pipeline is running"

ok, too bad.

Well, I don't have strong feelings either way. I think the 7 days to wait is arbitrary and feels a bit too long. maybe 3 days?
Other than that I still think trying to determine the status "this looks erratic" or "this can still be normal" and then just exiting 1/0 right away is the better approach. we don't want to wait for a week to notice that openqa has lost an event or the like. that is not gonna help anyone.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants