Skip to content

Conversation

@bison
Copy link
Contributor

@bison bison commented Sep 14, 2021

This enhancement proposal was initially added in #637. There was lots of discussion at the time, and it was decided at some point that it would be merged as-is to provide a starting point.

While the original proposal absolutely provided a great jumping-off point to the discussion around issues with alerting in OpenShift, the monitoring team would like to see it now evolve into a more implementable enhancement proposal focused on developing a practical style guide for OpenShift alerts. We'd also like to see the acceptance criteria for new critical alerts in OpenShift formalized here. This is a first pass at reshaping the original proposal into something in this direction.

@bison
Copy link
Contributor Author

bison commented Sep 14, 2021

cc: @openshift/sre-alert-sme @openshift/openshift-team-monitoring

@openshift-ci openshift-ci bot requested review from mandre and sdodson September 14, 2021 10:29
The group of critical alerts should be small, very well defined, highly
documented, polished and with a high bar set for entry. This includes a
mandatory review of a proposed critical alert by the Red Hat SRE team.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add details on how and where this happens? How does someone contact the SRE team for this review?

@simonpasquier
Copy link
Contributor

/lgtm
/cc @openshift/sre-alert-sme

Thanks for doing this!

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Sep 21, 2021
@sdodson
Copy link
Member

sdodson commented Sep 21, 2021

/lgtm
/approve
/hold
@michaelgugino As someone heavily invested in the initial enhancement I'd like to make sure that you think this addresses your original goals, please clear the hold if you're ok with this.

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 21, 2021
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Sep 21, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sdodson

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 21, 2021
@bison
Copy link
Contributor Author

bison commented Sep 28, 2021

Ping @michaelgugino and @sdodson -- Happy to get more feedback here, but not sure how long to wait.

@dgrisonnet
Copy link
Member

Great effort, thank you for looking into this @bison.

/lgtm

@sdodson
Copy link
Member

sdodson commented Sep 28, 2021

@bison we've waited long enough, thanks
/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 28, 2021
@openshift-merge-robot openshift-merge-robot merged commit 15aebc7 into openshift:master Sep 28, 2021
@bison bison deleted the alerting-consistency branch September 28, 2021 15:08
@bison
Copy link
Contributor Author

bison commented Sep 28, 2021

Thanks everyone. Of course, if there ends up being more feedback, please reach out to me. We can keep iterating on this.

jsafrane pushed a commit to jsafrane/origin that referenced this pull request Nov 11, 2021
The [Alerting Consistency][1] enhancement, and the proposed updates to
it in [openshift/enhancements openshift#897][2], define a style-guide for the
alerts shipped as part of OpenShift.  This adds a test validating some
of the guidelines considered required.

[1]: https://github.com/openshift/enhancements/blob/master/enhancements/monitoring/alerting-consistency.md
[2]: openshift/enhancements#897
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants