@wking
Member

@wking wking commented Feb 22, 2021

We're doing better in updates now, and want to ratchet down to bar critical-alert noise during updates. The old 1m alertPeriodCheckMinutes landed with this test in 3b8cb3c (#24786). DurationSinceStartInSeconds, which I'm using now, landed in ace1345 (#25784).

I've also dropped some special-cased alertname filtering, because we don't want any critical alerts firing, and none of the previously special-cased alerts is critical. Watchdog is severity=none. AlertmanagerReceiversNotConfigured is severity=warning. KubeAPILatencyHigh was dropped in openshift/cluster-monitoring-operator#898 (4.6) and was severity=warning anyway.

We're doing better in updates now, and want to ratchet down to bar
critical-alert noise during updates.  The old 1m
alertPeriodCheckMinutes landed with this test in 3b8cb3c (Add CI
test to check for crit alerts post upgrade, 2020-03-27, openshift#24786).

DurationSinceStartInSeconds, which I'm using now, landed in ace1345
(test: Allow tests that check invariants over time to be constrained,
2021-01-06, openshift#25784).
@openshift-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: wking
To complete the pull request process, please assign soltysh after the PR has been reviewed.
You can assign the PR to them by writing /assign @soltysh in a comment when ready.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@smarterclayton
Contributor

/hold

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold label (indicates that a PR should not merge because someone has issued a /hold command) Feb 22, 2021
@smarterclayton
Contributor

#25904 is intended to remove the ability to throw any alerts during upgrade

@openshift-ci
Contributor

openshift-ci bot commented Feb 22, 2021

@wking: The following tests failed, say /retest to rerun all failed tests:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| ci/prow/verify | 5249584 | link | /test verify |
| ci/prow/e2e-gcp-csi | 5249584 | link | /test e2e-gcp-csi |
| ci/prow/e2e-aws-csi | 5249584 | link | /test e2e-aws-csi |
| ci/prow/e2e-agnostic-cmd | 5249584 | link | /test e2e-agnostic-cmd |
| ci/prow/e2e-gcp-upgrade | 5249584 | link | /test e2e-gcp-upgrade |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@wking
Member Author

wking commented Feb 24, 2021

Between #25904 and #25923, everything I was trying to do here is covered.

/close

@openshift-ci-robot

@wking: Closed this PR.

@wking wking deleted the no-critical-alerts-during-updates branch February 24, 2021 19:14
