This repository was archived by the owner on Apr 28, 2025. It is now read-only.

@pracucci
Collaborator

What this PR does:
This PR proposes adding the CortexRolloutStuck alert, which fires when a Cortex StatefulSet or Deployment rollout is stuck. I've manually tested the queries and they should work as expected. The alert starts as a warning so we can build confidence in it; the end goal is to run it as critical.

Which issue(s) this PR fixes:
N/A

Checklist

  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Signed-off-by: Marco Pracucci <[email protected]>
@pracucci pracucci requested a review from a team as a code owner October 13, 2021 08:53
Signed-off-by: Marco Pracucci <[email protected]>
Contributor

@simonswine simonswine left a comment

LGTM

Comment on lines +421 to +426
(
max without (revision) (
kube_statefulset_status_current_revision
unless
kube_statefulset_status_update_revision
)
Contributor

I'm not sure why StatefulSets need that revision check, but the upstream mixin does the same thing: https://github.com/kubernetes-monitoring/kubernetes-mixin/blob/6c72589035f4f49674a56cf97a3ec1a02f14671a/alerts/apps_alerts.libsonnet#L128

So it should be ok 🙂

Collaborator Author

Yeah, I learned it from there.
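
For readers wondering about the same revision check, here is one way to read the fragment under review. This is an annotated sketch, not the exact rule merged in this PR. It assumes the kube-state-metrics convention that both revision metrics are exported with value 1 and carry the revision identifier in a `revision` label. The second factor (the replicas comparison) is also an assumption, patterned on the upstream kubernetes-mixin alert linked above.

```promql
# Hypothetical annotated sketch; not the exact rule merged in this PR.
(
  # `unless` keeps a current-revision series only when no update-revision
  # series has the exact same label set. Because the revision is a label,
  # this returns a result only while the current and update revisions differ,
  # i.e. while a rollout is still in progress.
  # `max without (revision)` then drops the `revision` label so the result
  # can be matched against metrics that do not carry it.
  max without (revision) (
    kube_statefulset_status_current_revision
      unless
    kube_statefulset_status_update_revision
  )
    *
  # Assumed second factor, patterned on the linked upstream alert:
  # the desired replica count differs from the replicas already updated.
  (
    kube_statefulset_replicas
      !=
    kube_statefulset_status_replicas_updated
  )
)
```

Read this way, the revision check probably narrows the alert to StatefulSets that are mid-rollout, so a replica mismatch outside a rollout would not match on its own.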

@pracucci pracucci merged commit 306c081 into main Oct 14, 2021
@pracucci pracucci deleted the alert-on-stuck-rollout branch October 14, 2021 07:46
simonswine pushed a commit to grafana/mimir that referenced this pull request Oct 18, 2021
…tuck-rollout

Add CortexRolloutStuck alert