This repository has been archived by the owner on May 5, 2024. It is now read-only.

feat(thanos): add thanos monitoring
truxnell committed Mar 31, 2022
1 parent 28a454a commit a9a41a0
Showing 3 changed files with 75 additions and 0 deletions.
17 changes: 17 additions & 0 deletions k8s/clusters/hegira/flux/orchestration/system-monitoring.yaml
@@ -65,6 +65,23 @@ spec:
---
apiVersion: kustomize.toolkit.fluxcd.io/v1beta2
kind: Kustomization
metadata:
  name: system-monitoring-thanos-monitoring
  namespace: flux-system
spec:
  dependsOn:
    - name: system-monitoring-namespace
    - name: system-monitoring-kube-prom-stack
  interval: 5m
  path: "./k8s/manifests/system-monitoring/thanos/monitoring"
  prune: true
  wait: true
  sourceRef:
    kind: GitRepository
    name: home-cluster
---
apiVersion: kustomize.toolkit.fluxcd.io/v1beta2
kind: Kustomization
metadata:
  name: system-monitoring-botkube
  namespace: flux-system
@@ -0,0 +1,5 @@
---
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - prometheus-rules.yaml
@@ -0,0 +1,53 @@
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: thanos-all
  namespace: monitoring
spec:
  groups:
    - name: thanos-all
      rules:
        - alert: ThanosSidecarPrometheusDown
          annotations:
            description: Thanos Sidecar {{$labels.instance}} cannot connect to Prometheus.
            runbook_url: https://github.com/thanos-io/thanos/tree/main/mixin/runbook.md#alert-name-thanossidecarprometheusdown
            summary: Thanos Sidecar cannot connect to Prometheus
          expr: |
            thanos_sidecar_prometheus_up{job=~".*thanos-sidecar.*"} == 0
          for: 5m
          labels:
            severity: critical
        - alert: ThanosSidecarBucketOperationsFailed
          annotations:
            description: Thanos Sidecar {{$labels.instance}} bucket operations are failing
            runbook_url: https://github.com/thanos-io/thanos/tree/main/mixin/runbook.md#alert-name-thanossidecarbucketoperationsfailed
            summary: Thanos Sidecar bucket operations are failing
          expr: |
            sum by (job, instance) (rate(thanos_objstore_bucket_operation_failures_total{job=~".*thanos-sidecar.*"}[5m])) > 0
          for: 5m
          labels:
            severity: critical
        - alert: ThanosSidecarUnhealthy
          annotations:
            description:
              Thanos Sidecar {{$labels.instance}} is unhealthy for more than {{$value}}
              seconds.
            runbook_url: https://github.com/thanos-io/thanos/tree/main/mixin/runbook.md#alert-name-thanossidecarunhealthy
            summary: Thanos Sidecar is unhealthy.
          expr: |
            time() - max by (job, instance) (thanos_sidecar_last_heartbeat_success_time_seconds{job=~".*thanos-sidecar.*"}) >= 240
          for: 5m
          labels:
            severity: critical
        - alert: ThanosSidecarIsDown
          annotations:
            description:
              ThanosSidecar has disappeared. Prometheus target for the component
              cannot be discovered.
            runbook_url: https://github.com/thanos-io/thanos/tree/main/mixin/runbook.md#alert-name-thanossidecarisdown
            summary: Thanos component has disappeared.
          # Fire when no sidecar scrape target is up at all; the previous
          # expression duplicated ThanosSidecarPrometheusDown.
          expr: |
            absent(up{job=~".*thanos-sidecar.*"} == 1)
          for: 5m
          labels:
            severity: critical
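
The rules above could be checked with a `promtool test rules` unit test before Flux applies them. The following is only a sketch under stated assumptions: `promtool` reads plain Prometheus rule files rather than the `PrometheusRule` custom resource, so it assumes the `groups:` block under `spec` has been copied into a hypothetical file named `thanos-rules.yaml`. The file names, job label, and instance label are illustrative and not part of this commit.

```yaml
# thanos-rules-test.yaml (hypothetical file name)
# Run with: promtool test rules thanos-rules-test.yaml
rule_files:
  # Assumption: contains only the `groups:` block copied from spec.groups.
  - thanos-rules.yaml

evaluation_interval: 1m

tests:
  - interval: 1m
    # Simulate a sidecar that reports Prometheus as unreachable for 10 minutes.
    input_series:
      - series: 'thanos_sidecar_prometheus_up{job="thanos-sidecar", instance="sidecar-0"}'
        values: '0x10'
    alert_rule_test:
      # The rule has `for: 5m`, so it should be firing by the 6 minute mark.
      - eval_time: 6m
        alertname: ThanosSidecarPrometheusDown
        exp_alerts:
          - exp_labels:
              severity: critical
              job: thanos-sidecar
              instance: sidecar-0
            exp_annotations:
              description: Thanos Sidecar sidecar-0 cannot connect to Prometheus.
              runbook_url: https://github.com/thanos-io/thanos/tree/main/mixin/runbook.md#alert-name-thanossidecarprometheusdown
              summary: Thanos Sidecar cannot connect to Prometheus
```

A similar test case with different `input_series` could cover the remaining alerts; the job label only needs to match the `.*thanos-sidecar.*` selector used in the expressions.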
