Victoria Metrics... Again #2546

Merged
merged 6 commits into from
Jun 19, 2024


joryirving
Owner

Welp.

joryirving self-assigned this Jun 19, 2024
github-actions bot added the area/kubernetes, cluster/main, and cluster/utility labels Jun 19, 2024
joryirving force-pushed the feat/vmetrics-again branch from ef51b67 to 424f149 June 19, 2024 17:44
@smurf-bot
Contributor

smurf-bot bot commented Jun 19, 2024

--- kubernetes/utility/flux Kustomization: flux-system/cluster HelmRepository: flux-system/victoria-metrics

+++ kubernetes/utility/flux Kustomization: flux-system/cluster HelmRepository: flux-system/victoria-metrics

@@ -0,0 +1,13 @@

+---
+apiVersion: source.toolkit.fluxcd.io/v1
+kind: HelmRepository
+metadata:
+  labels:
+    kustomize.toolkit.fluxcd.io/name: cluster
+    kustomize.toolkit.fluxcd.io/namespace: flux-system
+  name: victoria-metrics
+  namespace: flux-system
+spec:
+  interval: 1h
+  url: https://victoriametrics.github.io/helm-charts/
+
--- kubernetes/utility/apps Kustomization: flux-system/cluster-apps Kustomization: flux-system/kube-prometheus-stack

+++ kubernetes/utility/apps Kustomization: flux-system/cluster-apps Kustomization: flux-system/kube-prometheus-stack

@@ -1,38 +0,0 @@

----
-apiVersion: kustomize.toolkit.fluxcd.io/v1
-kind: Kustomization
-metadata:
-  labels:
-    kustomize.toolkit.fluxcd.io/name: cluster-apps
-    kustomize.toolkit.fluxcd.io/namespace: flux-system
-  name: kube-prometheus-stack
-  namespace: flux-system
-spec:
-  commonMetadata:
-    labels:
-      app.kubernetes.io/name: kube-prometheus-stack
-  decryption:
-    provider: sops
-    secretRef:
-      name: sops-age
-  dependsOn:
-  - name: external-secrets-stores
-  interval: 30m
-  path: ./kubernetes/utility/apps/observability/kube-prometheus-stack/app
-  postBuild:
-    substitute:
-      THANOS_VERSION: v0.35.1
-    substituteFrom:
-    - kind: ConfigMap
-      name: cluster-settings
-    - kind: Secret
-      name: cluster-secrets
-  prune: true
-  retryInterval: 1m
-  sourceRef:
-    kind: GitRepository
-    name: home-kubernetes
-  targetNamespace: observability
-  timeout: 5m
-  wait: false
-
--- kubernetes/utility/apps Kustomization: flux-system/cluster-apps Kustomization: flux-system/victoria-metrics

+++ kubernetes/utility/apps Kustomization: flux-system/cluster-apps Kustomization: flux-system/victoria-metrics

@@ -0,0 +1,34 @@

+---
+apiVersion: kustomize.toolkit.fluxcd.io/v1
+kind: Kustomization
+metadata:
+  labels:
+    kustomize.toolkit.fluxcd.io/name: cluster-apps
+    kustomize.toolkit.fluxcd.io/namespace: flux-system
+  name: victoria-metrics
+  namespace: flux-system
+spec:
+  commonMetadata:
+    labels:
+      app.kubernetes.io/name: victoria-metrics
+  decryption:
+    provider: sops
+    secretRef:
+      name: sops-age
+  interval: 30m
+  path: ./kubernetes/utility/apps/observability/victoria-metrics/app
+  postBuild:
+    substituteFrom:
+    - kind: ConfigMap
+      name: cluster-settings
+    - kind: Secret
+      name: cluster-secrets
+  prune: true
+  retryInterval: 1m
+  sourceRef:
+    kind: GitRepository
+    name: home-kubernetes
+  targetNamespace: observability
+  timeout: 5m
+  wait: false
+
--- kubernetes/utility/apps/observability/kube-prometheus-stack/app Kustomization: flux-system/kube-prometheus-stack ExternalSecret: observability/thanos-objstore-config

+++ kubernetes/utility/apps/observability/kube-prometheus-stack/app Kustomization: flux-system/kube-prometheus-stack ExternalSecret: observability/thanos-objstore-config

@@ -1,30 +0,0 @@

----
-apiVersion: external-secrets.io/v1beta1
-kind: ExternalSecret
-metadata:
-  labels:
-    app.kubernetes.io/name: kube-prometheus-stack
-    kustomize.toolkit.fluxcd.io/name: kube-prometheus-stack
-    kustomize.toolkit.fluxcd.io/namespace: flux-system
-  name: thanos-objstore-config
-  namespace: observability
-spec:
-  dataFrom:
-  - extract:
-      key: thanos
-  secretStoreRef:
-    kind: ClusterSecretStore
-    name: bitwarden-secrets-manager
-  target:
-    name: thanos-objstore-config
-    template:
-      data:
-        config: |-
-          type: s3
-          config:
-            bucket: thanos
-            endpoint: rgw.
-            access_key: {{ .AWS_ACCESS_KEY_ID }}
-            secret_key: {{ .AWS_SECRET_ACCESS_KEY }}
-      engineVersion: v2
-
--- kubernetes/utility/apps/observability/kube-prometheus-stack/app Kustomization: flux-system/kube-prometheus-stack HelmRelease: observability/kube-prometheus-stack

+++ kubernetes/utility/apps/observability/kube-prometheus-stack/app Kustomization: flux-system/kube-prometheus-stack HelmRelease: observability/kube-prometheus-stack

@@ -1,193 +0,0 @@

----
-apiVersion: helm.toolkit.fluxcd.io/v2
-kind: HelmRelease
-metadata:
-  labels:
-    app.kubernetes.io/name: kube-prometheus-stack
-    kustomize.toolkit.fluxcd.io/name: kube-prometheus-stack
-    kustomize.toolkit.fluxcd.io/namespace: flux-system
-  name: kube-prometheus-stack
-  namespace: observability
-spec:
-  chart:
-    spec:
-      chart: kube-prometheus-stack
-      sourceRef:
-        kind: HelmRepository
-        name: prometheus-community
-        namespace: flux-system
-      version: 60.2.0
-  dependsOn:
-  - name: prometheus-operator-crds
-    namespace: observability
-  - name: longhorn
-    namespace: storage
-  install:
-    crds: Skip
-    remediation:
-      retries: 3
-  interval: 30m
-  timeout: 15m
-  upgrade:
-    cleanupOnFail: true
-    crds: Skip
-    remediation:
-      retries: 3
-      strategy: rollback
-  values:
-    alertmanager:
-      enabled: false
-    cleanPrometheusOperatorObjectNames: true
-    crds:
-      enabled: false
-    grafana:
-      enabled: false
-    kube-state-metrics:
-      fullnameOverride: kube-state-metrics
-      metricLabelsAllowlist:
-      - pods=[*]
-      - deployments=[*]
-      - persistentvolumeclaims=[*]
-    kubeApiServer:
-      enabled: true
-      serviceMonitor:
-        metricRelabelings:
-        - action: keep
-          regex: (aggregator_openapi|aggregator_unavailable|apiextensions_openapi|apiserver_admission|apiserver_audit|apiserver_cache|apiserver_cel|apiserver_client|apiserver_crd|apiserver_current|apiserver_envelope|apiserver_flowcontrol|apiserver_init|apiserver_kube|apiserver_longrunning|apiserver_request|apiserver_requested|apiserver_response|apiserver_selfrequest|apiserver_storage|apiserver_terminated|apiserver_tls|apiserver_watch|apiserver_webhooks|authenticated_user|authentication|disabled_metric|etcd_bookmark|etcd_lease|etcd_request|field_validation|get_token|go|grpc_client|hidden_metric|kube_apiserver|kubernetes_build|kubernetes_feature|node_authorizer|pod_security|process_cpu|process_max|process_open|process_resident|process_start|process_virtual|registered_metric|rest_client|scrape_duration|scrape_samples|scrape_series|serviceaccount_legacy|serviceaccount_stale|serviceaccount_valid|watch_cache|workqueue)_(.+)
-          sourceLabels:
-          - __name__
-        - action: drop
-          regex: (apiserver|etcd|rest_client)_request(|_sli|_slo)_duration_seconds_bucket
-          sourceLabels:
-          - __name__
-        - action: drop
-          regex: (apiserver_response_sizes_bucket|apiserver_watch_events_sizes_bucket)
-          sourceLabels:
-          - __name__
-    kubeControllerManager:
-      enabled: true
-      endpoints:
-      - 10.69.1.121
-      serviceMonitor:
-        metricRelabelings:
-        - action: keep
-          regex: (apiserver_audit|apiserver_client|apiserver_delegated|apiserver_envelope|apiserver_storage|apiserver_webhooks|attachdetach_controller|authenticated_user|authentication|cronjob_controller|disabled_metric|endpoint_slice|ephemeral_volume|garbagecollector_controller|get_token|go|hidden_metric|job_controller|kubernetes_build|kubernetes_feature|leader_election|node_collector|node_ipam|process_cpu|process_max|process_open|process_resident|process_start|process_virtual|pv_collector|registered_metric|replicaset_controller|rest_client|retroactive_storageclass|root_ca|running_managed|scrape_duration|scrape_samples|scrape_series|service_controller|storage_count|storage_operation|ttl_after|volume_operation|workqueue)_(.+)
-          sourceLabels:
-          - __name__
-    kubeEtcd:
-      enabled: true
-      endpoints:
-      - 10.69.1.121
-    kubeProxy:
-      enabled: false
-    kubeScheduler:
-      enabled: true
-      endpoints:
-      - 10.69.1.121
-      serviceMonitor:
-        metricRelabelings:
-        - action: keep
-          regex: (apiserver_audit|apiserver_client|apiserver_delegated|apiserver_envelope|apiserver_storage|apiserver_webhooks|authenticated_user|authentication|disabled_metric|go|hidden_metric|kubernetes_build|kubernetes_feature|leader_election|process_cpu|process_max|process_open|process_resident|process_start|process_virtual|registered_metric|rest_client|scheduler|scrape_duration|scrape_samples|scrape_series|workqueue)_(.+)
-          sourceLabels:
-          - __name__
-    kubeStateMetrics:
-      enabled: true
-    kubelet:
-      enabled: true
-      serviceMonitor:
-        metricRelabelings:
-        - action: keep
-          regex: (apiserver_audit|apiserver_client|apiserver_delegated|apiserver_envelope|apiserver_storage|apiserver_webhooks|authentication_token|cadvisor_version|container_blkio|container_cpu|container_fs|container_last|container_memory|container_network|container_oom|container_processes|container|csi_operations|disabled_metric|get_token|go|hidden_metric|kubelet_certificate|kubelet_cgroup|kubelet_container|kubelet_containers|kubelet_cpu|kubelet_device|kubelet_graceful|kubelet_http|kubelet_lifecycle|kubelet_managed|kubelet_node|kubelet_pleg|kubelet_pod|kubelet_run|kubelet_running|kubelet_runtime|kubelet_server|kubelet_started|kubelet_volume|kubernetes_build|kubernetes_feature|machine_cpu|machine_memory|machine_nvm|machine_scrape|node_namespace|plugin_manager|prober_probe|process_cpu|process_max|process_open|process_resident|process_start|process_virtual|registered_metric|rest_client|scrape_duration|scrape_samples|scrape_series|storage_operation|volume_manager|volume_operation|workqueue)_(.+)
-          sourceLabels:
-          - __name__
-        - action: replace
-          sourceLabels:
-          - node
-          targetLabel: instance
-        - action: labeldrop
-          regex: (uid)
-        - action: labeldrop
-          regex: (id|name)
-        - action: drop
-          regex: (rest_client_request_duration_seconds_bucket|rest_client_request_duration_seconds_sum|rest_client_request_duration_seconds_count)
-          sourceLabels:
-          - __name__
-    nodeExporter:
-      enabled: true
-    prometheus:
-      ingress:
-        annotations:
-          external-dns.alpha.kubernetes.io/target: internal-utility.
-        enabled: true
-        hosts:
-        - prometheus-utility.
-        ingressClassName: internal
-        pathType: Prefix
-      prometheusSpec:
-        additionalAlertManagerConfigs:
-        - static_configs:
-          - targets:
-            - alertmanager.
-        enableAdminAPI: true
-        enableFeatures:
-        - auto-gomemlimit
-        - memory-snapshot-on-shutdown
-        - new-service-discovery-manager
-        externalLabels:
-          cluster: utility
-        podMetadata:
-          annotations:
-            secret.reloader.stakater.com/reload: thanos-objstore-config
-        podMonitorSelectorNilUsesHelmValues: false
-        probeSelectorNilUsesHelmValues: false
-        replicaExternalLabelName: __replica__
-        replicas: 1
-        resources:
-          limits:
-            memory: 1500Mi
-          requests:
-            cpu: 100m
-        retention: 2d
-        retentionSize: 15GB
-        ruleSelectorNilUsesHelmValues: false
-        scrapeConfigSelectorNilUsesHelmValues: false
-        scrapeInterval: 1m
-        serviceMonitorSelectorNilUsesHelmValues: false
-        storageSpec:
-          volumeClaimTemplate:
-            spec:
-              resources:
-                requests:
-                  storage: 20Gi
-              storageClassName: local-hostpath
-        thanos:
-          image: quay.io/thanos/thanos:v0.35.1
-          objectStorageConfig:
-            existingSecret:
-              key: config
-              name: thanos-objstore-config
-          version: 0.35.1
-      thanosService:
-        enabled: true
-      thanosServiceExternal:
-        annotations:
-          external-dns.alpha.kubernetes.io/hostname: thanos-svc.
-          io.cilium/lb-ipam-ips: temp
-        enabled: true
-        externalTrafficPolicy: Cluster
-        type: LoadBalancer
-      thanosServiceMonitor:
-        enabled: true
-    prometheus-node-exporter:
-      fullnameOverride: node-exporter
-      prometheus:
-        monitor:
-          enabled: true
-          relabelings:
-          - action: replace
-            regex: (.*)
-            replacement: $1
-            sourceLabels:
-            - __meta_kubernetes_pod_node_name
-            targetLabel: kubernetes_node
-
--- kubernetes/utility/apps/observability/victoria-metrics/app Kustomization: flux-system/victoria-metrics HelmRelease: observability/victoria-metrics

+++ kubernetes/utility/apps/observability/victoria-metrics/app Kustomization: flux-system/victoria-metrics HelmRelease: observability/victoria-metrics

@@ -0,0 +1,49 @@

+---
+apiVersion: helm.toolkit.fluxcd.io/v2
+kind: HelmRelease
+metadata:
+  labels:
+    app.kubernetes.io/name: victoria-metrics
+    kustomize.toolkit.fluxcd.io/name: victoria-metrics
+    kustomize.toolkit.fluxcd.io/namespace: flux-system
+  name: victoria-metrics
+  namespace: observability
+spec:
+  chart:
+    spec:
+      chart: victoria-metrics-agent
+      sourceRef:
+        kind: HelmRepository
+        name: victoria-metrics
+        namespace: flux-system
+      version: 0.10.9
+  install:
+    remediation:
+      retries: 3
+  interval: 30m
+  upgrade:
+    cleanupOnFail: true
+    remediation:
+      retries: 3
+      strategy: rollback
+  values:
+    config:
+      global:
+        scrape_interval: 30s
+    deployment:
+      enabled: true
+    ingress:
+      annotations:
+        external-dns.alpha.kubernetes.io/target: internal-utility.${SECRET_DOMAIN}
+      enabled: true
+      hosts:
+      - name: vmagent-utility.${SECRET_DOMAIN}
+        path: /
+        port: http
+      ingressClassName: internal
+    remoteWriteUrls:
+    - https://victoria-metrics.${SECRET_DOMAIN}/insert/0/prometheus
+    replicaCount: 1
+    service:
+      enabled: true
+
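
Reviewer note: the removed kube-prometheus-stack HelmRelease filtered metrics with `keep` and `drop` `metricRelabelings` on `__name__`. Prometheus anchors relabel regexes at both ends, so a rule only applies when the whole metric name matches. Below is a minimal sketch of that behavior, assuming Python's `re.fullmatch` is a close enough stand-in for the anchored RE2 match. The `KEEP` pattern is a shortened, illustrative subset of the kubeApiServer list in the diff, and `survives` is a hypothetical helper, not part of any chart or of Prometheus.

```python
import re

# Shortened stand-in for the kubeApiServer keep list in the removed HelmRelease
# (illustrative subset only; the real list is much longer).
KEEP = r"(apiserver_request|apiserver_response|etcd_request|rest_client|workqueue)_(.+)"

# The two drop patterns, copied from the removed kubeApiServer block.
DROPS = [
    r"(apiserver|etcd|rest_client)_request(|_sli|_slo)_duration_seconds_bucket",
    r"(apiserver_response_sizes_bucket|apiserver_watch_events_sizes_bucket)",
]


def survives(metric_name: str) -> bool:
    """Return True if a metric name passes the keep rule and no drop rule.

    re.fullmatch mimics the implicit anchoring of relabel regexes:
    the entire name has to match, not just a substring.
    """
    if re.fullmatch(KEEP, metric_name) is None:
        return False  # 'keep' discards anything that does not match
    return not any(re.fullmatch(pattern, metric_name) for pattern in DROPS)


if __name__ == "__main__":
    for name in (
        "apiserver_request_total",
        "apiserver_request_duration_seconds_bucket",
        "etcd_request_duration_seconds_sum",
        "node_cpu_seconds_total",
    ):
        print(f"{name}: {'kept' if survives(name) else 'dropped'}")
```

The rules are applied in order, as in the original config: the keep rule runs first, then each drop rule removes high-cardinality histogram buckets from what remains.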

@smurf-bot
Contributor

smurf-bot bot commented Jun 19, 2024

--- kubernetes/main/flux Kustomization: flux-system/cluster HelmRepository: flux-system/victoria-metrics

+++ kubernetes/main/flux Kustomization: flux-system/cluster HelmRepository: flux-system/victoria-metrics

@@ -0,0 +1,13 @@

+---
+apiVersion: source.toolkit.fluxcd.io/v1
+kind: HelmRepository
+metadata:
+  labels:
+    kustomize.toolkit.fluxcd.io/name: cluster
+    kustomize.toolkit.fluxcd.io/namespace: flux-system
+  name: victoria-metrics
+  namespace: flux-system
+spec:
+  interval: 1h
+  url: https://victoriametrics.github.io/helm-charts/
+
--- kubernetes/main/apps Kustomization: flux-system/cluster-apps Kustomization: flux-system/kube-prometheus-stack

+++ kubernetes/main/apps Kustomization: flux-system/cluster-apps Kustomization: flux-system/kube-prometheus-stack

@@ -1,38 +0,0 @@

----
-apiVersion: kustomize.toolkit.fluxcd.io/v1
-kind: Kustomization
-metadata:
-  labels:
-    kustomize.toolkit.fluxcd.io/name: cluster-apps
-    kustomize.toolkit.fluxcd.io/namespace: flux-system
-  name: kube-prometheus-stack
-  namespace: flux-system
-spec:
-  commonMetadata:
-    labels:
-      app.kubernetes.io/name: kube-prometheus-stack
-  decryption:
-    provider: sops
-    secretRef:
-      name: sops-age
-  dependsOn:
-  - name: external-secrets-stores
-  interval: 30m
-  path: ./kubernetes/main/apps/observability/kube-prometheus-stack/app
-  postBuild:
-    substitute:
-      THANOS_VERSION: v0.35.1
-    substituteFrom:
-    - kind: ConfigMap
-      name: cluster-settings
-    - kind: Secret
-      name: cluster-secrets
-  prune: true
-  retryInterval: 1m
-  sourceRef:
-    kind: GitRepository
-    name: home-kubernetes
-  targetNamespace: observability
-  timeout: 5m
-  wait: false
-
--- kubernetes/main/apps Kustomization: flux-system/cluster-apps Kustomization: flux-system/thanos

+++ kubernetes/main/apps Kustomization: flux-system/cluster-apps Kustomization: flux-system/thanos

@@ -1,37 +0,0 @@

----
-apiVersion: kustomize.toolkit.fluxcd.io/v1
-kind: Kustomization
-metadata:
-  labels:
-    kustomize.toolkit.fluxcd.io/name: cluster-apps
-    kustomize.toolkit.fluxcd.io/namespace: flux-system
-  name: thanos
-  namespace: flux-system
-spec:
-  commonMetadata:
-    labels:
-      app.kubernetes.io/name: thanos
-  decryption:
-    provider: sops
-    secretRef:
-      name: sops-age
-  dependsOn:
-  - name: dragonfly-cluster
-  - name: external-secrets-stores
-  interval: 30m
-  path: ./kubernetes/main/apps/observability/thanos/app
-  postBuild:
-    substituteFrom:
-    - kind: ConfigMap
-      name: cluster-settings
-    - kind: Secret
-      name: cluster-secrets
-  prune: true
-  retryInterval: 1m
-  sourceRef:
-    kind: GitRepository
-    name: home-kubernetes
-  targetNamespace: observability
-  timeout: 15m
-  wait: false
-
--- kubernetes/main/apps Kustomization: flux-system/cluster-apps Kustomization: flux-system/victoria-metrics

+++ kubernetes/main/apps Kustomization: flux-system/cluster-apps Kustomization: flux-system/victoria-metrics

@@ -0,0 +1,34 @@

+---
+apiVersion: kustomize.toolkit.fluxcd.io/v1
+kind: Kustomization
+metadata:
+  labels:
+    kustomize.toolkit.fluxcd.io/name: cluster-apps
+    kustomize.toolkit.fluxcd.io/namespace: flux-system
+  name: victoria-metrics
+  namespace: flux-system
+spec:
+  commonMetadata:
+    labels:
+      app.kubernetes.io/name: victoria-metrics
+  decryption:
+    provider: sops
+    secretRef:
+      name: sops-age
+  interval: 30m
+  path: ./kubernetes/main/apps/observability/victoria-metrics/app
+  postBuild:
+    substituteFrom:
+    - kind: ConfigMap
+      name: cluster-settings
+    - kind: Secret
+      name: cluster-secrets
+  prune: true
+  retryInterval: 1m
+  sourceRef:
+    kind: GitRepository
+    name: home-kubernetes
+  targetNamespace: observability
+  timeout: 5m
+  wait: false
+
--- kubernetes/main/apps/observability/kube-prometheus-stack/app Kustomization: flux-system/kube-prometheus-stack ExternalSecret: observability/alertmanager-secret

+++ kubernetes/main/apps/observability/kube-prometheus-stack/app Kustomization: flux-system/kube-prometheus-stack ExternalSecret: observability/alertmanager-secret

@@ -1,86 +0,0 @@

----
-apiVersion: external-secrets.io/v1beta1
-kind: ExternalSecret
-metadata:
-  labels:
-    app.kubernetes.io/name: kube-prometheus-stack
-    kustomize.toolkit.fluxcd.io/name: kube-prometheus-stack
-    kustomize.toolkit.fluxcd.io/namespace: flux-system
-  name: alertmanager-secret
-  namespace: observability
-spec:
-  dataFrom:
-  - extract:
-      key: alertmanager
-  - extract:
-      key: discord
-  refreshInterval: 15m
-  secretStoreRef:
-    kind: ClusterSecretStore
-    name: bitwarden-secrets-manager
-  target:
-    name: alertmanager-secret
-    template:
-      data:
-        alertmanager.yaml: |
-          global:
-            resolve_timeout: 5m
-          route:
-            group_by: ["alertname", "job"]
-            group_interval: 10m
-            group_wait: 1m
-            receiver: discord
-            repeat_interval: 12h
-            routes:
-              - receiver: heartbeat
-                group_interval: 5m
-                group_wait: 0s
-                matchers:
-                  - alertname =~ "Watchdog"
-                repeat_interval: 5m
-              - receiver: "null"
-                matchers:
-                  - severity = "none"
-                  - alertname =~ "InfoInhibitor|Watchdog"
-              - receiver: discord
-                continue: true
-                matchers:
-                  - severity = "critical"
-          inhibit_rules:
-            - equal: ["alertname", "namespace"]
-              source_matchers:
-                - severity = "critical"
-              target_matchers:
-                - severity = "warning"
-          receivers:
-            - name: heartbeat
-              webhook_configs:
-                - send_resolved: true
-                  url: "{{ .ALERTMANAGER_HEARTBEAT_URL }}"
-            - name: "null"
-            - name: discord
-              discord_configs:
-                - send_resolved: true
-                  webhook_url: "{{ .DISCORD_WEBHOOK_URL }}"
-                  title: >-
-                    {{ "{{" }} .CommonLabels.alertname {{ "}}" }}
-                    [{{ "{{" }} .Status | toUpper {{ "}}" }}{{ "{{" }} if eq .Status "firing" {{ "}}" }}:{{ "{{" }} .Alerts.Firing | len {{ "}}" }}{{ "{{" }} end {{ "}}" }}]
-                  message: |-
-                    {{ "{{-" }} range .Alerts {{ "}}" }}
-                      {{ "{{-" }} if ne .Annotations.description "" {{ "}}" }}
-                        {{ "{{" }} .Annotations.description {{ "}}" }}
-                      {{ "{{-" }} else if ne .Annotations.summary "" {{ "}}" }}
-                        {{ "{{" }} .Annotations.summary {{ "}}" }}
-                      {{ "{{-" }} else if ne .Annotations.message "" {{ "}}" }}
-                        {{ "{{" }} .Annotations.message {{ "}}" }}
-                      {{ "{{-" }} else {{ "}}" }}
-                        Alert description not available
-                      {{ "{{-" }} end {{ "}}" }}
-                      {{ "{{-" }} if gt (len .Labels.SortedPairs) 0 {{ "}}" }}
-                        {{ "{{-" }} range .Labels.SortedPairs {{ "}}" }}
-                          **{{ "{{" }} .Name {{ "}}" }}:** {{ "{{" }} .Value {{ "}}" }}
-                        {{ "{{-" }} end {{ "}}" }}
-                      {{ "{{-" }} end {{ "}}" }}
-                    {{ "{{-" }} end {{ "}}" }}
-      engineVersion: v2
-
--- kubernetes/main/apps/observability/kube-prometheus-stack/app Kustomization: flux-system/kube-prometheus-stack HelmRelease: observability/kube-prometheus-stack

+++ kubernetes/main/apps/observability/kube-prometheus-stack/app Kustomization: flux-system/kube-prometheus-stack HelmRelease: observability/kube-prometheus-stack

@@ -1,223 +0,0 @@

----
-apiVersion: helm.toolkit.fluxcd.io/v2
-kind: HelmRelease
-metadata:
-  labels:
-    app.kubernetes.io/name: kube-prometheus-stack
-    kustomize.toolkit.fluxcd.io/name: kube-prometheus-stack
-    kustomize.toolkit.fluxcd.io/namespace: flux-system
-  name: kube-prometheus-stack
-  namespace: observability
-spec:
-  chart:
-    spec:
-      chart: kube-prometheus-stack
-      sourceRef:
-        kind: HelmRepository
-        name: prometheus-community
-        namespace: flux-system
-      version: 60.2.0
-  dependsOn:
-  - name: prometheus-operator-crds
-    namespace: observability
-  - name: rook-ceph-cluster
-    namespace: rook-ceph
-  - name: thanos
-    namespace: observability
-  install:
-    crds: Skip
-    remediation:
-      retries: 3
-  interval: 30m
-  timeout: 15m
-  upgrade:
-    cleanupOnFail: true
-    crds: Skip
-    remediation:
-      retries: 3
-      strategy: rollback
-  values:
-    alertmanager:
-      alertmanagerSpec:
-        configSecret: alertmanager-secret
-        replicas: 2
-        storage:
-          volumeClaimTemplate:
-            spec:
-              resources:
-                requests:
-                  storage: 1Gi
-              storageClassName: ceph-block
-        useExistingSecret: true
-      ingress:
-        annotations:
-          external-dns.alpha.kubernetes.io/target: internal.
-        enabled: true
-        hosts:
-        - alertmanager.
-        ingressClassName: internal
-        pathType: Prefix
-    cleanPrometheusOperatorObjectNames: true
-    crds:
-      enabled: false
-    grafana:
-      enabled: false
-      forceDeployDashboards: true
-      sidecar:
-        dashboards:
-          annotations:
-            grafana_folder: Kubernetes
-          multicluster:
-            etcd:
-              enabled: true
-    kube-state-metrics:
-      fullnameOverride: kube-state-metrics
-      metricLabelsAllowlist:
-      - pods=[*]
-      - deployments=[*]
-      - persistentvolumeclaims=[*]
-    kubeApiServer:
-      enabled: true
-      serviceMonitor:
-        metricRelabelings:
-        - action: keep
-          regex: (aggregator_openapi|aggregator_unavailable|apiextensions_openapi|apiserver_admission|apiserver_audit|apiserver_cache|apiserver_cel|apiserver_client|apiserver_crd|apiserver_current|apiserver_envelope|apiserver_flowcontrol|apiserver_init|apiserver_kube|apiserver_longrunning|apiserver_request|apiserver_requested|apiserver_response|apiserver_selfrequest|apiserver_storage|apiserver_terminated|apiserver_tls|apiserver_watch|apiserver_webhooks|authenticated_user|authentication|disabled_metric|etcd_bookmark|etcd_lease|etcd_request|field_validation|get_token|go|grpc_client|hidden_metric|kube_apiserver|kubernetes_build|kubernetes_feature|node_authorizer|pod_security|process_cpu|process_max|process_open|process_resident|process_start|process_virtual|registered_metric|rest_client|scrape_duration|scrape_samples|scrape_series|serviceaccount_legacy|serviceaccount_stale|serviceaccount_valid|watch_cache|workqueue)_(.+)
-          sourceLabels:
-          - __name__
-        - action: drop
-          regex: (apiserver|etcd|rest_client)_request(|_sli|_slo)_duration_seconds_bucket
-          sourceLabels:
-          - __name__
-        - action: drop
-          regex: (apiserver_response_sizes_bucket|apiserver_watch_events_sizes_bucket)
-          sourceLabels:
-          - __name__
-    kubeControllerManager:
-      enabled: true
-      endpoints:
-      - 10.69.1.21
-      - 10.69.1.22
-      - 10.69.1.23
-      serviceMonitor:
-        metricRelabelings:
-        - action: keep
-          regex: (apiserver_audit|apiserver_client|apiserver_delegated|apiserver_envelope|apiserver_storage|apiserver_webhooks|attachdetach_controller|authenticated_user|authentication|cronjob_controller|disabled_metric|endpoint_slice|ephemeral_volume|garbagecollector_controller|get_token|go|hidden_metric|job_controller|kubernetes_build|kubernetes_feature|leader_election|node_collector|node_ipam|process_cpu|process_max|process_open|process_resident|process_start|process_virtual|pv_collector|registered_metric|replicaset_controller|rest_client|retroactive_storageclass|root_ca|running_managed|scrape_duration|scrape_samples|scrape_series|service_controller|storage_count|storage_operation|ttl_after|volume_operation|workqueue)_(.+)
-          sourceLabels:
-          - __name__
-    kubeEtcd:
-      enabled: true
-      endpoints:
-      - 10.69.1.21
-      - 10.69.1.22
-      - 10.69.1.23
-    kubeProxy:
-      enabled: false
-    kubeScheduler:
-      enabled: true
-      endpoints:
-      - 10.69.1.21
-      - 10.69.1.22
-      - 10.69.1.23
-      serviceMonitor:
-        metricRelabelings:
-        - action: keep
-          regex: (apiserver_audit|apiserver_client|apiserver_delegated|apiserver_envelope|apiserver_storage|apiserver_webhooks|authenticated_user|authentication|disabled_metric|go|hidden_metric|kubernetes_build|kubernetes_feature|leader_election|process_cpu|process_max|process_open|process_resident|process_start|process_virtual|registered_metric|rest_client|scheduler|scrape_duration|scrape_samples|scrape_series|workqueue)_(.+)
-          sourceLabels:
-          - __name__
-    kubeStateMetrics:
-      enabled: true
-    kubelet:
-      enabled: true
-      serviceMonitor:
-        metricRelabelings:
-        - action: keep
-          regex: (apiserver_audit|apiserver_client|apiserver_delegated|apiserver_envelope|apiserver_storage|apiserver_webhooks|authentication_token|cadvisor_version|container_blkio|container_cpu|container_fs|container_last|container_memory|container_network|container_oom|container_processes|container|csi_operations|disabled_metric|get_token|go|hidden_metric|kubelet_certificate|kubelet_cgroup|kubelet_container|kubelet_containers|kubelet_cpu|kubelet_device|kubelet_graceful|kubelet_http|kubelet_lifecycle|kubelet_managed|kubelet_node|kubelet_pleg|kubelet_pod|kubelet_run|kubelet_running|kubelet_runtime|kubelet_server|kubelet_started|kubelet_volume|kubernetes_build|kubernetes_feature|machine_cpu|machine_memory|machine_nvm|machine_scrape|node_namespace|plugin_manager|prober_probe|process_cpu|process_max|process_open|process_resident|process_start|process_virtual|registered_metric|rest_client|scrape_duration|scrape_samples|scrape_series|storage_operation|volume_manager|volume_operation|workqueue)_(.+)
-          sourceLabels:
-          - __name__
-        - action: replace
-          sourceLabels:
-          - node
-          targetLabel: instance
-        - action: labeldrop
-          regex: (uid)
-        - action: labeldrop
-          regex: (id|name)
-        - action: drop
-          regex: (rest_client_request_duration_seconds_bucket|rest_client_request_duration_seconds_sum|rest_client_request_duration_seconds_count)
-          sourceLabels:
-          - __name__
-    nodeExporter:
-      enabled: true
-    prometheus:
-      ingress:
-        annotations:
-          external-dns.alpha.kubernetes.io/target: internal.
-          gethomepage.dev/description: Monitoring Scrape Service
-          gethomepage.dev/enabled: 'true'
-          gethomepage.dev/group: Observability
-          gethomepage.dev/icon: prometheus.png
-          gethomepage.dev/name: Prometheus
-          gethomepage.dev/widget.type: prometheus
-          gethomepage.dev/widget.url: http://kube-prometheus-stack-prometheus.observability:9090
-        enabled: true
-        hosts:
-        - prometheus.
-        ingressClassName: internal
-        pathType: Prefix
-      prometheusSpec:
-        enableAdminAPI: true
-        enableFeatures:
-        - auto-gomemlimit
-        - memory-snapshot-on-shutdown
-        - new-service-discovery-manager
-        externalLabels:
-          cluster: main
-        podMetadata:
-          annotations:
-            secret.reloader.stakater.com/reload: thanos-objstore-config
-        podMonitorSelectorNilUsesHelmValues: false
-        probeSelectorNilUsesHelmValues: false
-        replicaExternalLabelName: __replica__
-        replicas: 2
-        resources:
-          limits:
-            memory: 1500Mi
-          requests:
-            cpu: 100m
-        retention: 2d
-        retentionSize: 15GB
-        ruleSelectorNilUsesHelmValues: false
-        scrapeConfigSelectorNilUsesHelmValues: false
-        scrapeInterval: 1m
-        serviceMonitorSelectorNilUsesHelmValues: false
-        storageSpec:
-          volumeClaimTemplate:
-            spec:
-              resources:
-                requests:
-                  storage: 20Gi
-              storageClassName: ceph-block
-        thanos:
-          image: quay.io/thanos/thanos:v0.35.1
-          objectStorageConfig:
-            existingSecret:
-              key: config
-              name: thanos-objstore-config
-          version: 0.35.1
-      thanosService:
-        enabled: true
-      thanosServiceMonitor:
-        enabled: true
-    prometheus-node-exporter:
-      fullnameOverride: node-exporter
-      prometheus:
-        monitor:
-          enabled: true
-          relabelings:
-          - action: replace
-            regex: (.*)
-            replacement: $1
-            sourceLabels:
-            - __meta_kubernetes_pod_node_name
-            targetLabel: kubernetes_node
-
--- kubernetes/main/apps/observability/kube-prometheus-stack/app Kustomization: flux-system/kube-prometheus-stack PrometheusRule: observability/miscellaneous-rules

+++ kubernetes/main/apps/observability/kube-prometheus-stack/app Kustomization: flux-system/kube-prometheus-stack PrometheusRule: observability/miscellaneous-rules

@@ -1,38 +0,0 @@

----
-apiVersion: monitoring.coreos.com/v1
-kind: PrometheusRule
-metadata:
-  labels:
-    app.kubernetes.io/name: kube-prometheus-stack
-    kustomize.toolkit.fluxcd.io/name: kube-prometheus-stack
-    kustomize.toolkit.fluxcd.io/namespace: flux-system
-    prometheus: k8s
-    role: alert-rules
-  name: miscellaneous-rules
-  namespace: observability
-spec:
-  groups:
-  - name: dockerhub
-    rules:
-    - alert: BootstrapRateLimitRisk
-      annotations:
-        summary: Kubernetes cluster at risk of being rate limited by dockerhub on
-          bootstrap
-      expr: count(time() - container_last_seen{image=~"(docker.io).*",container!=""}
-        < 30) > 100
-      for: 15m
-      labels:
-        severity: critical
-  - name: oom
-    rules:
-    - alert: OOMKilled
-      annotations:
-        description: Container {{ $labels.container }} in pod {{ $labels.namespace
-          }}/{{ $labels.pod }} has been OOMKilled {{ $value }} times in the last 10
-          minutes.
-      expr: (kube_pod_container_status_restarts_total - kube_pod_container_status_restarts_total
-        offset 10m >= 1) and ignoring (reason) min_over_time(kube_pod_container_status_last_terminated_reason{reason="OOMKilled"}[10m])
-        == 1
-      labels:
-        severity: critical
-
--- kubernetes/main/apps/observability/kube-prometheus-stack/app Kustomization: flux-system/kube-prometheus-stack ScrapeConfig: observability/node-exporter

+++ kubernetes/main/apps/observability/kube-prometheus-stack/app Kustomization: flux-system/kube-prometheus-stack ScrapeConfig: observability/node-exporter

@@ -1,21 +0,0 @@

----
-apiVersion: monitoring.coreos.com/v1alpha1
-kind: ScrapeConfig
-metadata:
-  labels:
-    app.kubernetes.io/name: kube-prometheus-stack
-    kustomize.toolkit.fluxcd.io/name: kube-prometheus-stack
-    kustomize.toolkit.fluxcd.io/namespace: flux-system
-  name: node-exporter
-  namespace: observability
-spec:
-  metricsPath: /metrics
-  relabelings:
-  - action: replace
-    replacement: node-exporter
-    targetLabel: job
-  staticConfigs:
-  - targets:
-    - voyager.internal:9100
-    - pikvm.internal:9100
-
--- kubernetes/main/apps/observability/kube-prometheus-stack/app Kustomization: flux-system/kube-prometheus-stack ScrapeConfig: observability/smartctl-exporter

+++ kubernetes/main/apps/observability/kube-prometheus-stack/app Kustomization: flux-system/kube-prometheus-stack ScrapeConfig: observability/smartctl-exporter

@@ -1,20 +0,0 @@

----
-apiVersion: monitoring.coreos.com/v1alpha1
-kind: ScrapeConfig
-metadata:
-  labels:
-    app.kubernetes.io/name: kube-prometheus-stack
-    kustomize.toolkit.fluxcd.io/name: kube-prometheus-stack
-    kustomize.toolkit.fluxcd.io/namespace: flux-system
-  name: smartctl-exporter
-  namespace: observability
-spec:
-  metricsPath: /metrics
-  relabelings:
-  - action: replace
-    replacement: smartctl-exporter
-    targetLabel: job
-  staticConfigs:
-  - targets:
-    - voyager.internal:9633
-
--- kubernetes/main/apps/observability/kube-prometheus-stack/app Kustomization: flux-system/kube-prometheus-stack ScrapeConfig: observability/pikvm

+++ kubernetes/main/apps/observability/kube-prometheus-stack/app Kustomization: flux-system/kube-prometheus-stack ScrapeConfig: observability/pikvm

@@ -1,20 +0,0 @@

----
-apiVersion: monitoring.coreos.com/v1alpha1
-kind: ScrapeConfig
-metadata:
-  labels:
-    app.kubernetes.io/name: kube-prometheus-stack
-    kustomize.toolkit.fluxcd.io/name: kube-prometheus-stack
-    kustomize.toolkit.fluxcd.io/namespace: flux-system
-  name: pikvm
-  namespace: observability
-spec:
-  metricsPath: /api/export/prometheus/metrics
-  relabelings:
-  - action: replace
-    replacement: pikvm
-    targetLabel: job
-  staticConfigs:
-  - targets:
-    - pikvm.internal
-
--- kubernetes/main/apps/observability/kube-prometheus-stack/app Kustomization: flux-system/kube-prometheus-stack ScrapeConfig: observability/blocky

+++ kubernetes/main/apps/observability/kube-prometheus-stack/app Kustomization: flux-system/kube-prometheus-stack ScrapeConfig: observability/blocky

@@ -1,20 +0,0 @@

----
-apiVersion: monitoring.coreos.com/v1alpha1
-kind: ScrapeConfig
-metadata:
-  labels:
-    app.kubernetes.io/name: kube-prometheus-stack
-    kustomize.toolkit.fluxcd.io/name: kube-prometheus-stack
-    kustomize.toolkit.fluxcd.io/namespace: flux-system
-  name: blocky
-  namespace: observability
-spec:
-  metricsPath: /metrics
-  relabelings:
-  - action: replace
-    replacement: blocky
-    targetLabel: job
-  staticConfigs:
-  - targets:
-    - blocky.
-
--- kubernetes/main/apps/observability/kube-prometheus-stack/app Kustomization: flux-system/kube-prometheus-stack ScrapeConfig: observability/minio-job

+++ kubernetes/main/apps/observability/kube-prometheus-stack/app Kustomization: flux-system/kube-prometheus-stack ScrapeConfig: observability/minio-job

@@ -1,20 +0,0 @@

----
-apiVersion: monitoring.coreos.com/v1alpha1
-kind: ScrapeConfig
-metadata:
-  labels:
-    app.kubernetes.io/name: kube-prometheus-stack
-    kustomize.toolkit.fluxcd.io/name: kube-prometheus-stack
-    kustomize.toolkit.fluxcd.io/namespace: flux-system
-  name: minio-job
-  namespace: observability
-spec:
-  metricsPath: /minio/v2/metrics/cluster
-  relabelings:
-  - action: replace
-    replacement: minio-job
-    targetLabel: job
-  staticConfigs:
-  - targets:
-    - s3.
-
--- kubernetes/main/apps/observability/grafana/app Kustomization: flux-system/grafana HelmRelease: observability/grafana

+++ kubernetes/main/apps/observability/grafana/app Kustomization: flux-system/grafana HelmRelease: observability/grafana

@@ -97,19 +97,18 @@

           folder: System
           name: system
           options:
             path: /var/lib/grafana/dashboards/system
           orgId: 1
           type: file
-        - allowUiUpdates: true
-          disableDeletion: false
-          editable: true
-          folder: Thanos
-          name: thanos
-          options:
-            path: /var/lib/grafana/dashboards/thanos
+        - disableDeletion: false
+          editable: true
+          folder: VictoriaMetrics
+          name: victoriametrics
+          options:
+            path: /var/lib/grafana/dashboards/victoriametrics-folder
           orgId: 1
           type: file
     dashboards:
       data:
         crunchy-pgbackrest:
           datasource:
@@ -303,53 +302,50 @@

         spegel:
           datasource:
           - name: DS_PROMETHEUS
             value: Prometheus
           gnetId: 18089
           revision: 1
-      thanos:
-        thanos-bucket-replicate:
-          datasource: Prometheus
-          url: https://raw.githubusercontent.com/monitoring-mixins/website/master/assets/thanos/dashboards/bucket-replicate.json
-        thanos-compact:
-          datasource: Prometheus
-          url: https://raw.githubusercontent.com/monitoring-mixins/website/master/assets/thanos/dashboards/compact.json
-        thanos-overview:
-          datasource: Prometheus
-          url: https://raw.githubusercontent.com/monitoring-mixins/website/master/assets/thanos/dashboards/overview.json
-        thanos-query:
-          datasource: Prometheus
-          url: https://raw.githubusercontent.com/monitoring-mixins/website/master/assets/thanos/dashboards/query.json
-        thanos-query-frontend:
-          datasource: Prometheus
-          url: https://raw.githubusercontent.com/monitoring-mixins/website/master/assets/thanos/dashboards/query-frontend.json
-        thanos-receieve:
-          datasource: Prometheus
-          url: https://raw.githubusercontent.com/monitoring-mixins/website/master/assets/thanos/dashboards/receive.json
-        thanos-rule:
-          datasource: Prometheus
-          url: https://raw.githubusercontent.com/monitoring-mixins/website/master/assets/thanos/dashboards/rule.json
-        thanos-sidecar:
-          datasource: Prometheus
-          url: https://raw.githubusercontent.com/monitoring-mixins/website/master/assets/thanos/dashboards/sidecar.json
-        thanos-store:
-          datasource: Prometheus
-          url: https://raw.githubusercontent.com/monitoring-mixins/website/master/assets/thanos/dashboards/store.json
+      victoriametrics:
+        vm-cluster:
+          datasource:
+          - name: DS_PROMETHEUS
+            value: Prometheus
+          gnetId: 11176
+          revision: 37
+        vm-operator:
+          datasource:
+          - name: DS_PROMETHEUS
+            value: Prometheus
+          gnetId: 17869
+          revision: 2
+        vm-vmagent:
+          datasource:
+          - name: DS_PROMETHEUS
+            value: Prometheus
+          gnetId: 12683
+          revision: 18
+        vm-vmalert:
+          datasource:
+          - name: DS_PROMETHEUS
+            value: Prometheus
+          gnetId: 14950
+          revision: 11
     datasources:
       datasources.yaml:
         apiVersion: 1
         datasources:
         - access: proxy
           isDefault: true
           jsonData:
-            prometheusType: Thanos
+            prometheusType: Prometheus
             timeInterval: 1m
           name: Prometheus
           type: prometheus
           uid: prometheus
-          url: http://thanos-query-frontend.observability.svc.cluster.local:10902
+          url: http://vmsingle-victoria-metrics.observability.svc.cluster.local:8429
         - access: proxy
           jsonData:
             implementation: prometheus
           name: Alertmanager
           type: alertmanager
           url: http://alertmanager-operated.observability.svc.cluster.local:9093
--- kubernetes/main/apps/observability/thanos/app Kustomization: flux-system/thanos ObjectBucketClaim: observability/thanos-bucket

+++ kubernetes/main/apps/observability/thanos/app Kustomization: flux-system/thanos ObjectBucketClaim: observability/thanos-bucket

@@ -1,14 +0,0 @@

----
-apiVersion: objectbucket.io/v1alpha1
-kind: ObjectBucketClaim
-metadata:
-  labels:
-    app.kubernetes.io/name: thanos
-    kustomize.toolkit.fluxcd.io/name: thanos
-    kustomize.toolkit.fluxcd.io/namespace: flux-system
-  name: thanos-bucket
-  namespace: observability
-spec:
-  bucketName: thanos
-  storageClassName: ceph-bucket
-
--- kubernetes/main/apps/observability/thanos/app Kustomization: flux-system/thanos HelmRelease: observability/thanos

+++ kubernetes/main/apps/observability/thanos/app Kustomization: flux-system/thanos HelmRelease: observability/thanos

@@ -1,149 +0,0 @@

----
-apiVersion: helm.toolkit.fluxcd.io/v2
-kind: HelmRelease
-metadata:
-  labels:
-    app.kubernetes.io/name: thanos
-    kustomize.toolkit.fluxcd.io/name: thanos
-    kustomize.toolkit.fluxcd.io/namespace: flux-system
-  name: thanos
-  namespace: observability
-spec:
-  chart:
-    spec:
-      chart: thanos
-      sourceRef:
-        kind: HelmRepository
-        name: stevehipwell
-        namespace: flux-system
-      version: 1.17.2
-  dependsOn:
-  - name: openebs
-    namespace: storage
-  - name: rook-ceph-cluster
-    namespace: rook-ceph
-  install:
-    remediation:
-      retries: 3
-  interval: 30m
-  timeout: 15m
-  upgrade:
-    cleanupOnFail: true
-    remediation:
-      retries: 3
-      strategy: rollback
-  values:
-    additionalEndpoints:
-    - dnssrv+_grpc._tcp.kube-prometheus-stack-thanos-discovery.observability.svc.cluster.local
-    additionalReplicaLabels:
-    - __replica__
-    compact:
-      enabled: true
-      extraArgs:
-      - --compact.concurrency=4
-      - --delete-delay=30m
-      - --retention.resolution-raw=14d
-      - --retention.resolution-5m=30d
-      - --retention.resolution-1h=60d
-      persistence:
-        enabled: true
-        size: 20Gi
-        storageClass: ceph-block
-    objstoreConfig:
-      value:
-        config:
-          insecure: true
-        type: s3
-    query:
-      additionalStores:
-      - thanos-svc.:10901
-      extraArgs:
-      - --alert.query-url=https://thanos.
-      replicas: 2
-    queryFrontend:
-      enabled: true
-      extraArgs:
-      - --query-range.response-cache-config=$(THANOS_CACHE_CONFIG)
-      extraEnv:
-      - name: THANOS_CACHE_CONFIG
-        valueFrom:
-          configMapKeyRef:
-            key: cache.yaml
-            name: thanos-cache-configmap
-      ingress:
-        annotations:
-          external-dns.alpha.kubernetes.io/target: internal.
-        enabled: true
-        hosts:
-        - thanos.
-        ingressClassName: internal
-      podAnnotations:
-        configmap.reloader.stakater.com/reload: thanos-cache-configmap
-      replicas: 2
-    rule:
-      alertmanagersConfig:
-        value: |-
-          alertmanagers:
-            - api_version: v2
-              static_configs:
-                - dnssrv+_http-web._tcp.alertmanager-operated.observability.svc.cluster.local
-      enabled: true
-      extraArgs:
-      - --web.prefix-header=X-Forwarded-Prefix
-      persistence:
-        enabled: true
-        size: 20Gi
-        storageClass: ceph-block
-      replicas: 2
-      rules:
-        value: |-
-          groups:
-            - name: PrometheusWatcher
-              rules:
-                - alert: PrometheusDown
-                  annotations:
-                    summary: A Prometheus has disappeared from Prometheus target discovery
-                  expr: absent(up{job="kube-prometheus-stack-prometheus"})
-                  for: 5m
-                  labels:
-                    severity: critical
-    serviceMonitor:
-      enabled: true
-    storeGateway:
-      extraArgs:
-      - --index-cache.config=$(THANOS_CACHE_CONFIG)
-      extraEnv:
-      - name: THANOS_CACHE_CONFIG
-        valueFrom:
-          configMapKeyRef:
-            key: cache.yaml
-            name: thanos-cache-configmap
-      persistence:
-        enabled: true
-        size: 20Gi
-        storageClass: ceph-block
-      podAnnotations:
-        configmap.reloader.stakater.com/reload: thanos-cache-configmap
-      replicas: 2
-  valuesFrom:
-  - kind: ConfigMap
-    name: thanos-bucket
-    targetPath: objstoreConfig.value.config.bucket
-    valuesKey: BUCKET_NAME
-  - kind: ConfigMap
-    name: thanos-bucket
-    targetPath: objstoreConfig.value.config.endpoint
-    valuesKey: BUCKET_HOST
-  - kind: ConfigMap
-    name: thanos-bucket
-    targetPath: objstoreConfig.value.config.region
-    valuesKey: BUCKET_REGION
-  - kind: Secret
-    name: thanos-bucket
-    targetPath: objstoreConfig.value.config.access_key
-    valuesKey: AWS_ACCESS_KEY_ID
-  - kind: Secret
-    name: thanos-bucket
-    targetPath: objstoreConfig.value.config.secret_key
-    valuesKey: AWS_SECRET_ACCESS_KEY
-
--- kubernetes/main/apps/observability/thanos/app Kustomization: flux-system/thanos ConfigMap: observability/thanos-cache-configmap

+++ kubernetes/main/apps/observability/thanos/app Kustomization: flux-system/thanos ConfigMap: observability/thanos-cache-configmap

@@ -1,18 +0,0 @@

----
-apiVersion: v1
-data:
-  cache.yaml: |
-    ---
-    type: REDIS
-    config:
-      addr: dragonfly.database.svc.cluster.local:6379
-      db: 2
-kind: ConfigMap
-metadata:
-  labels:
-    app.kubernetes.io/name: thanos
-    kustomize.toolkit.fluxcd.io/name: thanos
-    kustomize.toolkit.fluxcd.io/namespace: flux-system
-  name: thanos-cache-configmap
-  namespace: observability
-
--- kubernetes/main/apps/observability/victoria-metrics/app Kustomization: flux-system/victoria-metrics HelmRelease: observability/victoria-metrics

+++ kubernetes/main/apps/observability/victoria-metrics/app Kustomization: flux-system/victoria-metrics HelmRelease: observability/victoria-metrics

@@ -0,0 +1,224 @@

+---
+apiVersion: helm.toolkit.fluxcd.io/v2
+kind: HelmRelease
+metadata:
+  labels:
+    app.kubernetes.io/name: victoria-metrics
+    kustomize.toolkit.fluxcd.io/name: victoria-metrics
+    kustomize.toolkit.fluxcd.io/namespace: flux-system
+  name: victoria-metrics
+  namespace: observability
+spec:
+  chart:
+    spec:
+      chart: victoria-metrics-k8s-stack
+      sourceRef:
+        kind: HelmRepository
+        name: victoria-metrics
+        namespace: flux-system
+      version: 0.23.2
+  install:
+    remediation:
+      retries: 3
+  interval: 30m
+  upgrade:
+    cleanupOnFail: true
+    remediation:
+      retries: 3
+      strategy: rollback
+  values:
+    alertmanager:
+      enabled: true
+      ingress:
+        annotations:
+          external-dns.alpha.kubernetes.io/target: internal.${SECRET_DOMAIN}
+        enabled: true
+        hosts:
+        - alertmanager.${SECRET_DOMAIN}
+        ingressClassName: internal
+        pathType: Prefix
+      spec:
+        configSecret: alertmanager-secret
+        replicaCount: 2
+        storage:
+          volumeClaimTemplate:
+            spec:
+              resources:
+                requests:
+                  storage: 1Gi
+              storageClassName: ceph-block
+    coreDns:
+      enabled: true
+    defaultDashboardsEnabled: true
+    defaultRules:
+      create: true
+      rules:
+        alertmanager: true
+        etcd: true
+        general: true
+        k8s: true
+        kubeApiserver: true
+        kubeApiserverAvailability: true
+        kubeApiserverBurnrate: true
+        kubeApiserverHistogram: true
+        kubeApiserverSlos: true
+        kubePrometheusGeneral: true
+        kubePrometheusNodeRecording: true
+        kubeScheduler: true
+        kubeStateMetrics: true
+        kubelet: true
+        kubernetesApps: true
+        kubernetesResources: true
+        kubernetesStorage: true
+        kubernetesSystem: true
+        network: true
+        node: true
+        vmagent: true
+        vmhealth: true
+        vmsingle: true
+    experimentalDashboardsEnabled: true
+    fullnameOverride: victoria-metrics
+    grafana:
+      enabled: false
+      forceDeployDashboards: true
+      sidecar:
+        dashboards:
+          annotations:
+            grafana_folder: Kubernetes
+          multicluster:
+            etcd:
+              enabled: true
+    kube-state-metrics:
+      enabled: true
+      fullnameOverride: kube-state-metrics
+      metricLabelsAllowlist:
+      - pods=[*]
+      - deployments=[*]
+      - persistentvolumeclaims=[*]
+    kubeApiServer:
+      enabled: true
+    kubeControllerManager:
+      enabled: true
+      endpoints:
+      - 10.69.1.21
+      - 10.69.1.22
+      - 10.69.1.23
+    kubeDns:
+      enabled: false
+    kubeEtcd:
+      enabled: true
+      endpoints:
+      - 10.69.1.21
+      - 10.69.1.22
+      - 10.69.1.23
+    kubeProxy:
+      enabled: false
+    kubeScheduler:
+      enabled: true
+      endpoints:
+      - 10.69.1.21
+      - 10.69.1.22
+      - 10.69.1.23
+    kubelet:
+      enabled: true
+    prometheus-node-exporter:
+      enabled: true
+      fullnameOverride: node-exporter
+      prometheus:
+        monitor:
+          enabled: true
+          relabelings:
+          - action: replace
+            regex: (.*)
+            replacement: $1
+            sourceLabels:
+            - __meta_kubernetes_pod_node_name
+            targetLabel: kubernetes_node
+    victoria-metrics-operator:
+      enabled: true
+      operator:
+        disable_prometheus_converter: false
+        enable_converter_ownership: true
+    vmagent:
+      enabled: true
+      ingress:
+        annotations:
+          external-dns.alpha.kubernetes.io/target: internal.${SECRET_DOMAIN}
+        enabled: true
+        hosts:
+        - vmagent.${SECRET_DOMAIN}
+        ingressClassName: internal
+      spec:
+        additionalScrapeConfigs:
+          key: prometheus-additional.yaml
+          name: vm-additional-scrape-configs
+        externalLabels:
+          cluster: ${CLUSTER_NAME}
+        replicaCount: 1
+        resources:
+          limits:
+            cpu: 400m
+            memory: 512Mi
+          requests:
+            cpu: 50m
+            memory: 256Mi
+        scrapeInterval: 30s
+        shardCount: 2
+        topologySpreadConstraints:
+        - labelSelector:
+            matchLabels:
+              app.kubernetes.io/name: vmagent
+          maxSkew: 1
+          topologyKey: kubernetes.io/hostname
+          whenUnsatisfiable: DoNotSchedule
+    vmalert:
+      enabled: true
+      ingress:
+        annotations:
+          external-dns.alpha.kubernetes.io/target: internal.${SECRET_DOMAIN}
+        enabled: true
+        hosts:
+        - vmalert.${SECRET_DOMAIN}
+        ingressClassName: internal
+      spec:
+        extraArgs:
+          external.url: https://vmalert.${SECRET_DOMAIN}
+        replicaCount: 2
+        resources:
+          limits:
+            cpu: 150m
+            memory: 256Mi
+          requests:
+            cpu: 50m
+            memory: 128Mi
+        topologySpreadConstraints:
+        - labelSelector:
+            matchLabels:
+              app.kubernetes.io/name: vmalert
+          maxSkew: 1
+          topologyKey: kubernetes.io/hostname
+          whenUnsatisfiable: DoNotSchedule
+    vmsingle:
+      enabled: true
+      ingress:
+        annotations:
+          external-dns.alpha.kubernetes.io/target: internal.${SECRET_DOMAIN}
+        enabled: true
+        hosts:
+        - victoria-metrics.${SECRET_DOMAIN}
+        ingressClassName: internal
+      spec:
+        extraArgs:
+          dedup.minScrapeInterval: 30s
+          maxLabelsPerTimeseries: '90'
+          search.minStalenessInterval: 5m
+          vmalert.proxyURL: http://vmalert-victoria-metrics.observability.svc.cluster.local:8080
+        retentionPeriod: 1y
+        storage:
+          accessModes:
+          - ReadWriteOnce
+          resources:
+            requests:
+              storage: 50Gi
+          storageClassName: ceph-block
+

@joryirving joryirving merged commit 4baa9d2 into main Jun 19, 2024
12 of 16 checks passed
@joryirving joryirving deleted the feat/vmetrics-again branch June 19, 2024 18:07