2 changes: 1 addition & 1 deletion CHANGELOG.md
@@ -444,7 +444,7 @@
removed.

Instead, all services that expose metrics will now create `ServiceMonitor`
resources, if their helm chart is applied with `metrics.serviceMonitor.enable`
resources, if their helm chart is applied with `metrics.serviceMonitor.enabled`
set to true.

This prevents scraping agents from querying services that don't expose metrics
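For illustration, enabling this through a chart's values override looks roughly like the following (a sketch; the exact nesting around the `metrics` key depends on the individual chart):

```yaml
# Hypothetical override for a metrics-exposing chart;
# the surrounding structure may differ per chart.
metrics:
  serviceMonitor:
    enabled: true
```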
4 changes: 2 additions & 2 deletions Makefile
@@ -7,13 +7,13 @@ DOCKER_TAG ?= $(USER)
# default helm chart version must be 0.0.42 for local development (because 42 is the answer to the universe and everything)
HELM_SEMVER ?= 0.0.42
# The list of helm charts needed on internal kubernetes testing environments
CHARTS_INTEGRATION := wire-server databases-ephemeral redis-cluster fake-aws nginx-ingress-controller nginx-ingress-services wire-server-metrics fluent-bit kibana sftd restund coturn
CHARTS_INTEGRATION := wire-server databases-ephemeral redis-cluster fake-aws nginx-ingress-controller nginx-ingress-services fluent-bit kibana sftd restund coturn
# The list of helm charts to publish on S3
# FUTUREWORK: after we "inline local subcharts",
# (e.g. move charts/brig to charts/wire-server/brig)
# this list could be generated from the folder names under ./charts/ like so:
# CHARTS_RELEASE := $(shell find charts/ -maxdepth 1 -type d | xargs -n 1 basename | grep -v charts)
CHARTS_RELEASE := wire-server redis-ephemeral redis-cluster databases-ephemeral fake-aws fake-aws-s3 fake-aws-sqs aws-ingress fluent-bit kibana backoffice calling-test demo-smtp elasticsearch-curator elasticsearch-external elasticsearch-ephemeral minio-external cassandra-external nginx-ingress-controller nginx-ingress-services reaper wire-server-metrics sftd restund coturn inbucket
CHARTS_RELEASE := wire-server redis-ephemeral redis-cluster databases-ephemeral fake-aws fake-aws-s3 fake-aws-sqs aws-ingress fluent-bit kibana backoffice calling-test demo-smtp elasticsearch-curator elasticsearch-external elasticsearch-ephemeral minio-external cassandra-external nginx-ingress-controller nginx-ingress-services reaper sftd restund coturn inbucket
BUILDAH_PUSH ?= 0
KIND_CLUSTER_NAME := wire-server
BUILDAH_KIND_LOAD ?= 1
5 changes: 5 additions & 0 deletions changelog.d/0-release-notes/wire-server-metrics-removal
@@ -0,0 +1,5 @@
The experimental wire-server-metrics helm chart has been removed.

The chart was mostly a wrapper around the Prometheus Operator. It makes more sense to
refer to the upstream docs of the Prometheus Operator or the Grafana Agent Operator for
installation instead.
5 changes: 0 additions & 5 deletions charts/nginx-ingress-services/README.md
@@ -52,8 +52,3 @@ A: Ensure that your certificate is _valid_ and has _not expired_; trying to serv

* the `apiVersion` of all resources based on cert-manager's CRDs, namely `./templates/issuer.yaml` and
`./templates/certificate.yaml`, has to be changed to `cert-manager.io/v1alpha3`


### Monitoring

__FUTUREWORK:__ When `wire-server-metrics` is ready, expiration & renewal should be integrated into monitoring.
21 changes: 0 additions & 21 deletions charts/wire-server-metrics/.helmignore

This file was deleted.

5 changes: 0 additions & 5 deletions charts/wire-server-metrics/Chart.yaml

This file was deleted.

15 changes: 0 additions & 15 deletions charts/wire-server-metrics/README.md

This file was deleted.

6 changes: 0 additions & 6 deletions charts/wire-server-metrics/requirements.yaml

This file was deleted.

This file was deleted.

39 changes: 0 additions & 39 deletions charts/wire-server-metrics/values.yaml

This file was deleted.

218 changes: 11 additions & 207 deletions docs/src/how-to/install/monitoring.rst
@@ -3,215 +3,19 @@
Monitoring wire-server using Prometheus and Grafana
=======================================================

Introduction
------------
All wire-server helm charts offering Prometheus metrics expose a
``metrics.serviceMonitor.enabled`` option.

The following instructions detail the installation of a monitoring
system consisting of a Prometheus instance and corresponding Alert
Manager in addition to a Grafana instance for viewing dashboards related
to cluster and wire-services health.
If this option is set to true, the helm charts will install ``ServiceMonitor``
resources, which can be used to mark services for scraping by the
`Prometheus Operator <https://prometheus-operator.dev/>`__,
the `Grafana Agent Operator <https://grafana.com/docs/grafana-cloud/kubernetes-monitoring/agent-k8s/>`__,
or similar Prometheus-compatible tools.
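As a rough illustration, a ``ServiceMonitor`` produced by such a chart has this shape (the
name, label selector, port, and path below are placeholders, not the actual chart output):

.. code:: yaml

   # Illustrative only: names, labels, port and path are placeholders.
   apiVersion: monitoring.coreos.com/v1
   kind: ServiceMonitor
   metadata:
     name: brig
   spec:
     selector:
       matchLabels:
         app: brig
     endpoints:
       - port: http
         path: /metrics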

Prerequisites
-------------
Refer to their documentation for installation.
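As a hedged example, installing the upstream Prometheus Operator stack with helm typically
looks like this (release and namespace names are arbitrary; consult the upstream
documentation for current instructions):

.. code:: bash

   helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
   helm repo update
   # kube-prometheus-stack bundles the Prometheus Operator, Prometheus, Alertmanager and Grafana
   helm upgrade --install monitoring prometheus-community/kube-prometheus-stack \
     --namespace monitoring --create-namespace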

You need to have wire-server installed, see either of

* :ref:`helm`
* :ref:`helm_prod`.

How to install Prometheus and Grafana on Kubernetes using Helm
---------------------------------------------------------------

.. note::

The following makes use of overrides for helm charts. You may wish to read :ref:`understand-helm-overrides` first.

Create an override file:

.. code:: bash

   mkdir -p wire-server-metrics
   curl -sSL https://raw.githubusercontent.com/wireapp/wire-server-deploy/master/values/wire-server-metrics/demo-values.example.yaml > wire-server-metrics/values.yaml

Then edit this file, uncommenting and adjusting values as needed according to the following sections.

The monitoring system requires disk space if you wish to be resilient to
pod failure. This disk space is given to pods by using a so-called "Storage Class". You have three options:

* (1) If you deploy on a kubernetes cluster hosted on AWS you may install the ``aws-storage`` helm chart which provides configurations of Storage Classes for AWS's elastic block storage (EBS). For this, install the aws storage classes with ``helm upgrade --install aws-storage wire/aws-storage --wait``.
* (2) If you're not using AWS, but you still want to have persistent metrics, see :ref:`using-custom-storage-classes`.
* (3) If you don't want persistence at all, see :ref:`using-no-storage-classes`.

Once you have a storage class configured (or have added the overrides to disable persistence), you can install the monitoring suite itself.

There are a few known issues surrounding the ``prometheus-operator``
helm chart.

You will likely have to install the Custom Resource Definitions manually
before installing the ``wire-server-metrics`` chart:

::

   kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/d34d70de61fe8e23bb21f6948993c510496a0b31/example/prometheus-operator-crd/alertmanager.crd.yaml
   kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/d34d70de61fe8e23bb21f6948993c510496a0b31/example/prometheus-operator-crd/prometheus.crd.yaml
   kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/d34d70de61fe8e23bb21f6948993c510496a0b31/example/prometheus-operator-crd/prometheusrule.crd.yaml
   kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/d34d70de61fe8e23bb21f6948993c510496a0b31/example/prometheus-operator-crd/servicemonitor.crd.yaml

Now we can install the metrics chart by running the following::

   helm upgrade --install wire-server-metrics wire/wire-server-metrics --wait -f wire-server-metrics/values.yaml

See the `Prometheus Operator
README <https://github.com/helm/charts/tree/master/stable/prometheus-operator#work-arounds-for-known-issues>`__
for more information and troubleshooting help.

Adding Dashboards
-----------------

Grafana dashboard configurations are included as JSON inside the
``charts/wire-server-metrics/dashboards`` directory. You may import
these via Grafana's web UI. See `Accessing
grafana <#accessing-grafana>`__.

Monitoring in a separate namespace
----------------------------------

It is advisable to separate your monitoring services from your
application services. To accomplish this you may deploy
``wire-server-metrics`` into a separate namespace from ``wire-server``.
Simply provide a different namespace to the ``helm upgrade --install``
calls with ``--namespace your-desired-namespace``.

The wire-server-metrics chart will monitor all wire services across *all* namespaces.
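For example, using a hypothetical ``monitoring`` namespace::

   helm upgrade --install wire-server-metrics wire/wire-server-metrics --wait \
     -f wire-server-metrics/values.yaml --namespace monitoring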

Accessing grafana
Dashboards
-----------------

Forward a port from your localhost to the grafana service running in
your cluster:

::

kubectl port-forward service/<release-name>-grafana 3000:80 -n <namespace>

Now you can access Grafana at ``http://localhost:3000``.

The username and password are stored in the ``grafana`` secret of your
namespace.

By default, these are:

- username: ``admin``
- password: ``admin``
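If the credentials have been changed, you can read them from the secret directly (a sketch;
the ``admin-user``/``admin-password`` data keys are the Grafana chart's usual keys, and the
secret may be prefixed with your release name)::

   kubectl get secret grafana -n <namespace> \
     -o jsonpath='{.data.admin-password}' | base64 --decode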

Accessing prometheus
--------------------

Forward a port from your localhost to the prometheus service running in
your cluster:

::

kubectl port-forward service/<release-name>-prometheus 9090:9090 -n <namespace>

Now you can access Prometheus at ``http://localhost:9090``.
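With the port-forward in place, you can also hit Prometheus' HTTP API directly as a quick
sanity check, e.g. to confirm that targets are up (this is plain Prometheus functionality,
not something provided by the chart)::

   curl 'http://localhost:9090/api/v1/query?query=up'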


Customization
---------------

.. _using-no-storage-classes:

Monitoring without persistent disk
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If you wish to deploy monitoring without any persistent disk (not
recommended) you may add the following overrides to your ``values.yaml``
file.

.. code:: yaml

   # This configuration switches to use memory instead of disk for metrics services
   # NOTE: If the pods are killed you WILL lose all your metrics history
   kube-prometheus-stack:
     grafana:
       persistence:
         enabled: false
     prometheus:
       prometheusSpec:
         storageSpec: null
     alertmanager:
       alertmanagerSpec:
         storage: null

.. _using-custom-storage-classes:

Using Custom Storage Classes
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If you're using a provider other than AWS please reference the
`Kubernetes documentation on storage
classes <https://kubernetes.io/docs/concepts/storage/storage-classes/>`__
for configuring a storage class for your kubernetes cluster.

If you wish to use a different storage class (for instance if you don't
run on AWS) you may add the following overrides to your ``values.yaml``
file.

.. code:: yaml

   kube-prometheus-stack:
     grafana:
       persistence:
         storageClassName: "<my-storage-class>"
     prometheus:
       prometheusSpec:
         storageSpec:
           volumeClaimTemplate:
             spec:
               storageClassName: "<my-storage-class>"
     alertmanager:
       alertmanagerSpec:
         storage:
           volumeClaimTemplate:
             spec:
               storageClassName: "<my-storage-class>"


Troubleshooting
---------------

"validation failed"
^^^^^^^^^^^^^^^^^^^^^

If you receive the following error:

::

Error: validation failed: [unable to recognize "": no matches for kind "Alertmanager" in version
"monitoring.coreos.com/v1", unable to recognize "": no matches for kind "Prometheus" in version
"monitoring.coreos.com/v1", unable to recognize "": no matches for kind "PrometheusRule" in version

Apply the Custom Resource Definitions manually, as detailed in the
installation instructions above.

"object is being deleted"
^^^^^^^^^^^^^^^^^^^^^^^^^^

When upgrading you may see the following error:

::

Error: object is being deleted: customresourcedefinitions.apiextensions.k8s.io "prometheusrules.monitoring.coreos.com" already exists

Helm sometimes has trouble cleaning up or defining Custom Resource
Definitions. Try deleting the resource definitions manually and then running
your helm install again:

::

   kubectl delete customresourcedefinitions \
     alertmanagers.monitoring.coreos.com \
     prometheuses.monitoring.coreos.com \
     servicemonitors.monitoring.coreos.com \
     prometheusrules.monitoring.coreos.com
Grafana dashboard configurations are included as JSON inside the ``dashboards``
directory. You may import these via Grafana's web UI.
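If you prefer not to click through the web UI, dashboards can also be pushed through
Grafana's HTTP API. A minimal sketch, assuming a port-forward to Grafana on
``localhost:3000``, default credentials, and a hypothetical ``my-dashboard.json`` file:

.. code:: bash

   # Wrap the raw dashboard JSON in the payload the API expects, then POST it.
   jq '{dashboard: ., overwrite: true}' my-dashboard.json \
     | curl -u admin:admin -H 'Content-Type: application/json' \
         -d @- http://localhost:3000/api/dashboards/db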