Harden elasticsearch chart for Kube 1.5 (#1062)
* Update elasticsearch chart to work with Kube 1.5

* Add environment variable KUBERNETES_MASTER; resolves the issue documented here:
fabric8io/fabric8#6229 (comment)
* Rename PetSet to StatefulSet, rename template file
* Add initialDelay and increase timeouts to all liveness and readiness
checks. This was the only way I could get it to deploy reliably in my
environment.
* Update to a newer image version

* Harden aspects of the elasticsearch chart

* Added configmap to explicitly provide cluster configurations and scripts

* Replace deprecated `ES_HEAP_SIZE` with `ES_JAVA_OPTS` to position for ES v5 support

* Removed alpha storage class operators

* Removed the catastrophic liveness probe that checked the entire cluster's health

* Readiness probe now inspects local node health

* Added termination grace period (defaults to 15m) to allow pre-stop-script.sh time to gracefully migrate shards

* Added init container to configure `vm.max_map_count`

* Updated elasticsearch.yaml:
  * Added `PROCESSORS` configuration to prevent large cluster garbage collection issues leading to node eviction
  * Added configurable gateway defaults to help avoid a split brain, requiring two masters online and in consensus before recovery can continue

* Updated pre-stop-script.sh:
  * Check `v1beta1` `statefulset` endpoint
  * Evaluate `.spec.replicas` for the statefulset's desired size
  * Clear the `_cluster/settings` IP exclusion prior to shutdown to avoid a possible (random) IP match scenario on expansion of the cluster (a minimal sketch of this step appears after this list)

* Data nodes now use the default storage class if one is not specified

* Apply best practices

* Add Notes for client service types, and warnings
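
A minimal sketch of the shutdown step referenced in the pre-stop-script.sh notes above: clearing the transient IP exclusion before the node stops. It assumes the stock `_cluster/settings` API on the local port 9200; the actual script's hostnames, retries, and shard-migration handling are not reproduced here.

```
#!/bin/sh
# Hypothetical pre-stop fragment: reset the transient allocation exclusion so a
# future pod that happens to receive the same IP is not blocked from holding
# shards. The endpoint and port are assumptions, not copied from the chart's script.
curl -s -X PUT "http://localhost:9200/_cluster/settings" \
  -H 'Content-Type: application/json' \
  -d '{"transient": {"cluster.routing.allocation.exclude._ip": ""}}'
```
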
icereval authored and unguiculus committed Jul 5, 2017
1 parent eb6d3e2 commit 09892a3
Showing 16 changed files with 510 additions and 272 deletions.
6 changes: 5 additions & 1 deletion incubator/elasticsearch/Chart.yaml
@@ -1,11 +1,15 @@
name: elasticsearch
home: https://www.elastic.co/products/elasticsearch
version: 0.1.4
version: 0.1.6
description: Flexible and powerful open source, distributed real-time search and analytics engine.
icon: https://static-www.elastic.co/assets/blteb1c97719574938d/logo-elastic-elasticsearch-lt.svg
sources:
- https://www.elastic.co/products/elasticsearch
- https://github.com/jetstack/elasticsearch-pet
- https://github.com/giantswarm/kubernetes-elastic-stack
- https://github.com/GoogleCloudPlatform/elasticsearch-docker
maintainers:
- name: Christian Simon
email: [email protected]
- name: Michael Haselton
email: [email protected]
69 changes: 31 additions & 38 deletions incubator/elasticsearch/README.md
@@ -7,14 +7,14 @@ elasticsearch and their

## Prerequisites Details

* Kubernetes 1.3 with alpha APIs enabled
* Kubernetes 1.5
* PV dynamic provisioning support on the underlying infrastructure

## PetSet Details
* http://kubernetes.io/docs/user-guide/petset/
## StatefulSets Details
* https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/

## PetSet Caveats
* http://kubernetes.io/docs/user-guide/petset/#alpha-limitations
## StatefulSets Caveats
* https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#limitations

## Todo

@@ -25,9 +25,9 @@ elasticsearch and their
## Chart Details
This chart will do the following:

* Implemented a dynamically scalable elasticsearch cluster using Kubernetes PetSets/Deployments
* Implemented a dynamically scalable elasticsearch cluster using Kubernetes StatefulSets/Deployments
* Multi-role deployment: master, client and data nodes
* PetSet Supports scaling down without degrading the cluster
* StatefulSet supports scaling down without degrading the cluster

## Installing the Chart

@@ -51,33 +51,27 @@ $ kubectl delete pvcs -l release=my-release,type=data

The following table lists the configurable parameters of the elasticsearch chart and their default values.

| Parameter | Description | Default |
|---------------------------|-----------------------------------|----------------------------------------------------------|
| `Image` | Container image name | `jetstack/elasticsearch-pet` |
| `ImageTag` | Container image tag | `2.3.4` |
| `ImagePullPolicy` | Container pull policy | `Always` |
| `ClientReplicas` | Client node replicas (deployment) | `2` |
| `ClientCpuRequests` | Client node requested cpu | `25m` |
| `ClientMemoryRequests` | Client node requested memory | `256Mi` |
| `ClientCpuLimits` | Client node requested cpu | `100m` |
| `ClientMemoryLimits` | Client node requested memory | `512Mi` |
| `ClientHeapSize` | Client node heap size | `128m` |
| `MasterReplicas` | Master node replicas (deployment) | `2` |
| `MasterCpuRequests` | Master node requested cpu | `25m` |
| `MasterMemoryRequests` | Master node requested memory | `256Mi` |
| `MasterCpuLimits` | Master node requested cpu | `100m` |
| `MasterMemoryLimits` | Master node requested memory | `512Mi` |
| `MasterHeapSize` | Master node heap size | `128m` |
| `DataReplicas` | Data node replicas (petset) | `3` |
| `DataCpuRequests` | Data node requested cpu | `250m` |
| `DataMemoryRequests` | Data node requested memory | `2Gi` |
| `DataCpuLimits` | Data node requested cpu | `1` |
| `DataMemoryLimits` | Data node requested memory | `4Gi` |
| `DataHeapSize` | Data node heap size | `1536m` |
| `DataStorage` | Data persistent volume size | `30Gi` |
| `DataStorageClass` | Data persistent volume Class | `anything` |
| `DataStorageClassVersion` | Version of StorageClass | `alpha` |
| `Component` | Selector Key | `elasticsearch` |
| Parameter | Description | Default |
| ------------------------------------ | --------------------------------------- | ----------------------------------- |
| `image.repository` | Container image name | `jetstack/elasticsearch-pet` |
| `image.tag` | Container image tag | `2.4.0` |
| `image.pullPolicy` | Container pull policy | `Always` |
| `client.name` | Client component name | `client` |
| `client.replicas` | Client node replicas (deployment) | `2` |
| `client.resources` | Client node resources requests & limits | `{} - cpu limit must be an integer` |
| `client.heapSize` | Client node heap size | `128m` |
| `client.serviceType` | Client service type | `ClusterIP` |
| `master.name` | Master component name | `master` |
| `master.replicas` | Master node replicas (deployment) | `2` |
| `master.resources` | Master node resources requests & limits | `{} - cpu limit must be an integer` |
| `master.heapSize` | Master node heap size | `128m` |
| `data.name`                           | Data component name                      | `data`                              |
| `data.replicas` | Data node replicas (statefulset) | `3` |
| `data.resources` | Data node resources requests & limits | `{} - cpu limit must be an integer` |
| `data.heapSize` | Data node heap size | `1536m` |
| `data.storage` | Data persistent volume size | `30Gi` |
| `data.storageClass` | Data persistent volume Class | `nil` |
| `data.terminationGracePeriodSeconds` | Data termination grace period (seconds) | `3600` |

Specify each parameter using the `--set key=value[,key=value]` argument to `helm install`.
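
For example, a sketch that overrides a couple of the values listed above; the release name and the chosen overrides are illustrative only:

```
$ helm install incubator/elasticsearch --name my-release \
    --set data.replicas=5,data.heapSize=2048m
```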

@@ -102,7 +96,7 @@ would degrade performance heavily. The issue is tracked in

## Select right storage class for SSD volumes

### GCE + Kubernetes 1.4
### GCE + Kubernetes 1.5

Create StorageClass for SSD-PD

@@ -117,9 +111,8 @@ parameters:
type: pd-ssd
EOF
```
Create cluster with Storage class `ssd` on Kubernetes 1.4+
Create cluster with Storage class `ssd` on Kubernetes 1.5+

```
$ helm install incubator/elasticsearch --name my-release --set DataStorageClass=ssd,DataStorageClassVersion=beta
$ helm install incubator/elasticsearch --name my-release --set data.storageClass=ssd,data.storage=100Gi
```
31 changes: 31 additions & 0 deletions incubator/elasticsearch/templates/NOTES.txt
@@ -0,0 +1,31 @@
The elasticsearch cluster has been installed.

Elasticsearch can be accessed:

* Within your cluster, at the following DNS name at port 9200:

{{ template "client.fullname" . }}.{{ .Release.Namespace }}.svc.cluster.local

* From outside the cluster, run these commands in the same shell:
{{- if contains "NodePort" .Values.client.serviceType }}

export NODE_PORT=$(kubectl get --namespace {{ .Release.Namespace }} -o jsonpath="{.spec.ports[0].nodePort}" services {{ template "client.fullname" . }})
export NODE_IP=$(kubectl get nodes --namespace {{ .Release.Namespace }} -o jsonpath="{.items[0].status.addresses[0].address}")
echo http://$NODE_IP:$NODE_PORT
{{- else if contains "LoadBalancer" .Values.client.serviceType }}

WARNING: You have likely exposed your Elasticsearch cluster directly to the internet.
Elasticsearch does not implement any security for public-facing clusters by default.
As a minimum level of security, switch to ClusterIP/NodePort and place an Nginx gateway in front of the cluster to lock down access to dangerous HTTP endpoints and verbs.

NOTE: It may take a few minutes for the LoadBalancer IP to be available.
You can watch the status of it by running 'kubectl get svc -w {{ template "client.fullname" . }}'

export SERVICE_IP=$(kubectl get svc --namespace {{ .Release.Namespace }} {{ template "client.fullname" . }} -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo http://$SERVICE_IP:9200
{{- else if contains "ClusterIP" .Values.client.serviceType }}

export POD_NAME=$(kubectl get pods --namespace {{ .Release.Namespace }} -l "app={{ template "name" . }},component={{ .Values.client.name }},release={{ .Release.Name }}" -o jsonpath="{.items[0].metadata.name}")
echo "Visit http://127.0.0.1:9200 to use Elasticsearch"
kubectl port-forward --namespace {{ .Release.Namespace }} $POD_NAME 9200:9200
{{- end }}
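
As a quick check once one of the access methods above is in place (for instance, the ClusterIP port-forward), the cluster health endpoint can be queried; the URL below assumes the port-forward example and is illustrative:

```
$ curl "http://127.0.0.1:9200/_cluster/health?pretty"
```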
43 changes: 43 additions & 0 deletions incubator/elasticsearch/templates/_helpers.tpl
@@ -0,0 +1,43 @@
{{/* vim: set filetype=mustache: */}}
{{/*
Expand the name of the chart.
*/}}
{{- define "name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" -}}
{{- end -}}

{{/*
Create a default fully qualified app name.
We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec).
*/}}
{{- define "fullname" -}}
{{- $name := default .Chart.Name .Values.nameOverride -}}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" -}}
{{- end -}}

{{/*
Create a default fully qualified client name.
We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec).
*/}}
{{- define "client.fullname" -}}
{{- $name := default .Chart.Name .Values.nameOverride -}}
{{- printf "%s-%s-%s" .Release.Name $name .Values.client.name | trunc 63 | trimSuffix "-" -}}
{{- end -}}

{{/*
Create a default fully qualified data name.
We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec).
*/}}
{{- define "data.fullname" -}}
{{- $name := default .Chart.Name .Values.nameOverride -}}
{{- printf "%s-%s-%s" .Release.Name $name .Values.data.name | trunc 63 | trimSuffix "-" -}}
{{- end -}}

{{/*
Create a default fully qualified master name.
We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec).
*/}}
{{- define "master.fullname" -}}
{{- $name := default .Chart.Name .Values.nameOverride -}}
{{- printf "%s-%s-%s" .Release.Name $name .Values.master.name | trunc 63 | trimSuffix "-" -}}
{{- end -}}
@@ -1,30 +1,44 @@
apiVersion: extensions/v1beta1
apiVersion: apps/v1beta1
kind: Deployment
metadata:
name: "{{ printf "%s-client-%s" .Release.Name .Values.Name | trunc 24 }}"
labels:
heritage: {{.Release.Service | quote }}
release: {{.Release.Name | quote }}
chart: "{{.Chart.Name}}-{{.Chart.Version}}"
component: "{{.Release.Name}}-{{.Values.Component}}"
type: client
app: {{ template "name" . }}
chart: {{ .Chart.Name }}-{{ .Chart.Version }}
component: "{{ .Values.client.name }}"
heritage: {{ .Release.Service }}
release: {{ .Release.Name }}
name: {{ template "client.fullname" . }}
spec:
replicas: {{default 2 .Values.ClientReplicas }}
replicas: {{ .Values.client.replicas }}
template:
metadata:
labels:
heritage: {{.Release.Service | quote }}
release: {{.Release.Name | quote }}
chart: "{{.Chart.Name}}-{{.Chart.Version}}"
component: "{{.Release.Name}}-{{.Values.Component}}"
type: client
app: {{ template "name" . }}
component: "{{ .Values.client.name }}"
release: {{ .Release.Name }}
annotations:
# see https://www.elastic.co/guide/en/elasticsearch/reference/current/vm-max-map-count.html
# and https://www.elastic.co/guide/en/elasticsearch/reference/current/setup-configuration-memory.html#mlockall
pod.alpha.kubernetes.io/init-containers: '[
{
"name": "sysctl",
"image": "busybox",
"imagePullPolicy": "Always",
"command": ["sysctl", "-w", "vm.max_map_count=262144"],
"securityContext": {
"privileged": true
}
}
]'
spec:
serviceAccountName: "{{ printf "%s-%s" .Release.Name .Values.Name | trunc 24 }}"
serviceAccountName: {{ template "fullname" . }}
containers:
- name: elasticsearch
env:
- name: SERVICE
value: "{{ printf "%s-cluster-%s" .Release.Name .Values.Name | trunc 24 }}"
value: {{ template "master.fullname" . }}
- name: KUBERNETES_MASTER
value: kubernetes.default.svc.cluster.local
- name: KUBERNETES_NAMESPACE
valueFrom:
fieldRef:
@@ -33,30 +47,34 @@ spec:
value: "false"
- name: NODE_MASTER
value: "false"
- name: ES_HEAP_SIZE
value: "{{.Values.ClientHeapSize}}"
- name: PROCESSORS
valueFrom:
resourceFieldRef:
resource: limits.cpu
- name: ES_JAVA_OPTS
value: "-Djava.net.preferIPv4Stack=true -Xms{{ .Values.client.heapSize }} -Xmx{{ .Values.client.heapSize }}"
resources:
requests:
cpu: "{{.Values.ClientCpuRequests}}"
memory: "{{.Values.ClientMemoryRequests}}"
limits:
cpu: "{{.Values.ClientCpuLimits}}"
memory: "{{.Values.ClientMemoryLimits}}"
livenessProbe:
httpGet:
path: /
port: 9200
initialDelaySeconds: 30
timeoutSeconds: 1
{{ toYaml .Values.client.resources | indent 12 }}
readinessProbe:
httpGet:
path: /
path: /_cluster/health?local=true
port: 9200
timeoutSeconds: 5
image: "{{.Values.Image}}:{{.Values.ImageTag}}"
imagePullPolicy: "{{.Values.ImagePullPolicy}}"
initialDelaySeconds: 5
image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
imagePullPolicy: {{ default "" .Values.image.pullPolicy | quote }}
ports:
- containerPort: 9200
name: http
- containerPort: 9300
name: transport
volumeMounts:
- mountPath: /usr/share/elasticsearch/config/elasticsearch.yml
name: config
subPath: elasticsearch.yml
- mountPath: /usr/share/elasticsearch/config/logging.yml
name: config
subPath: logging.yml
volumes:
- name: config
configMap:
name: {{ template "fullname" . }}
19 changes: 19 additions & 0 deletions incubator/elasticsearch/templates/elasticsearch-client-svc.yaml
@@ -0,0 +1,19 @@
apiVersion: v1
kind: Service
metadata:
labels:
app: {{ template "name" . }}
chart: {{ .Chart.Name }}-{{ .Chart.Version }}
component: "{{ .Values.client.name }}"
heritage: {{ .Release.Service }}
release: {{ .Release.Name }}
name: {{ template "client.fullname" . }}
spec:
ports:
- port: 9200
targetPort: http
selector:
app: {{ template "name" . }}
component: "{{ .Values.client.name }}"
release: {{ .Release.Name }}
type: {{ .Values.client.serviceType }}
16 changes: 0 additions & 16 deletions incubator/elasticsearch/templates/elasticsearch-cluster-svc.yaml

This file was deleted.

