Merged
79 changes: 66 additions & 13 deletions deploy-manage/deploy/cloud-on-k8s/deploy-eck-on-gke-autopilot.md
@@ -19,25 +19,57 @@
1. It is recommended that each Kubernetes host’s virtual memory kernel settings be modified. Refer to [Virtual memory](virtual-memory.md).
2. It is recommended that {{es}} Pods have an `initContainer` that waits for virtual memory settings to be in place.
3. For Elastic Agent/Beats there are storage limitations to be considered.
4. Ensure you are using a node class that is applicable for your workload by adding a `cloud.google.com/compute-class` label in a `nodeSelector`. Refer to [GKE Autopilot documentation.](https://cloud.google.com/kubernetes-engine/docs/concepts/autopilot-compute-classes).
4. Ensure you are using a node class that is applicable for your workload by adding a `cloud.google.com/compute-class` label in a `nodeSelector`. Refer to [GKE Autopilot documentation](https://cloud.google.com/kubernetes-engine/docs/concepts/autopilot-compute-classes).

## Ensuring virtual memory kernel settings [k8s-autopilot-setting-virtual-memory]

If you are intending to run production workloads on GKE Autopilot then `vm.max_map_count` should be set. The recommended way to set this kernel setting on the Autopilot hosts is with a `Daemonset` as described in the [Virtual memory](virtual-memory.md) section. You must be running at least version 1.25 when on the `regular` channel or using the `rapid` channel, which currently runs version 1.27.
If you are intending to run production workloads on GKE Autopilot then `vm.max_map_count` should be set. The recommended way to set this kernel setting on the Autopilot hosts depends on your GKE version:

::::{warning}
Only use the provided `Daemonset` exactly as specified or it could be rejected by the Autopilot control plane.
::::
* **GKE 1.30.3-gke.1451000 or later**: [Use a custom ComputeClass](/deploy-manage/deploy/cloud-on-k8s/virtual-memory.md#k8s_using_a_computeclass_to_set_virtual_memory). Using a custom ComputeClass allows you to set a higher value for `vm.max_map_count`, avoiding the limitations of the `DaemonSet` approach.
* **Earlier versions**: [Use a DaemonSet](/deploy-manage/deploy/cloud-on-k8s/virtual-memory.md#k8s_using_a_daemonset_to_set_virtual_memory). You must be running at least version 1.25 when on the `regular` channel or using the `rapid` channel, which currently runs version 1.27.

::::{warning}
Use the provided `Daemonset` exactly as specified, with a `vm.max_map_count` value of `262144`, or it could be rejected by the Autopilot control plane.
::::
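
To decide which of the two approaches applies, you can compare your cluster's GKE version with the minimum listed above. The following is a minimal sketch, not an official procedure: `gke_version_at_least` is a hypothetical helper name, it assumes GNU `sort -V` is available, and the `node_version` value is a placeholder that you would normally read from the cluster.

```shell
# Hypothetical helper: succeeds when version $1 is at least version $2.
# Assumes GNU coreutils `sort -V` (version sort) is available.
gke_version_at_least() {
  current="${1#v}"   # strip a leading "v", as in "v1.30.3-gke.1451000"
  minimum="${2#v}"
  lowest="$(printf '%s\n%s\n' "$current" "$minimum" | sort -V | head -n 1)"
  [ "$lowest" = "$minimum" ]
}

# Placeholder value for illustration. On a real cluster you could read it with:
#   kubectl get nodes -o jsonpath='{.items[0].status.nodeInfo.kubeletVersion}'
node_version="v1.31.1-gke.1678000"

if gke_version_at_least "$node_version" "1.30.3-gke.1451000"; then
  echo "custom ComputeClass supported"
else
  echo "use the DaemonSet approach"
fi
```

The helper only compares version strings. It does not check the release channel, so the channel requirements for the `DaemonSet` approach still need to be verified separately.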

## Install the ECK Operator [k8s-autopilot-deploy-the-operator]

Refer to [*Install ECK*](install.md) for more information on installation options.

## Deploy an {{es}} cluster [k8s-autopilot-deploy-elasticsearch]

Create an {{es}} cluster. If you are using the `Daemonset` described in the [Virtual memory](virtual-memory.md) section to set `max_map_count` you can add the `initContainer` below is also used to ensure the setting is set prior to starting {{es}}.
Create an {{es}} cluster. The information that you need to provide in your spec depends on whether you've increased your virtual memory kernel setting, and the method that you used.

::::{tab-set}

:::{tab-item} Using a custom ComputeClass
If you used a custom ComputeClass to set `vm.max_map_count`, then you need to reference the custom ComputeClass as part of your template spec.

```shell subs=true
```yaml subs=true
cat <<EOF | kubectl apply -f -
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elasticsearch-sample
spec:
  version: {{version.stack}}
  nodeSets:
  - name: default
    count: 1
    podTemplate:
      spec:
        nodeSelector:
          cloud.google.com/compute-class: "elasticsearch"
EOF
```
:::


:::{tab-item} Using a DaemonSet

If you used a DaemonSet to set `max_map_count`, you can add the following `initContainer` to ensure the setting is set prior to starting {{es}}.

```yaml subs=true
cat <<EOF | kubectl apply -f -
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
@@ -48,23 +80,44 @@
  nodeSets:
  - name: default
    count: 1
    # Only uncomment the below section if you are not using the Daemonset to set max_map_count.
    # config:
    #   node.store.allow_mmap: false
    podTemplate:
      spec:
        # This init container ensures that the `max_map_count` setting has been applied before starting Elasticsearch.
        # This is not required, but is encouraged when using the previously mentioned Daemonset to set max_map_count.
        # This is not required, but is encouraged when using the Daemonset to set max_map_count.
        # Do not use this if setting config.node.store.allow_mmap: false
        initContainers:
        - name: max-map-count-check
          command: ['sh', '-c', "while true; do mmc=$(cat /proc/sys/vm/max_map_count); if [ ${mmc} -eq 262144 ]; then exit 0; fi; sleep 1; done"]
          command: ['sh', '-c', "while true; do mmc=$(cat /proc/sys/vm/max_map_count); if [ ${mmc} -eq 262144 ]; then exit 0; fi; sleep 1; done"]
EOF
```
:::
::::

### Deploy without custom virtual memory

If you didn't increase your virtual memory, then you need to set `node.store.allow_mmap` to `false`.

```yaml subs=true
cat <<EOF | kubectl apply -f -
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elasticsearch-sample
spec:
  version: {{version.stack}}
  nodeSets:
  - name: default
    count: 1
    config:
      node.store.allow_mmap: false
EOF
```
:::
::::

## Deploy a standalone Elastic Agent and/or Beats [k8s-autopilot-deploy-agent-beats]

When running Elastic Agent and Beats within GKE Autopilot there are storage constraints to be considered. No `HostPath` volumes are allowed, which the ECK operator defaults to when unset for both `Deployments` and `Daemonsets`. Instead use [Kubernetes ephemeral volumes](https://kubernetes.io/docs/concepts/storage/ephemeral-volumes).
When running Elastic Agent and Beats within GKE Autopilot there are storage constraints to be considered. No `HostPath` volumes are allowed, which the ECK operator defaults to when unset for both `Deployments` and `DaemonSets`. Instead use [Kubernetes ephemeral volumes](https://kubernetes.io/docs/concepts/storage/ephemeral-volumes).
Contributor


That sentence is a bit difficult to digest. The "when unset" part is not clear (what can be set that uses a hostPath? Probably the data directory of the Beat or the state directory of the Elastic Agent, but the "unset" wording feels odd).

Not sure if this sounds better or reads more easily. @pebrc, @shainaraskas, wdyt?

Suggested change
When running Elastic Agent and Beats within GKE Autopilot there are storage constraints to be considered. No `HostPath` volumes are allowed, which the ECK operator defaults to when unset for both `Deployments` and `DaemonSets`. Instead use [Kubernetes ephemeral volumes](https://kubernetes.io/docs/concepts/storage/ephemeral-volumes).
When running {{agent}} and {{beats}} on GKE Autopilot, storage constraints apply. GKE Autopilot does not allow `hostPath` volumes. By default, the ECK operator uses a `hostPath` volume for the data directory when no alternative volume is configured, whether the workload is deployed as a `Deployment` or a `DaemonSet`. To run successfully, you can use [Kubernetes ephemeral volumes](https://kubernetes.io/docs/concepts/storage/ephemeral-volumes) instead.

Collaborator Author


will leave this alone for now as it's outside of the scope of the original issue


Refer to [Recipes to deploy {{es}}, {{kib}}, Elastic Fleet Server and Elastic Agent and/or Beats within GKE Autopilot](https://github.com/elastic/cloud-on-k8s/tree/main/config/recipes/autopilot).

65 changes: 61 additions & 4 deletions deploy-manage/deploy/cloud-on-k8s/virtual-memory.md
@@ -12,10 +12,14 @@

By default, {{es}} uses memory mapping (`mmap`) to efficiently access indices. Default values for virtual address space on Linux distributions can be too low for {{es}} to work properly, which may result in out-of-memory exceptions. This is why [the quickstart example](/deploy-manage/deploy/cloud-on-k8s/elasticsearch-deployment-quickstart.md) disables `mmap` through the `node.store.allow_mmap: false` setting. For production workloads, we recommend that you increase the kernel setting `vm.max_map_count` to `1048576` and leave `node.store.allow_mmap` unset.

The kernel setting `vm.max_map_count=1048576` can be set on the host directly, by a dedicated init container which must be privileged, or a dedicated Daemonset.
The kernel setting `vm.max_map_count=1048576` can be set on the host directly, by a dedicated init container which must be privileged, a dedicated Daemonset, or a custom ComputeClass.

:::{important}
For {{es}} version 8.16 and later, set the `vm.max_map_count` kernel setting to `1048576`; for {{es}} version 8.15 and earlier, set `vm.max_map_count` to `262144`.
For {{es}} version 8.16 and later, set the `vm.max_map_count` kernel setting to `1048576`; for {{es}} version 8.15 and earlier, set `vm.max_map_count` to `262144`.

The exception is in GKE Autopilot environments. Your options depend on your GKE version:
* **GKE 1.30.3-gke.1451000 or later**: Use a custom `ComputeClass`, rather than a `DaemonSet`, to override the kernel setting.
* **Earlier versions**: `vm.max_map_count` must be set to `262144`.
:::

For more information, check the {{es}} documentation on [Virtual memory](/deploy-manage/deploy/self-managed/vm-max-map-count.md).
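
The version-dependent recommendation above can be expressed as a small lookup. This is only an illustrative sketch: `recommended_max_map_count` is a hypothetical helper name, it expects a plain `major.minor.patch` version string, and it does not account for the GKE Autopilot exception, where a DaemonSet is limited to `262144`.

```shell
# Hypothetical helper: prints the recommended vm.max_map_count
# for a given Elasticsearch version (for example "8.16.0").
# It ignores the GKE Autopilot exception described above.
recommended_max_map_count() {
  major="${1%%.*}"   # text before the first dot
  rest="${1#*.}"     # text after the first dot
  minor="${rest%%.*}"
  if [ "$major" -gt 8 ] || { [ "$major" -eq 8 ] && [ "$minor" -ge 16 ]; }; then
    echo 1048576
  else
    echo 262144
  fi
}

# On a host that you control directly, the value could then be applied with
# the standard sysctl tool (requires root):
#   sysctl -w vm.max_map_count="$(recommended_max_map_count 8.16.0)"
```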
@@ -91,13 +95,14 @@
securityContext:
privileged: true
runAsUser: 0
command: ['/usr/local/bin/bash', '-e', '-c', 'echo 262144 > /proc/sys/vm/max_map_count']
command: ['/usr/local/bin/bash', '-e', '-c', 'echo 1048576 > /proc/sys/vm/max_map_count'] <1>
containers:
- name: sleep
image: docker.io/bash:5.2.21
command: ['sleep', 'infinity']
EOF
```
1. In GKE Autopilot environments, `vm.max_map_count` must be set to 262144 when using a DaemonSet.

To run an {{es}} instance that waits for the kernel setting to be in place:

@@ -122,8 +127,60 @@
# Do not use this if setting config.node.store.allow_mmap: false
initContainers:
- name: max-map-count-check
command: ['sh', '-c', "while true; do mmc=$(cat /proc/sys/vm/max_map_count); if [ ${mmc} -eq 262144 ]; then exit 0; fi; sleep 1; done"]
command: ['sh', '-c', "while true; do mmc=$(cat /proc/sys/vm/max_map_count); if [ ${mmc} -eq 262144 ]; then exit 0; fi; sleep 1; done"] <1>
EOF
```
1. In GKE Autopilot environments, `vm.max_map_count` must be set to 262144 when using a DaemonSet.
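
The one-line loop in the init container is dense, so here is a more readable sketch of the same check. It is not the manifest's actual command: `wait_for_max_map_count` is a hypothetical helper, and the retry limit and the overridable file path are additions for illustration (the init container itself loops until the value matches). The expected value passed in should be the same value that your DaemonSet writes.

```shell
# Hypothetical helper mirroring the init container's wait loop.
#   $1: expected value of vm.max_map_count
#   $2: file to read (defaults to the real sysctl path)
#   $3: maximum number of one-second retries (an addition for this sketch)
wait_for_max_map_count() {
  expected="$1"
  sysctl_file="${2:-/proc/sys/vm/max_map_count}"
  max_retries="${3:-60}"
  tries=0
  current=""
  while [ "$tries" -lt "$max_retries" ]; do
    current="$(cat "$sysctl_file" 2>/dev/null)"
    if [ "$current" = "$expected" ]; then
      return 0
    fi
    tries=$((tries + 1))
    sleep 1
  done
  echo "vm.max_map_count is '${current}', expected '${expected}'" >&2
  return 1
}

# Example call, waiting up to 60 seconds for the DaemonSet to apply the value:
#   wait_for_max_map_count 262144
```

Because the check is an exact match, a mismatch between the value written by the DaemonSet and the value expected here would make the wait fail or, in the original unbounded loop, never finish.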


## Using a custom ComputeClass to set virtual memory [k8s_using_a_computeclass_to_set_virtual_memory]
```{applies_to}
deployment:
eck: ga 3.2+
```

If you're using GKE to run ECK, then you can use a [custom ComputeClass](https://docs.cloud.google.com/kubernetes-engine/docs/concepts/about-custom-compute-classes), rather than a DaemonSet, to increase the `vm.max_map_count` setting. On [GKE Autopilot](/deploy-manage/deploy/cloud-on-k8s/deploy-eck-on-gke-autopilot.md) this allows you to set a higher value, which is not possible with a DaemonSet.

1. Create a ComputeClass that changes the host kernel setting on all nodes:

```yaml
cat <<EOF | kubectl apply -f -
apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: elasticsearch
spec:
  whenUnsatisfiable: "DoNotScaleUp" <1>
  nodePoolAutoCreation:
    enabled: true
  priorityDefaults: <2>
    nodeSystemConfig:
      linuxNodeConfig:
        sysctls:
          vm.max_map_count: 1048576
  priorities:
  - machineFamily: n2
EOF
```
1. `DoNotScaleUp` is the default value starting with GKE 1.33.
2. `priorityDefaults` is available only in GKE 1.32.1-gke.1729000 and later.

2. Create your {{es}} instance using the custom ComputeClass:

```yaml subs=true
cat <<'EOF' | kubectl apply -f -
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elasticsearch
spec:
  version: {{version.stack}}
  nodeSets:
  - name: default
    count: 1
    podTemplate:
      spec:
        nodeSelector:
          cloud.google.com/compute-class: "elasticsearch"
EOF
```