diff --git a/modules/cluster-logging-deploy-storage-considerations.adoc b/modules/cluster-logging-deploy-storage-considerations.adoc
index eda469cc9217..57a043aec1ea 100644
--- a/modules/cluster-logging-deploy-storage-considerations.adoc
+++ b/modules/cluster-logging-deploy-storage-considerations.adoc
@@ -5,9 +5,11 @@
 [id="cluster-logging-deploy-storage-considerations_{context}"]
 = Storage considerations for cluster logging and {product-title}
 
+////
+An Elasticsearch index is a collection of primary shards and their corresponding replica shards. This is how Elasticsearch implements high availability internally, so there is little need to use hardware-based mirroring RAID variants. RAID 0 can still be used to increase overall disk performance.
+////
 
-A persistent volume is required for each Elasticsearch deployment to have one data volume per data node. On {product-title} this is achieved using
-persistent volume claims.
+A persistent volume is required for each Elasticsearch deployment configuration. On {product-title}, this is achieved using persistent volume claims.
 
 [NOTE]
 ====
@@ -56,9 +58,7 @@ Baseline (256 characters per minute -> 15KB/min)
 
 |===
 
-Calculating total logging throughput and disk space required for your {product-title} cluster requires knowledge of your applications. For example, if one of your
-applications on average logs 10 lines-per-second, each 256 bytes-per-line,
-calculate per-application throughput and disk space as follows:
+Calculating the total logging throughput and disk space required for your {product-title} cluster requires knowledge of your applications. For example, if one of your applications logs an average of 10 lines-per-second, each 256 bytes-per-line, calculate per-application throughput and disk space as follows:
 
 ----
  (bytes-per-line * (lines-per-second) = 2560 bytes per app per second
@@ -69,19 +69,11 @@ calculate per-application throughput and disk space as follows:
 
 Fluentd ships any logs from *systemd journal* and */var/log/containers/* to Elasticsearch.
 
-Therefore, consider how much data you need in advance and that you are
-aggregating application log data. Some Elasticsearch users have found that it
-is necessary to keep absolute storage consumption around 50% and below 70% at all times. This
-helps to avoid Elasticsearch becoming unresponsive during large merge
-operations.
+Elasticsearch requires sufficient memory to perform large merge operations. If it does not have enough memory, it becomes unresponsive. To avoid this problem, evaluate how much application log data you need, and allocate approximately double that amount of free storage capacity.
 
-By default, at 85% Elasticsearch stops allocating new data to the node, at 90% Elasticsearch attempts to relocate
-existing shards from that node to other nodes if possible. But if no nodes have free capacity below 85%, Elasticsearch effectively rejects creating new indices
-and becomes RED.
+By default, when a node's storage is 85% full, Elasticsearch stops allocating new data to that node. At 90%, Elasticsearch attempts to relocate existing shards from that node to other nodes if possible. However, if no nodes have disk usage below 85%, Elasticsearch effectively rejects creating new indices and the cluster status becomes RED.
 
 [NOTE]
 ====
-These low and high watermark values are Elasticsearch defaults in the current release. You can modify these values,
-but you also must apply any modifications to the alerts also. The alerts are based
-on these defaults.
+These low and high watermark values are Elasticsearch defaults in the current release. You can modify these default values, but the alerts are based on the defaults and you cannot change the values that the alerts use.
 ====
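For reference, the per-application sizing arithmetic described in the revised paragraph can be worked through as a short script. This is a minimal sketch, not content from the module: the example figures (10 lines per second, 256 bytes per line) come from the text above, while the pod count and retention period are illustrative assumptions.

[source,python]
----
# Sketch of the per-application throughput and disk-space estimate discussed
# above. BYTES_PER_LINE and LINES_PER_SECOND use the example figures from the
# module; PODS_PER_NODE and RETENTION_DAYS are assumptions for illustration.
BYTES_PER_LINE = 256
LINES_PER_SECOND = 10
PODS_PER_NODE = 10        # assumption
RETENTION_DAYS = 7        # assumption

bytes_per_app_per_second = BYTES_PER_LINE * LINES_PER_SECOND           # 2560
bytes_per_node_per_second = bytes_per_app_per_second * PODS_PER_NODE   # 25600
bytes_per_node_per_day = bytes_per_node_per_second * 60 * 60 * 24

# Keep roughly double the expected data volume free so Elasticsearch has
# headroom for large merge operations, per the guidance above.
suggested_storage_gib = (bytes_per_node_per_day * RETENTION_DAYS * 2) / 1024**3

print(f"Per application: {bytes_per_app_per_second} bytes/second")
print(f"Per node:        {bytes_per_node_per_second} bytes/second")
print(f"Suggested storage per node for {RETENTION_DAYS} days: {suggested_storage_gib:.1f} GiB")
----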
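Similarly, a hedged sketch for confirming the watermark defaults referenced in the note: it queries the standard Elasticsearch `_cluster/settings` API with `include_defaults=true`. The endpoint URL is a placeholder, and authentication and TLS handling are omitted; in practice the logging Elasticsearch instance is typically reachable only from inside the cluster.

[source,python]
----
# Sketch: read the disk watermark settings to confirm the 85% (low) and
# 90% (high) defaults referenced in the note above.
# ES_URL is a placeholder; authentication and TLS verification are omitted.
import json
import urllib.request

ES_URL = "https://elasticsearch.example.com:9200"  # placeholder endpoint

url = (
    ES_URL
    + "/_cluster/settings?include_defaults=true"
    + "&filter_path=defaults.cluster.routing.allocation.disk.watermark"
)
with urllib.request.urlopen(url) as resp:
    settings = json.load(resp)

watermarks = settings["defaults"]["cluster"]["routing"]["allocation"]["disk"]["watermark"]
print(watermarks)  # expect entries such as "low": "85%" and "high": "90%"
----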