cloud-platform-terraform-opensearch-cloudwatch-alarm

This Terraform module creates CloudWatch alarms for an OpenSearch domain on the Cloud Platform.

Usage

module "opensearch_cloudwatch_alarm" {
  source              = "github.com/ministryofjustice/cloud-platform-terraform-opensearch-cloudwatch-alarm?ref=version" # use the latest release

  alarm_name_prefix   = local.<os_domain_name>
  domain_name         = local.<os_domain_name>
  sns_topic           = module.baselines.slack_sns_topic
  min_available_nodes = aws_opensearch_domain.<os_domain_name>.cluster_config[0].instance_count
  tags                = local.logs_tags
}
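
Any of the inputs listed under Inputs can be overridden in the same module block. The following is a minimal sketch, assuming a domain without dedicated master nodes whose cluster status briefly turns yellow during routine operations. The local `os_domain_name` and its value `example-domain` are hypothetical placeholders, and `module.baselines.slack_sns_topic` is reused from the usage example above; adjust all of them to your environment.

```hcl
locals {
  # Hypothetical domain name, for illustration only.
  os_domain_name = "example-domain"
}

module "opensearch_cloudwatch_alarm" {
  source = "github.com/ministryofjustice/cloud-platform-terraform-opensearch-cloudwatch-alarm?ref=version" # use the latest release

  alarm_name_prefix = local.os_domain_name
  domain_name       = local.os_domain_name
  sns_topic         = module.baselines.slack_sns_topic

  # The master-node alarms only make sense when dedicated masters are enabled,
  # so they are switched off here.
  monitor_master_cpu_utilization_too_high     = false
  monitor_master_jvm_memory_pressure_too_high = false
  monitor_unreachable_master_node             = false

  # Require three consecutive 60-second periods of yellow status before
  # alerting, instead of the default of one, to reduce noise.
  alarm_cluster_status_is_yellow_periods = 3
}
```

Leaving `min_available_nodes` at its default of `0` means no minimum node count is set, since the input is described as "set to non-zero to enable".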

Metric name	Statistic	Period (seconds)	ComparisonOperator	Threshold	EvaluationPeriods
ClusterStatus.red	Maximum	60	GreaterThanOrEqualToThreshold	1	1
ClusterStatus.yellow	Maximum	60	GreaterThanOrEqualToThreshold	1	1
FreeStorageSpace	Minimum	60	LessThanOrEqualToThreshold	20480	1
ClusterIndexWritesBlocked	Maximum	300	GreaterThanOrEqualToThreshold	1	1
Nodes	Minimum	86400	LessThanThreshold	1	1
AutomatedSnapshotFailure	Maximum	60	GreaterThanOrEqualToThreshold	1	1
CPUUtilization	Maximum	900	GreaterThanOrEqualToThreshold	80	3
JVMMemoryPressure	Maximum	60	GreaterThanOrEqualToThreshold	95	3
MasterCPUUtilization	Maximum	900	GreaterThanOrEqualToThreshold	50	3
MasterJVMMemoryPressure	Maximum	60	GreaterThanOrEqualToThreshold	95	3
KMSKeyError	Maximum	60	GreaterThanOrEqualToThreshold	1	1
KMSKeyInaccessible	Maximum	60	GreaterThanOrEqualToThreshold	1	1
Shards.active	Maximum	60	GreaterThanOrEqualToThreshold	30000	1
MasterReachableFromNode	Maximum	86400	LessThanThreshold	1	1
ThreadpoolWriteQueue	Average	60	GreaterThanOrEqualToThreshold	100	3
ThreadpoolSearchQueue	Average	60	GreaterThanOrEqualToThreshold	500	1
ThreadpoolSearchQueue	Maximum	60	GreaterThanOrEqualToThreshold	5000	1
ThreadpoolWriteRejected	Maximum	60	GreaterThanOrEqualToThreshold	1	1
ThreadpoolSearchRejected	Maximum	60	GreaterThanOrEqualToThreshold	1	1
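
To make the table concrete, the sketch below shows how its first row (`ClusterStatus.red`) could be expressed as an `aws_cloudwatch_metric_alarm` resource. This is an illustration, not the module's actual source: the variable names `domain_name` and `sns_topic_arn` and the alarm name format are assumptions. The `AWS/ES` namespace and the `DomainName` and `ClientId` dimensions are the ones Amazon OpenSearch Service publishes its domain metrics under.

```hcl
# Hypothetical inputs for this standalone sketch.
variable "domain_name" {
  type = string
}

variable "sns_topic_arn" {
  type = string
}

# The account ID is needed for the ClientId dimension.
data "aws_caller_identity" "default" {}

resource "aws_cloudwatch_metric_alarm" "cluster_status_is_red" {
  alarm_name          = "${var.domain_name}-ClusterStatusIsRed" # assumed naming scheme
  namespace           = "AWS/ES"
  metric_name         = "ClusterStatus.red"
  statistic           = "Maximum"
  period              = 60 # seconds
  comparison_operator = "GreaterThanOrEqualToThreshold"
  threshold           = 1
  evaluation_periods  = 1
  alarm_actions       = [var.sns_topic_arn]

  dimensions = {
    DomainName = var.domain_name
    ClientId   = data.aws_caller_identity.default.account_id
  }
}
```

Each remaining row follows the same shape, with the metric name, statistic, period, comparison operator, threshold and evaluation periods swapped for the values in that row.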

Requirements

Name	Version
terraform	>= 1.2.5

Providers

Name	Version
aws	n/a

Modules

No modules.

Resources

Name	Type
aws_cloudwatch_metric_alarm.automated_snapshot_failure	resource
aws_cloudwatch_metric_alarm.cluster_index_writes_blocked	resource
aws_cloudwatch_metric_alarm.cluster_status_is_red	resource
aws_cloudwatch_metric_alarm.cluster_status_is_yellow	resource
aws_cloudwatch_metric_alarm.cpu_utilization_too_high	resource
aws_cloudwatch_metric_alarm.free_storage_space_too_low	resource
aws_cloudwatch_metric_alarm.free_storage_space_total_too_low	resource
aws_cloudwatch_metric_alarm.insufficient_available_nodes	resource
aws_cloudwatch_metric_alarm.jvm_memory_pressure_too_high	resource
aws_cloudwatch_metric_alarm.kms_key_error	resource
aws_cloudwatch_metric_alarm.kms_key_inaccessible	resource
aws_cloudwatch_metric_alarm.master_cpu_utilization_too_high	resource
aws_cloudwatch_metric_alarm.master_jvm_memory_pressure_too_high	resource
aws_cloudwatch_metric_alarm.shards_active_too_high	resource
aws_cloudwatch_metric_alarm.threadpool_search_queue_average	resource
aws_cloudwatch_metric_alarm.threadpool_search_queue_max	resource
aws_cloudwatch_metric_alarm.threadpool_search_rejected	resource
aws_cloudwatch_metric_alarm.threadpool_write_queue_too_high	resource
aws_cloudwatch_metric_alarm.threadpool_write_rejected	resource
aws_cloudwatch_metric_alarm.unreachable_master_node	resource
aws_caller_identity.default	data source

Inputs

Name	Description	Type	Default	Required
alarm_automated_snapshot_failure_period	The period of the automated snapshot failure. The statistics should be applied in seconds	`number`	`60`	no
alarm_automated_snapshot_failure_periods	The number of periods to alert that automatic snapshots failed. Default: 1, raise this to be less noisy, as this can occur often for only 1 period	`number`	`1`	no
alarm_cluster_index_writes_blocked_period	The period of the cluster index writes being blocked. The statistics should be applied in seconds	`number`	`300`	no
alarm_cluster_index_writes_blocked_periods	The number of periods to alert that cluster index writes are blocked. Default: 1, raise this to be less noisy, as this can occur often for only 1 period	`number`	`1`	no
alarm_cluster_status_is_red_period	The period of the cluster status is in red. The statistics should be applied in seconds	`number`	`60`	no
alarm_cluster_status_is_red_periods	The number of periods to alert that cluster status is red. Default: 1, raise this to be less noisy, as this can occur often for only 1 period	`number`	`1`	no
alarm_cluster_status_is_yellow_period	The period of the cluster status is in yellow. The statistics should be applied in seconds	`number`	`60`	no
alarm_cluster_status_is_yellow_periods	The number of periods to alert that cluster status is yellow. Default: 1, raise this to be less noisy, as this can occur often for only 1 period	`number`	`1`	no
alarm_cpu_utilization_too_high_period	The period of the CPU utilization is too high. The statistics should be applied in seconds	`number`	`900`	no
alarm_cpu_utilization_too_high_periods	The number of periods to alert that CPU usage is too high. Default: 3, raise this to be less noisy, as this can occur often for only 1 period	`number`	`3`	no
alarm_free_storage_space_too_low_period	The period of the per-node free storage is too low. The statistics should be applied in seconds	`number`	`60`	no
alarm_free_storage_space_too_low_periods	The number of periods to alert that the per-node free storage space is too low. Default: 1, raise this to be less noisy, as this can occur often for only 1 period	`number`	`1`	no
alarm_free_storage_space_total_too_low_period	The period of the total cluster free storage is too low. The statistics should be applied in seconds	`number`	`60`	no
alarm_free_storage_space_total_too_low_periods	The number of periods to alert that total cluster free storage space is too low. Default: 1, raise this to be less noisy, as this can occur often for only 1 period	`number`	`1`	no
alarm_jvm_memory_pressure_too_high_period	The period of the JVM memory pressure is too high. The statistics should be applied in seconds	`number`	`900`	no
alarm_jvm_memory_pressure_too_high_periods	The number of periods which it must be in the alarmed state to alert	`number`	`3`	no
alarm_kms_period	The period of the KMS-related metrics. The statistics should be applied in seconds	`number`	`60`	no
alarm_kms_periods	The number of periods to alert that kms has failed. Default: 1, raise this to be less noisy, as this can occur often for only 1 period	`number`	`1`	no
alarm_master_cpu_utilization_too_high_period	The period of the CPU utilization of master nodes is too high. The statistics should be applied in seconds	`number`	`900`	no
alarm_master_cpu_utilization_too_high_periods	The number of periods to alert that masters CPU usage is too high. Default: 3, raise this to be less noisy, as this can occur often for only 1 period	`number`	`3`	no
alarm_master_jvm_memory_pressure_too_high_period	The period of the JVM memory pressure of master nodes is too high. The statistics should be applied in seconds	`number`	`900`	no
alarm_master_jvm_memory_pressure_too_high_periods	The number of periods which it must be in the alarmed state to alert	`number`	`3`	no
alarm_min_available_nodes_period	The period of the minimum available nodes. The statistics should be applied in seconds	`number`	`86400`	no
alarm_min_available_nodes_periods	The number of periods to alert that minimum number of available nodes dropped below a threshold. Default: 1, raise this to be less noisy, as this can occur often for only 1 period	`number`	`1`	no
alarm_name_postfix	Alarm name suffix, used in the naming of alarms created	`string`	`""`	no
alarm_name_prefix	Alarm name prefix, used in the naming of alarms created	`string`	`""`	no
alarm_shard_active_number_too_high_period	The period of the active shard number is too high. The statistics should be applied in seconds	`number`	`60`	no
alarm_shard_active_number_too_high_periods	The number of periods to alert that active shard number is too high. Default: 1, raise this to be less noisy, as this can occur often for only 1 period	`number`	`1`	no
alarm_threadpool_search_queue_too_high_period	The period of the threadpool search queue is too high. The statistics should be applied in seconds	`number`	`60`	no
alarm_threadpool_search_queue_too_high_periods	The number of periods to alert that threadpool search queue is too high. Default: 1, raise this to be less noisy, as this can occur often for only 1 period	`number`	`1`	no
alarm_threadpool_search_rejected_period	The period of the threadpool search queue rejected is increasing. The statistics should be applied in seconds	`number`	`60`	no
alarm_threadpool_search_rejected_periods	The number of periods to alert that threadpool search queue rejected is increasing. Default: 1, raise this to be less noisy, as this can occur often for only 1 period	`number`	`1`	no
alarm_threadpool_write_queue_too_high_period	The period of the threadpool write queue is too high. The statistics should be applied in seconds	`number`	`60`	no
alarm_threadpool_write_queue_too_high_periods	The number of periods to alert that threadpool write queue is too high. Default: 3, raise this to be less noisy, as this can occur often for only 1 period	`number`	`3`	no
alarm_threadpool_write_rejected_period	The period of the threadpool write queue rejected is increasing. The statistics should be applied in seconds	`number`	`60`	no
alarm_threadpool_write_rejected_periods	The number of periods to alert that threadpool write queue rejected is increasing. Default: 1, raise this to be less noisy, as this can occur often for only 1 period	`number`	`1`	no
alarm_unreachable_master_node_period	The period of the master node is unreachable. The statistics should be applied in seconds	`number`	`86400`	no
alarm_unreachable_master_node_periods	The number of periods to alert that master node is unreachable. Default: 1, raise this to be less noisy, as this can occur often for only 1 period	`number`	`1`	no
cpu_utilization_threshold	The maximum percentage of CPU utilization	`number`	`80`	no
domain_name	The OpenSearch domain name you want to monitor	`string`	n/a	yes
free_storage_space_threshold	The minimum amount of available storage space in megabytes. This is per-node.	`number`	`20480`	no
free_storage_space_total_threshold	The minimum amount of available storage space in megabytes aggregated across your cluster (for multi-node). This is an aggregate, typically use (free_storage_space_threshold * min_available_nodes)	`number`	`20480`	no
jvm_memory_pressure_threshold	The maximum percentage of the Java heap used for all data nodes in the cluster	`number`	`80`	no
master_cpu_utilization_threshold	The maximum percentage of CPU utilization of master nodes	`number`	`80`	no
master_jvm_memory_pressure_threshold	The maximum percentage of the Java heap used for master nodes in the cluster	`number`	`80`	no
min_available_nodes	The minimum available (reachable) nodes to have, set to non-zero to enable	`number`	`0`	no
monitor_automated_snapshot_failure	Enable monitoring of automated snapshot failure	`bool`	`true`	no
monitor_cluster_index_writes_blocked	Enable monitoring of cluster index writes being blocked	`bool`	`true`	no
monitor_cluster_status_is_red	Enable monitoring of cluster status is in red	`bool`	`true`	no
monitor_cluster_status_is_yellow	Enable monitoring of cluster status is in yellow	`bool`	`true`	no
monitor_cpu_utilization_too_high	Enable monitoring of CPU utilization is too high	`bool`	`true`	no
monitor_free_storage_space_too_low	Enable monitoring of cluster per-node free storage is too low	`bool`	`true`	no
monitor_free_storage_space_total_too_low	Enable monitoring of cluster total free storage is too low. Disabled by default, if you set this you must set free_storage_space_total_threshold also	`bool`	`false`	no
monitor_jvm_memory_pressure_too_high	Enable monitoring of JVM memory pressure is too high	`bool`	`true`	no
monitor_kms	Enable monitoring of KMS-related metrics. Only enable this when using KMS with OpenSearch	`bool`	`true`	no
monitor_master_cpu_utilization_too_high	Enable monitoring of CPU utilization of master nodes is too high. Only enable this when dedicated master is enabled	`bool`	`true`	no
monitor_master_jvm_memory_pressure_too_high	Enable monitoring of JVM memory pressure of master nodes is too high. Only enable this when dedicated master is enabled	`bool`	`true`	no
monitor_min_available_nodes	Enable monitoring of minimum available nodes	`bool`	`true`	no
monitor_shard	Enable monitoring of the number of active shards is too high	`bool`	`true`	no
monitor_threadpool_search_queue	Enable monitoring of threadpool search queue number is too high	`bool`	`true`	no
monitor_threadpool_search_rejected	Enable monitoring of threadpool search queue rejected number is increasing	`bool`	`true`	no
monitor_threadpool_write_queue	Enable monitoring of threadpool write queue number is too high.	`bool`	`true`	no
monitor_threadpool_write_rejected	Enable monitoring of threadpool write queue rejected number is increasing	`bool`	`true`	no
monitor_unreachable_master_node	Enable monitoring that master nodes are running and reachable. Only enable this when dedicated master is enabled	`bool`	`true`	no
shard_active_number_threshold	The maximum number of active primary and replica shards	`number`	`30000`	no
sns_topic	SNS topic you want to specify. If left empty, it will use a prefix with a timestamp appended	`string`	`""`	no
tags	A map of tags to add to all resources	`map(string)`	`{}`	no
threadpool_search_queue_average_threshold	The average number of cluster searching concurrency	`number`	`500`	no
threadpool_search_queue_max_threshold	The maximum number of cluster searching concurrency	`number`	`5000`	no
threadpool_search_rejected_threshold	The number of cluster threadpool search rejected threshold. Value 1 means it is increasing	`number`	`1`	no
threadpool_write_queue_threshold	The maximum number of cluster indexing concurrency	`number`	`100`	no
threadpool_write_rejected_threshold	The number of cluster threadpool write rejected threshold. Value 1 means it is increasing	`number`	`1`	no
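
The total free storage alarm is disabled by default, and its threshold is an aggregate across the cluster. Following the guidance in the `free_storage_space_total_threshold` description, a minimal sketch of enabling it is shown below. The three-node count, the local names and `example-domain` are hypothetical; the per-node value is the module's documented default.

```hcl
locals {
  node_count       = 3     # hypothetical cluster size
  per_node_free_mb = 20480 # default free_storage_space_threshold, in megabytes
}

module "opensearch_cloudwatch_alarm" {
  source = "github.com/ministryofjustice/cloud-platform-terraform-opensearch-cloudwatch-alarm?ref=version" # use the latest release

  domain_name         = "example-domain" # hypothetical
  min_available_nodes = local.node_count

  # Per-node alarm threshold (already the default, shown for clarity).
  free_storage_space_threshold = local.per_node_free_mb

  # Cluster-wide alarm: must be enabled explicitly and given its own threshold.
  monitor_free_storage_space_total_too_low = true
  free_storage_space_total_threshold       = local.per_node_free_mb * local.node_count # 61440 MB
}
```

Deriving the total from the per-node value keeps the two thresholds consistent if the node count changes later.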

Outputs

No outputs.

Reading Material

Cloud Platform user guide
