Merge pull request #2063 from thezackm/chore/vsphere-updates
chore: update vsphere quickstart
zstix authored Sep 28, 2023
2 parents 46a4897 + 41b34e7 commit e92a7c5
Showing 25 changed files with 2,010 additions and 1,321 deletions.
19 changes: 19 additions & 0 deletions alert-policies/vmware-vsphere/cluster-overall-status.yml
@@ -0,0 +1,19 @@
name: vSphere Cluster overallStatus = 'red'
description: |+
  This alert fires when a vSphere Cluster has an overall status = 'red' for longer than 5 minutes.
type: STATIC
nrql:
  query: "FROM VSphereClusterSample SELECT count(*) FACET datacenterName, displayName WHERE overallStatus = 'red'"
valueFunction: SINGLE_VALUE
terms:
  - priority: CRITICAL
    operator: ABOVE
    threshold: 0
    thresholdDuration: 300
    thresholdOccurrences: ALL
signal:
  aggregationDelay: 120
  aggregationMethod: EVENT_FLOW
  aggregationWindow: 60

violationTimeLimitSeconds: 259200
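The `terms` and `signal` values in the condition above can be read as "every 60-second aggregation window over a 300-second span must exceed the threshold". The Python sketch below illustrates that reading only; it is not how the alerting service is implemented, and the function name `breaches_for_full_duration` and its list-of-window-values input are hypothetical, introduced here for illustration.

```python
def breaches_for_full_duration(window_values, threshold,
                               threshold_duration=300, aggregation_window=60):
    """Return True if every window in the most recent threshold_duration
    seconds is strictly above threshold.

    Assumption: this mirrors operator ABOVE with thresholdOccurrences ALL.
    window_values is a hypothetical list of per-window query results,
    ordered oldest to newest.
    """
    # Number of aggregation windows that make up the threshold duration.
    needed = threshold_duration // aggregation_window
    if len(window_values) < needed:
        # Not enough data yet to cover the full duration.
        return False
    return all(value > threshold for value in window_values[-needed:])


# Five consecutive one-minute windows, each with at least one 'red' sample.
print(breaches_for_full_duration([1, 2, 1, 1, 3], threshold=0))
# One window with no 'red' samples interrupts the streak.
print(breaches_for_full_duration([1, 2, 0, 1, 3], threshold=0))
```

With the values used throughout this commit, 300 / 60 gives five windows, which matches the "longer than 5 minutes" wording in the descriptions.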
19 changes: 19 additions & 0 deletions alert-policies/vmware-vsphere/datacenter-overall-status.yml
@@ -0,0 +1,19 @@
name: vSphere Datacenter overallStatus = 'red'
description: |+
  This alert fires when a vSphere Datacenter has an overall status = 'red' for longer than 5 minutes.
type: STATIC
nrql:
  query: "FROM VSphereDatacenterSample SELECT count(*) FACET datacenterName WHERE overallStatus = 'red'"
valueFunction: SINGLE_VALUE
terms:
  - priority: CRITICAL
    operator: ABOVE
    threshold: 0
    thresholdDuration: 300
    thresholdOccurrences: ALL
signal:
  aggregationDelay: 120
  aggregationMethod: EVENT_FLOW
  aggregationWindow: 60

violationTimeLimitSeconds: 259200
19 changes: 19 additions & 0 deletions alert-policies/vmware-vsphere/datastore-accessible.yml
@@ -0,0 +1,19 @@
name: vSphere Datastore is not accessible
description: |+
  This alert fires when a vSphere Datastore is not accessible for longer than 5 minutes, indicating a loss of connectivity.
type: STATIC
nrql:
  query: "FROM VSphereDatastoreSample SELECT count(*) FACET datacenterName, name, displayName WHERE accessible = 'false'"
valueFunction: SINGLE_VALUE
terms:
  - priority: CRITICAL
    operator: ABOVE
    threshold: 0
    thresholdDuration: 300
    thresholdOccurrences: ALL
signal:
  aggregationDelay: 120
  aggregationMethod: EVENT_FLOW
  aggregationWindow: 60

violationTimeLimitSeconds: 259200
19 changes: 19 additions & 0 deletions alert-policies/vmware-vsphere/datastore-capacity-percent.yml
@@ -0,0 +1,19 @@
name: vSphere Datastore high Capacity Utilization
description: |+
  This alert fires when a vSphere Datastore has capacity utilization % > 90 for longer than 10 minutes.
type: STATIC
nrql:
  query: "FROM VSphereDatastoreSample SELECT ((max(capacity) - max(freeSpace)) / max(capacity)) * 100 FACET datacenterName, name, displayName"
valueFunction: SINGLE_VALUE
terms:
  - priority: CRITICAL
    operator: ABOVE
    threshold: 90
    thresholdDuration: 600
    thresholdOccurrences: ALL
signal:
  aggregationDelay: 120
  aggregationMethod: EVENT_FLOW
  aggregationWindow: 60

violationTimeLimitSeconds: 259200
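The utilization expression in the query above is plain arithmetic on two attributes, so it can be checked by hand. The Python sketch below reproduces that expression for a single datastore; the function name `datastore_used_percent` and the sample numbers are made up for illustration, and the unit of `capacity` and `freeSpace` does not matter as long as both use the same one.

```python
def datastore_used_percent(capacity, free_space):
    """Mirror ((capacity - freeSpace) / capacity) * 100 from the NRQL query.

    Hypothetical helper for illustration; inputs stand in for the
    max(capacity) and max(freeSpace) values of one aggregation window.
    """
    if capacity <= 0:
        # Guard against division by zero for an unreported or empty datastore.
        raise ValueError("capacity must be positive")
    return (capacity - free_space) / capacity * 100


used = datastore_used_percent(capacity=1600, free_space=100)
print(used)        # 93.75
print(used > 90)   # True: this window would count toward the CRITICAL term
```

The host and resource pool memory conditions in this commit use the same pattern with `mem.usage` and `mem.size`, without the subtraction step.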
19 changes: 19 additions & 0 deletions alert-policies/vmware-vsphere/datastore-overall-status.yml
@@ -0,0 +1,19 @@
name: vSphere Datastore overallStatus = 'red'
description: |+
  This alert fires when a vSphere Datastore has an overall status = 'red' for longer than 5 minutes.
type: STATIC
nrql:
  query: "FROM VSphereDatastoreSample SELECT count(*) FACET datacenterName, name, displayName WHERE overallStatus = 'red'"
valueFunction: SINGLE_VALUE
terms:
  - priority: CRITICAL
    operator: ABOVE
    threshold: 0
    thresholdDuration: 300
    thresholdOccurrences: ALL
signal:
  aggregationDelay: 120
  aggregationMethod: EVENT_FLOW
  aggregationWindow: 60

violationTimeLimitSeconds: 259200
19 changes: 19 additions & 0 deletions alert-policies/vmware-vsphere/host-connection-state.yml
@@ -0,0 +1,19 @@
name: vSphere Host connection lost
description: |+
  This alert fires when a vSphere Host is not responding to heartbeats for longer than 5 minutes.
type: STATIC
nrql:
  query: "FROM VSphereHostSample SELECT count(*) FACET datacenterName, clusterName, hypervisorHostname WHERE connectionState = 'notResponding'"
valueFunction: SINGLE_VALUE
terms:
  - priority: CRITICAL
    operator: ABOVE
    threshold: 0
    thresholdDuration: 300
    thresholdOccurrences: ALL
signal:
  aggregationDelay: 120
  aggregationMethod: EVENT_FLOW
  aggregationWindow: 60

violationTimeLimitSeconds: 259200
19 changes: 19 additions & 0 deletions alert-policies/vmware-vsphere/host-cpu-percent.yml
@@ -0,0 +1,19 @@
name: vSphere Host high CPU Utilization
description: |+
  This alert fires when a vSphere Host has a CPU utilization % > 90 for longer than 5 minutes.
type: STATIC
nrql:
  query: "FROM VSphereHostSample SELECT max(cpu.percent) FACET datacenterName, clusterName, hypervisorHostname"
valueFunction: SINGLE_VALUE
terms:
  - priority: CRITICAL
    operator: ABOVE
    threshold: 90
    thresholdDuration: 300
    thresholdOccurrences: ALL
signal:
  aggregationDelay: 120
  aggregationMethod: EVENT_FLOW
  aggregationWindow: 60

violationTimeLimitSeconds: 259200
19 changes: 19 additions & 0 deletions alert-policies/vmware-vsphere/host-memory-percent.yml
@@ -0,0 +1,19 @@
name: vSphere Host high Memory Utilization
description: |+
  This alert fires when a vSphere Host has memory utilization % > 90 for longer than 5 minutes.
type: STATIC
nrql:
  query: "FROM VSphereHostSample SELECT (max(mem.usage) / max(mem.size)) * 100 FACET datacenterName, clusterName, hypervisorHostname"
valueFunction: SINGLE_VALUE
terms:
  - priority: CRITICAL
    operator: ABOVE
    threshold: 90
    thresholdDuration: 300
    thresholdOccurrences: ALL
signal:
  aggregationDelay: 120
  aggregationMethod: EVENT_FLOW
  aggregationWindow: 60

violationTimeLimitSeconds: 259200
19 changes: 19 additions & 0 deletions alert-policies/vmware-vsphere/host-overall-status.yml
@@ -0,0 +1,19 @@
name: vSphere Host overallStatus = 'red'
description: |+
  This alert fires when a vSphere Host has an overall status = 'red' for longer than 5 minutes.
type: STATIC
nrql:
  query: "FROM VSphereHostSample SELECT count(*) FACET datacenterName, clusterName, hypervisorHostname WHERE overallStatus = 'red'"
valueFunction: SINGLE_VALUE
terms:
  - priority: CRITICAL
    operator: ABOVE
    threshold: 0
    thresholdDuration: 300
    thresholdOccurrences: ALL
signal:
  aggregationDelay: 120
  aggregationMethod: EVENT_FLOW
  aggregationWindow: 60

violationTimeLimitSeconds: 259200
19 changes: 19 additions & 0 deletions alert-policies/vmware-vsphere/resourcepool-cpu-percent.yml
@@ -0,0 +1,19 @@
name: vSphere Resource Pool high CPU Utilization
description: |+
  This alert fires when a vSphere Resource Pool has a CPU utilization % > 90 for longer than 5 minutes.
type: STATIC
nrql:
  query: "FROM VSphereResourcePoolSample SELECT (max(cpu.overallUsage) / max(cpu.totalMHz)) * 100 FACET datacenterName, clusterName, resourcePoolName"
valueFunction: SINGLE_VALUE
terms:
  - priority: CRITICAL
    operator: ABOVE
    threshold: 90
    thresholdDuration: 300
    thresholdOccurrences: ALL
signal:
  aggregationDelay: 120
  aggregationMethod: EVENT_FLOW
  aggregationWindow: 60

violationTimeLimitSeconds: 259200
19 changes: 19 additions & 0 deletions alert-policies/vmware-vsphere/resourcepool-memory-percent.yml
@@ -0,0 +1,19 @@
name: vSphere Resource Pool high Memory Utilization
description: |+
  This alert fires when a vSphere Resource Pool has memory utilization % > 90 for longer than 5 minutes.
type: STATIC
nrql:
  query: "FROM VSphereResourcePoolSample SELECT (max(mem.usage) / max(mem.size)) * 100 FACET datacenterName, clusterName, resourcePoolName"
valueFunction: SINGLE_VALUE
terms:
  - priority: CRITICAL
    operator: ABOVE
    threshold: 90
    thresholdDuration: 300
    thresholdOccurrences: ALL
signal:
  aggregationDelay: 120
  aggregationMethod: EVENT_FLOW
  aggregationWindow: 60

violationTimeLimitSeconds: 259200
19 changes: 19 additions & 0 deletions alert-policies/vmware-vsphere/resourcepool-overall-status.yml
@@ -0,0 +1,19 @@
name: vSphere Resource Pool overallStatus = 'red'
description: |+
  This alert fires when a vSphere Resource Pool has an overall status = 'red' for longer than 5 minutes.
type: STATIC
nrql:
  query: "FROM VSphereResourcePoolSample SELECT count(*) FACET datacenterName, clusterName, resourcePoolName WHERE overallStatus = 'red'"
valueFunction: SINGLE_VALUE
terms:
  - priority: CRITICAL
    operator: ABOVE
    threshold: 0
    thresholdDuration: 300
    thresholdOccurrences: ALL
signal:
  aggregationDelay: 120
  aggregationMethod: EVENT_FLOW
  aggregationWindow: 60

violationTimeLimitSeconds: 259200
19 changes: 19 additions & 0 deletions alert-policies/vmware-vsphere/vm-overall-status.yml
@@ -0,0 +1,19 @@
name: vSphere Virtual Machine overallStatus = 'red'
description: |+
  This alert fires when a vSphere Virtual Machine has an overall status = 'red' for longer than 5 minutes.
type: STATIC
nrql:
  query: "FROM VSphereVmSample SELECT count(*) FACET datacenterName, clusterName, displayName WHERE overallStatus = 'red'"
valueFunction: SINGLE_VALUE
terms:
  - priority: CRITICAL
    operator: ABOVE
    threshold: 0
    thresholdDuration: 300
    thresholdOccurrences: ALL
signal:
  aggregationDelay: 120
  aggregationMethod: EVENT_FLOW
  aggregationWindow: 60

violationTimeLimitSeconds: 259200