Merge pull request #2063 from thezackm/chore/vsphere-updates
chore: update vsphere quickstart
Showing 25 changed files with 2,010 additions and 1,321 deletions.
@@ -0,0 +1,19 @@
name: vSphere Cluster overallStatus = 'red'
description: |+
  This alert fires when a vSphere Cluster has an overall status = 'red' for longer than 5 minutes.
type: STATIC
nrql:
  query: "FROM VSphereClusterSample SELECT count(*) FACET datacenterName, displayName WHERE overallStatus = 'red'"
valueFunction: SINGLE_VALUE
terms:
  - priority: CRITICAL
    operator: ABOVE
    threshold: 0
    thresholdDuration: 300
    thresholdOccurrences: ALL
signal:
  aggregationDelay: 120
  aggregationMethod: EVENT_FLOW
  aggregationWindow: 60

violationTimeLimitSeconds: 259200
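
As a rough illustration of how the terms block above is meant to behave: with thresholdOccurrences: ALL, a thresholdDuration of 300 seconds, and an aggregationWindow of 60 seconds, the condition should open an incident only when each of the last five one-minute windows is above the threshold. The Python sketch below models that rule under this assumption. The function name and the sample values are hypothetical, and this is not New Relic's evaluation code.

```python
from typing import Sequence


def breaches_for_all(
    window_values: Sequence[float],
    threshold: float,
    threshold_duration_s: int = 300,
    aggregation_window_s: int = 60,
) -> bool:
    """Return True when every window in the trailing duration is above threshold.

    window_values holds one aggregated value per aggregation window, oldest first.
    """
    # Number of consecutive windows that must all violate (300 / 60 = 5 here).
    needed = threshold_duration_s // aggregation_window_s
    if needed <= 0 or len(window_values) < needed:
        return False
    return all(value > threshold for value in window_values[-needed:])


# Hypothetical per-minute counts of samples with overallStatus = 'red'.
steady_red = [0, 1, 1, 1, 1, 1]
flapping = [1, 1, 0, 1, 1]

print(breaches_for_all(steady_red, threshold=0))  # True
print(breaches_for_all(flapping, threshold=0))    # False
```

A single window at or below the threshold resets the streak, which is why a status that flaps between 'red' and another value does not satisfy the five-minute wording in the description.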
alert-policies/vmware-vsphere/datacenter-overall-status.yml (19 changes: 19 additions & 0 deletions)
@@ -0,0 +1,19 @@
name: vSphere Datacenter overallStatus = 'red'
description: |+
  This alert fires when a vSphere Datacenter has an overall status = 'red' for longer than 5 minutes.
type: STATIC
nrql:
  query: "FROM VSphereDatacenterSample SELECT count(*) FACET datacenterName WHERE overallStatus = 'red'"
valueFunction: SINGLE_VALUE
terms:
  - priority: CRITICAL
    operator: ABOVE
    threshold: 0
    thresholdDuration: 300
    thresholdOccurrences: ALL
signal:
  aggregationDelay: 120
  aggregationMethod: EVENT_FLOW
  aggregationWindow: 60

violationTimeLimitSeconds: 259200
@@ -0,0 +1,19 @@
name: vSphere Datastore is not accessible
description: |+
  This alert fires when a vSphere Datastore is not accessible for longer than 5 minutes, indicating a loss of connectivity.
type: STATIC
nrql:
  query: "FROM VSphereDatastoreSample SELECT count(*) FACET datacenterName, name, displayName WHERE accessible = 'false'"
valueFunction: SINGLE_VALUE
terms:
  - priority: CRITICAL
    operator: ABOVE
    threshold: 0
    thresholdDuration: 300
    thresholdOccurrences: ALL
signal:
  aggregationDelay: 120
  aggregationMethod: EVENT_FLOW
  aggregationWindow: 60

violationTimeLimitSeconds: 259200
alert-policies/vmware-vsphere/datastore-capacity-percent.yml (19 changes: 19 additions & 0 deletions)
@@ -0,0 +1,19 @@
name: vSphere Datastore high Capacity Utilization
description: |+
  This alert fires when a vSphere Datastore has capacity utilization % > 90 for longer than 10 minutes.
type: STATIC
nrql:
  query: "FROM VSphereDatastoreSample SELECT ((max(capacity) - max(freeSpace)) / max(capacity)) * 100 FACET datacenterName, name, displayName"
valueFunction: SINGLE_VALUE
terms:
  - priority: CRITICAL
    operator: ABOVE
    threshold: 90
    thresholdDuration: 600
    thresholdOccurrences: ALL
signal:
  aggregationDelay: 120
  aggregationMethod: EVENT_FLOW
  aggregationWindow: 60

violationTimeLimitSeconds: 259200
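
The utilization expression in the query above is (capacity - freeSpace) / capacity * 100, compared against a threshold of 90. A minimal Python sketch of the same arithmetic follows, which can help sanity-check the threshold against known datastore sizes. The function name and the sample figures are hypothetical, and the unit of capacity and freeSpace does not matter as long as both use the same one.

```python
def capacity_used_percent(capacity: float, free_space: float) -> float:
    """Mirror the NRQL expression ((capacity - freeSpace) / capacity) * 100."""
    if capacity <= 0:
        # Guard against division by zero for empty or unreported datastores.
        raise ValueError("capacity must be positive")
    return ((capacity - free_space) / capacity) * 100


# Hypothetical datastore: 4000 units of capacity, 1000 units free.
used = capacity_used_percent(capacity=4000, free_space=1000)
print(used)       # 75.0
print(used > 90)  # False, so this sample would not breach the threshold
```

Because the operator is ABOVE, a datastore sitting at exactly 90 percent does not breach; it has to exceed 90 in every window of the 600-second duration.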
alert-policies/vmware-vsphere/datastore-overall-status.yml (19 changes: 19 additions & 0 deletions)
@@ -0,0 +1,19 @@
name: vSphere Datastore overallStatus = 'red'
description: |+
  This alert fires when a vSphere Datastore has an overall status = 'red' for longer than 5 minutes.
type: STATIC
nrql:
  query: "FROM VSphereDatastoreSample SELECT count(*) FACET datacenterName, name, displayName WHERE overallStatus = 'red'"
valueFunction: SINGLE_VALUE
terms:
  - priority: CRITICAL
    operator: ABOVE
    threshold: 0
    thresholdDuration: 300
    thresholdOccurrences: ALL
signal:
  aggregationDelay: 120
  aggregationMethod: EVENT_FLOW
  aggregationWindow: 60

violationTimeLimitSeconds: 259200
@@ -0,0 +1,19 @@
name: vSphere Host connection lost
description: |+
  This alert fires when a vSphere Host is not responding to heartbeats for longer than 5 minutes.
type: STATIC
nrql:
  query: "FROM VSphereHostSample SELECT count(*) FACET datacenterName, clusterName, hypervisorHostname WHERE connectionState = 'notResponding'"
valueFunction: SINGLE_VALUE
terms:
  - priority: CRITICAL
    operator: ABOVE
    threshold: 0
    thresholdDuration: 300
    thresholdOccurrences: ALL
signal:
  aggregationDelay: 120
  aggregationMethod: EVENT_FLOW
  aggregationWindow: 60

violationTimeLimitSeconds: 259200
@@ -0,0 +1,19 @@
name: vSphere Host high CPU Utilization
description: |+
  This alert fires when a vSphere Host has a CPU utilization % > 90 for longer than 5 minutes.
type: STATIC
nrql:
  query: "FROM VSphereHostSample SELECT max(cpu.percent) FACET datacenterName, clusterName, hypervisorHostname"
valueFunction: SINGLE_VALUE
terms:
  - priority: CRITICAL
    operator: ABOVE
    threshold: 90
    thresholdDuration: 300
    thresholdOccurrences: ALL
signal:
  aggregationDelay: 120
  aggregationMethod: EVENT_FLOW
  aggregationWindow: 60

violationTimeLimitSeconds: 259200
@@ -0,0 +1,19 @@
name: vSphere Host high Memory Utilization
description: |+
  This alert fires when a vSphere Host has memory utilization % > 90 for longer than 5 minutes.
type: STATIC
nrql:
  query: "FROM VSphereHostSample SELECT (max(mem.usage) / max(mem.size)) * 100 FACET datacenterName, clusterName, hypervisorHostname"
valueFunction: SINGLE_VALUE
terms:
  - priority: CRITICAL
    operator: ABOVE
    threshold: 90
    thresholdDuration: 300
    thresholdOccurrences: ALL
signal:
  aggregationDelay: 120
  aggregationMethod: EVENT_FLOW
  aggregationWindow: 60

violationTimeLimitSeconds: 259200
@@ -0,0 +1,19 @@
name: vSphere Host overallStatus = 'red'
description: |+
  This alert fires when a vSphere Host has an overall status = 'red' for longer than 5 minutes.
type: STATIC
nrql:
  query: "FROM VSphereHostSample SELECT count(*) FACET datacenterName, clusterName, hypervisorHostname WHERE overallStatus = 'red'"
valueFunction: SINGLE_VALUE
terms:
  - priority: CRITICAL
    operator: ABOVE
    threshold: 0
    thresholdDuration: 300
    thresholdOccurrences: ALL
signal:
  aggregationDelay: 120
  aggregationMethod: EVENT_FLOW
  aggregationWindow: 60

violationTimeLimitSeconds: 259200
alert-policies/vmware-vsphere/resourcepool-cpu-percent.yml (19 changes: 19 additions & 0 deletions)
@@ -0,0 +1,19 @@
name: vSphere Resource Pool high CPU Utilization
description: |+
  This alert fires when a vSphere Resource Pool has a CPU utilization % > 90 for longer than 5 minutes.
type: STATIC
nrql:
  query: "FROM VSphereResourcePoolSample SELECT (max(cpu.overallUsage) / max(cpu.totalMHz)) * 100 FACET datacenterName, clusterName, resourcePoolName"
valueFunction: SINGLE_VALUE
terms:
  - priority: CRITICAL
    operator: ABOVE
    threshold: 90
    thresholdDuration: 300
    thresholdOccurrences: ALL
signal:
  aggregationDelay: 120
  aggregationMethod: EVENT_FLOW
  aggregationWindow: 60

violationTimeLimitSeconds: 259200
alert-policies/vmware-vsphere/resourcepool-memory-percent.yml (19 changes: 19 additions & 0 deletions)
@@ -0,0 +1,19 @@
name: vSphere Resource Pool high Memory Utilization
description: |+
  This alert fires when a vSphere Resource Pool has memory utilization % > 90 for longer than 5 minutes.
type: STATIC
nrql:
  query: "FROM VSphereResourcePoolSample SELECT (max(mem.usage) / max(mem.size)) * 100 FACET datacenterName, clusterName, resourcePoolName"
valueFunction: SINGLE_VALUE
terms:
  - priority: CRITICAL
    operator: ABOVE
    threshold: 90
    thresholdDuration: 300
    thresholdOccurrences: ALL
signal:
  aggregationDelay: 120
  aggregationMethod: EVENT_FLOW
  aggregationWindow: 60

violationTimeLimitSeconds: 259200
alert-policies/vmware-vsphere/resourcepool-overall-status.yml (19 changes: 19 additions & 0 deletions)
@@ -0,0 +1,19 @@
name: vSphere Resource Pool overallStatus = 'red'
description: |+
  This alert fires when a vSphere Resource Pool has an overall status = 'red' for longer than 5 minutes.
type: STATIC
nrql:
  query: "FROM VSphereResourcePoolSample SELECT count(*) FACET datacenterName, clusterName, resourcePoolName WHERE overallStatus = 'red'"
valueFunction: SINGLE_VALUE
terms:
  - priority: CRITICAL
    operator: ABOVE
    threshold: 0
    thresholdDuration: 300
    thresholdOccurrences: ALL
signal:
  aggregationDelay: 120
  aggregationMethod: EVENT_FLOW
  aggregationWindow: 60

violationTimeLimitSeconds: 259200
@@ -0,0 +1,19 @@
name: vSphere Virtual Machine overallStatus = 'red'
description: |+
  This alert fires when a vSphere Virtual Machine has an overall status = 'red' for longer than 5 minutes.
type: STATIC
nrql:
  query: "FROM VSphereVmSample SELECT count(*) FACET datacenterName, clusterName, displayName WHERE overallStatus = 'red'"
valueFunction: SINGLE_VALUE
terms:
  - priority: CRITICAL
    operator: ABOVE
    threshold: 0
    thresholdDuration: 300
    thresholdOccurrences: ALL
signal:
  aggregationDelay: 120
  aggregationMethod: EVENT_FLOW
  aggregationWindow: 60

violationTimeLimitSeconds: 259200