-
Notifications
You must be signed in to change notification settings - Fork 301
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge branch 'release' into NR-172737-migrate-circle-ci
- Loading branch information
Showing
1,469 changed files
with
123,832 additions
and
5,505 deletions.
There are no files selected for viewing
Validating CODEOWNERS rules …
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Large diffs are not rendered by default.
Oops, something went wrong.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
31 changes: 31 additions & 0 deletions
31
alert-policies/active-directory/active-directory-replication-failures.yml
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,31 @@ | ||
name: Active Directory Replication Failures | ||
description: |+ | ||
This alert is triggered when the Attempt timestamp != the Success timestamp, indicating a failure in replication between domain contollers. | ||
type: STATIC | ||
nrql: | ||
query: "FROM activeDirectoryReplicationPartners SELECT count(*) FACET server, partner WHERE lastReplicationSuccess != lastReplicationAttempt" | ||
|
||
valueFunction: SINGLE_VALUE | ||
terms: | ||
- priority: CRITICAL | ||
operator: ABOVE | ||
threshold: 0 | ||
thresholdDuration: 120 | ||
thresholdOccurrences: ALL | ||
|
||
expiration: | ||
closeViolationsOnExpiration: false | ||
openViolationOnExpiration: false | ||
expirationDuration: null | ||
|
||
signal: | ||
aggregationDelay: 120 | ||
aggregationMethod: EVENT_FLOW | ||
aggregationTimer: null | ||
aggregationWindow: 60 | ||
fillOption: NONE | ||
fillValue: null | ||
slideBy: null | ||
|
||
violationTimeLimitSeconds: 86400 |
32 changes: 32 additions & 0 deletions
32
alert-policies/active-directory/active-directory-windows-services.yml
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
name: Active Directory Windows Services | ||
description: |+ | ||
This alert is triggered when any of the targeted Windows Services are in a state other than "running". | ||
The scope of this alert is Windows Services using the 'label.primary_app = active_directory' decoration. | ||
type: STATIC | ||
nrql: | ||
query: "FROM Metric SELECT filter(count(*), WHERE state != 'running') FACET hostname, entity.name WHERE metricName = 'windows_service_state' AND label.primary_app = 'active_directory'" | ||
|
||
valueFunction: SINGLE_VALUE | ||
terms: | ||
- priority: CRITICAL | ||
operator: ABOVE | ||
threshold: 0 | ||
thresholdDuration: 300 | ||
thresholdOccurrences: ALL | ||
|
||
expiration: | ||
closeViolationsOnExpiration: false | ||
openViolationOnExpiration: false | ||
expirationDuration: null | ||
|
||
signal: | ||
aggregationDelay: 120 | ||
aggregationMethod: EVENT_FLOW | ||
aggregationTimer: null | ||
aggregationWindow: 60 | ||
fillOption: NONE | ||
fillValue: null | ||
slideBy: null | ||
|
||
violationTimeLimitSeconds: 86400 |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
35 changes: 35 additions & 0 deletions
35
alert-policies/adobe-commerce-business-insights/5xxErrors.yml
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
name: 5xx Server Errors | ||
|
||
description: |+ | ||
This alert is triggered if the customer faces 5xx server errors more than 5 times in 5 minutes. | ||
type: STATIC | ||
nrql: | ||
query: "SELECT count(*) as '5xx Server Errors' from Transaction WHERE httpResponseCode LIKE '5%'" | ||
|
||
# Function used to aggregate the NRQL query value(s) for comparison to the terms.threshold (Default: SINGLE_VALUE) | ||
valueFunction: SINGLE_VALUE | ||
|
||
# List of Critical and Warning thresholds for the condition | ||
terms: | ||
- priority: CRITICAL | ||
# Operator used to compare against the threshold. | ||
operator: ABOVE | ||
# Value that triggers a violation | ||
threshold: 10 | ||
# Time in seconds; 120 - 3600 | ||
thresholdDuration: 300 | ||
# How many data points must be in violation for the duration | ||
thresholdOccurrences: ALL | ||
- priority: WARNING | ||
# Operator used to compare against the threshold. | ||
operator: ABOVE | ||
# Value that triggers a violation | ||
threshold: 5 | ||
# Time in seconds; 120 - 3600 | ||
thresholdDuration: 300 | ||
# How many data points must be in violation for the duration | ||
thresholdOccurrences: ALL | ||
|
||
# Duration after which a violation automatically closes | ||
# Time in seconds; 300 - 2592000 (Default: 86400 [1 day]) | ||
violationTimeLimitSeconds: 86400 |
35 changes: 35 additions & 0 deletions
35
alert-policies/adobe-commerce-business-insights/cpuUsage.yml
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
name: CPU Usage (%) | ||
|
||
description: |+ | ||
This alert is triggered if CPU usage exceeds 90% for 5 minutes. | ||
type: STATIC | ||
nrql: | ||
query: "SELECT latest(host.cpuPercent) AS 'CPU Used %' FROM Metric" | ||
|
||
# Function used to aggregate the NRQL query value(s) for comparison to the terms.threshold (Default: SINGLE_VALUE) | ||
valueFunction: SINGLE_VALUE | ||
|
||
# List of Critical and Warning thresholds for the condition | ||
terms: | ||
- priority: CRITICAL | ||
# Operator used to compare against the threshold. | ||
operator: ABOVE | ||
# Value that triggers a violation | ||
threshold: 90 | ||
# Time in seconds; 120 - 3600 | ||
thresholdDuration: 300 | ||
# How many data points must be in violation for the duration | ||
thresholdOccurrences: ALL | ||
- priority: WARNING | ||
# Operator used to compare against the threshold. | ||
operator: ABOVE | ||
# Value that triggers a violation | ||
threshold: 80 | ||
# Time in seconds; 120 - 3600 | ||
thresholdDuration: 300 | ||
# How many data points must be in violation for the duration | ||
thresholdOccurrences: ALL | ||
|
||
# Duration after which a violation automatically closes | ||
# Time in seconds; 300 - 2592000 (Default: 86400 [1 day]) | ||
violationTimeLimitSeconds: 86400 |
35 changes: 35 additions & 0 deletions
35
alert-policies/adobe-commerce-business-insights/downtime.yml
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
name: Downtime (%) | ||
|
||
description: |+ | ||
This alert is triggered if Downtime is more than 1% for 2 minutes. | ||
type: STATIC | ||
nrql: | ||
query: "SELECT percentage(count(result), where result = 'FAILED') as 'Downtime (%)' from SyntheticCheck" | ||
|
||
# Function used to aggregate the NRQL query value(s) for comparison to the terms.threshold (Default: SINGLE_VALUE) | ||
valueFunction: SINGLE_VALUE | ||
|
||
# List of Critical and Warning thresholds for the condition | ||
terms: | ||
- priority: CRITICAL | ||
# Operator used to compare against the threshold. | ||
operator: ABOVE | ||
# Value that triggers a violation | ||
threshold: 1 | ||
# Time in seconds; 120 - 3600 | ||
thresholdDuration: 120 | ||
# How many data points must be in violation for the duration | ||
thresholdOccurrences: ALL | ||
- priority: WARNING | ||
# Operator used to compare against the threshold. | ||
operator: ABOVE | ||
# Value that triggers a violation | ||
threshold: 0.5 | ||
# Time in seconds; 120 - 3600 | ||
thresholdDuration: 120 | ||
# How many data points must be in violation for the duration | ||
thresholdOccurrences: ALL | ||
|
||
# Duration after which a violation automatically closes | ||
# Time in seconds; 300 - 2592000 (Default: 86400 [1 day]) | ||
violationTimeLimitSeconds: 86400 |
35 changes: 35 additions & 0 deletions
35
alert-policies/adobe-commerce-business-insights/memoryUsage.yml
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
name: Memory Usage (%) | ||
|
||
description: |+ | ||
This alert is triggered if Memory usage exceeds 90% for 5 minutes. | ||
type: STATIC | ||
nrql: | ||
query: "SELECT latest(host.memoryUsedPercent) as 'Memory Used %' FROM Metric" | ||
|
||
# Function used to aggregate the NRQL query value(s) for comparison to the terms.threshold (Default: SINGLE_VALUE) | ||
valueFunction: SINGLE_VALUE | ||
|
||
# List of Critical and Warning thresholds for the condition | ||
terms: | ||
- priority: CRITICAL | ||
# Operator used to compare against the threshold. | ||
operator: ABOVE | ||
# Value that triggers a violation | ||
threshold: 90 | ||
# Time in seconds; 120 - 3600 | ||
thresholdDuration: 300 | ||
# How many data points must be in violation for the duration | ||
thresholdOccurrences: ALL | ||
- priority: WARNING | ||
# Operator used to compare against the threshold. | ||
operator: ABOVE | ||
# Value that triggers a violation | ||
threshold: 80 | ||
# Time in seconds; 120 - 3600 | ||
thresholdDuration: 300 | ||
# How many data points must be in violation for the duration | ||
thresholdOccurrences: ALL | ||
|
||
# Duration after which a violation automatically closes | ||
# Time in seconds; 300 - 2592000 (Default: 86400 [1 day]) | ||
violationTimeLimitSeconds: 86400 |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
# Name of the alert | ||
name: Fail Generation | ||
|
||
# Description and details | ||
description: |+ | ||
This alert is triggered when more than 10 read/write transactions fail on generation check in 5 minutes. | ||
# Type of alert | ||
type: STATIC | ||
|
||
# NRQL query | ||
nrql: | ||
query: "FROM Metric SELECT latest(aerospike_namespace_fail_generation) as 'Fail Generation'" | ||
|
||
# Function used to aggregate the NRQL query value(s) for comparison to the terms.threshold (Default: SINGLE_VALUE) | ||
valueFunction: SINGLE_VALUE | ||
|
||
# List of Critical and Warning thresholds for the condition | ||
terms: | ||
- priority: CRITICAL | ||
# Operator used to compare against the threshold. | ||
operator: ABOVE | ||
# Value that triggers a violation | ||
threshold: 10 | ||
# Time in seconds; 120 - 3600 | ||
thresholdDuration: 300 | ||
# How many data points must be in violation for the duration | ||
thresholdOccurrences: ALL | ||
- priority: WARNING | ||
# Operator used to compare against the threshold. | ||
operator: ABOVE | ||
# Value that triggers a violation | ||
threshold: 5 | ||
# Time in seconds; 120 - 3600 | ||
thresholdDuration: 300 | ||
# How many data points must be in violation for the duration | ||
thresholdOccurrences: ALL | ||
|
||
# Duration after which a violation automatically closes | ||
# Time in seconds; 300 - 2592000 (Default: 86400 [1 day]) | ||
violationTimeLimitSeconds: 86400 |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
# Name of the alert | ||
name: Heap Efficiency (%) | ||
|
||
# Description and details | ||
description: |+ | ||
This alert is triggered when heap efficiency is below 50% for 5 minutes. | ||
# Type of alert | ||
type: STATIC | ||
|
||
# NRQL query | ||
nrql: | ||
query: "FROM Metric SELECT latest(aerospike_node_stats_heap_efficiency_pct)" | ||
|
||
# Function used to aggregate the NRQL query value(s) for comparison to the terms.threshold (Default: SINGLE_VALUE) | ||
valueFunction: SINGLE_VALUE | ||
|
||
# List of Critical and Warning thresholds for the condition | ||
terms: | ||
- priority: CRITICAL | ||
# Operator used to compare against the threshold. | ||
operator: BELOW | ||
# Value that triggers a violation | ||
threshold: 50 | ||
# Time in seconds; 120 - 3600 | ||
thresholdDuration: 300 | ||
# How many data points must be in violation for the duration | ||
thresholdOccurrences: ALL | ||
- priority: WARNING | ||
# Operator used to compare against the threshold. | ||
operator: BELOW | ||
# Value that triggers a violation | ||
threshold: 60 | ||
# Time in seconds; 120 - 3600 | ||
thresholdDuration: 300 | ||
# How many data points must be in violation for the duration | ||
thresholdOccurrences: ALL | ||
|
||
# Duration after which a violation automatically closes | ||
# Time in seconds; 300 - 2592000 (Default: 86400 [1 day]) | ||
violationTimeLimitSeconds: 86400 |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
name: Partitions Unavailable | ||
|
||
description: |+ | ||
This alert is triggered when partitions are unavailable for 5 minutes. | ||
type: STATIC | ||
nrql: | ||
query: "FROM Metric SELECT latest(aerospike_namespace_unavailable_partitions) as 'Partitions Unavailable'" | ||
|
||
# Function used to aggregate the NRQL query value(s) for comparison to the terms.threshold (Default: SINGLE_VALUE) | ||
valueFunction: SINGLE_VALUE | ||
|
||
# List of Critical and Warning thresholds for the condition | ||
terms: | ||
- priority: CRITICAL | ||
# Operator used to compare against the threshold. | ||
operator: ABOVE | ||
# Value that triggers a violation | ||
threshold: 1 | ||
# Time in seconds; 120 - 360 | ||
thresholdDuration: 300 | ||
# How many data points must be in violation for the duration | ||
thresholdOccurrences: ALL | ||
- priority: WARNING | ||
# Operator used to compare against the threshold. | ||
operator: ABOVE | ||
# Value that triggers a violation | ||
threshold: 0 | ||
# Time in seconds; 120 - 3600 | ||
thresholdDuration: 300 | ||
# How many data points must be in violation for the duration | ||
thresholdOccurrences: ALL | ||
|
||
# Duration after which a violation automatically closes | ||
# Time in seconds; 300 - 2592000 (Default: 86400 [1 day]) | ||
violationTimeLimitSeconds: 86400 |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
name: Uptime | ||
|
||
description: |+ | ||
This alert is triggered when the uptime is below 100 seconds for 5 minutes. | ||
type: STATIC | ||
nrql: | ||
query: "FROM Metric SELECT latest(aerospike_node_stats_uptime) as 'Uptime'" | ||
|
||
# Function used to aggregate the NRQL query value(s) for comparison to the terms.threshold (Default: SINGLE_VALUE) | ||
valueFunction: SINGLE_VALUE | ||
|
||
# List of Critical and Warning thresholds for the condition | ||
terms: | ||
- priority: CRITICAL | ||
# Operator used to compare against the threshold. | ||
operator: BELOW | ||
# Value that triggers a violation | ||
threshold: 100 | ||
# Time in seconds; 120 - 3600 | ||
thresholdDuration: 300 | ||
# How many data points must be in violation for the duration | ||
thresholdOccurrences: ALL | ||
- priority: WARNING | ||
# Operator used to compare against the threshold. | ||
operator: BELOW | ||
# Value that triggers a violation | ||
threshold: 300 | ||
# Time in seconds; 120 - 3600 | ||
thresholdDuration: 300 | ||
# How many data points must be in violation for the duration | ||
thresholdOccurrences: ALL | ||
|
||
# Duration after which a violation automatically closes | ||
# Time in seconds; 300 - 2592000 (Default: 86400 [1 day]) | ||
violationTimeLimitSeconds: 86400 |
Oops, something went wrong.