Merge pull request #2180 from rahul188/aws-backup
[Product Partnerships] Added Amazon Backup
zstix authored Jan 16, 2024
2 parents 91adb70 + bd4833b commit c4f5a37
Showing 9 changed files with 651 additions and 0 deletions.
34 changes: 34 additions & 0 deletions alert-policies/amazon-backup/HighBackupJobFailure.yml
@@ -0,0 +1,34 @@
name: High Backup Job Failure

description: |+
  This alert is triggered if the number of failed backup jobs exceeds 10 for 10 minutes.
type: STATIC
nrql:
  query: "SELECT sum(`aws.backup.NumberOfBackupJobsFailed`) as 'Query' FROM Metric"

# Function used to aggregate the NRQL query value(s) for comparison to the terms.threshold (Default: SINGLE_VALUE)
valueFunction: SINGLE_VALUE

# List of Critical and Warning thresholds for the condition
terms:
  - priority: CRITICAL
    # Operator used to compare against the threshold.
    operator: ABOVE
    # Value that triggers a violation
    threshold: 10
    # Time in seconds; 120 - 3600
    thresholdDuration: 600
    # How many data points must be in violation for the duration
    thresholdOccurrences: ALL

  # Adding a Warning threshold is optional
  - priority: WARNING
    operator: ABOVE
    threshold: 5
    thresholdDuration: 600
    thresholdOccurrences: ALL

# Duration after which a violation automatically closes
# Time in seconds; 300 - 2592000 (Default: 86400 [1 day])
violationTimeLimitSeconds: 86400
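
As a reading aid, here is a minimal Python sketch of how the CRITICAL and WARNING terms in this condition could be interpreted. It is not New Relic's evaluation engine. It assumes one aggregated data point per 60 seconds (the `window_s` parameter is hypothetical and not set anywhere in this file), treats `ABOVE` as strictly greater than the threshold, and treats `thresholdOccurrences: ALL` as "every data point in the `thresholdDuration` window must be in violation", following the comments in the file.

```python
from typing import Optional, Sequence


def breaches(values: Sequence[float], threshold: float,
             duration_s: int, window_s: int = 60) -> bool:
    """True if every recent point covering duration_s is above threshold.

    window_s is an assumed aggregation window, not a value from the YAML.
    """
    needed = duration_s // window_s
    if needed <= 0 or len(values) < needed:
        # Not enough data to cover the whole duration.
        return False
    return all(v > threshold for v in values[-needed:])


def evaluate(values: Sequence[float]) -> Optional[str]:
    """Apply the two terms from the file: CRITICAL > 10, WARNING > 5, both for 600 s."""
    if breaches(values, threshold=10, duration_s=600):
        return "CRITICAL"
    if breaches(values, threshold=5, duration_s=600):
        return "WARNING"
    return None


if __name__ == "__main__":
    print(evaluate([11] * 10))       # ten minutes above 10
    print(evaluate([6] * 10))        # ten minutes above 5 but not above 10
    print(evaluate([11] * 9 + [3]))  # last point drops below both thresholds
```

Under these assumptions, a single data point at or below the threshold inside the 10-minute window prevents the term from firing, which is why both terms use the same `thresholdDuration`.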
34 changes: 34 additions & 0 deletions alert-policies/amazon-backup/HighCopyJobsFailure.yml
@@ -0,0 +1,34 @@
name: High Copy Job Failure

description: |+
  This alert is triggered if the number of failed copy jobs exceeds 10 for 10 minutes.
type: STATIC
nrql:
  query: "SELECT sum(`aws.backup.NumberOfCopyJobsFailed`) as 'Query' FROM Metric"

# Function used to aggregate the NRQL query value(s) for comparison to the terms.threshold (Default: SINGLE_VALUE)
valueFunction: SINGLE_VALUE

# List of Critical and Warning thresholds for the condition
terms:
  - priority: CRITICAL
    # Operator used to compare against the threshold.
    operator: ABOVE
    # Value that triggers a violation
    threshold: 10
    # Time in seconds; 120 - 3600
    thresholdDuration: 600
    # How many data points must be in violation for the duration
    thresholdOccurrences: ALL

  # Adding a Warning threshold is optional
  - priority: WARNING
    operator: ABOVE
    threshold: 5
    thresholdDuration: 600
    thresholdOccurrences: ALL

# Duration after which a violation automatically closes
# Time in seconds; 300 - 2592000 (Default: 86400 [1 day])
violationTimeLimitSeconds: 86400
34 changes: 34 additions & 0 deletions alert-policies/amazon-backup/HighRestoreJobsFailure.yml.yml
@@ -0,0 +1,34 @@
name: High Restore Job Failure

description: |+
  This alert is triggered if the number of failed restore jobs exceeds 10 for 10 minutes.
type: STATIC
nrql:
  query: "SELECT sum(`aws.backup.NumberOfRestoreJobsFailed`) as 'Query' FROM Metric"

# Function used to aggregate the NRQL query value(s) for comparison to the terms.threshold (Default: SINGLE_VALUE)
valueFunction: SINGLE_VALUE

# List of Critical and Warning thresholds for the condition
terms:
  - priority: CRITICAL
    # Operator used to compare against the threshold.
    operator: ABOVE
    # Value that triggers a violation
    threshold: 10
    # Time in seconds; 120 - 3600
    thresholdDuration: 600
    # How many data points must be in violation for the duration
    thresholdOccurrences: ALL

  # Adding a Warning threshold is optional
  - priority: WARNING
    operator: ABOVE
    threshold: 5
    thresholdDuration: 600
    thresholdOccurrences: ALL

# Duration after which a violation automatically closes
# Time in seconds; 300 - 2592000 (Default: 86400 [1 day])
violationTimeLimitSeconds: 86400
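
The inline comments in these files state allowed ranges for `thresholdDuration` (120 to 3600 seconds) and `violationTimeLimitSeconds` (300 to 2592000 seconds, default 86400). The sketch below checks a condition against those stated ranges. It is illustrative only: `check_condition` and the `CONDITION` dictionary are hypothetical names, and the dictionary is a hand-written mirror of the YAML above rather than the output of a real parser.

```python
from typing import Any, Dict, List

# Hand-written mirror of the condition defined in the YAML files (assumption:
# a YAML loader would yield an equivalent dictionary).
CONDITION: Dict[str, Any] = {
    "name": "High Restore Job Failure",
    "terms": [
        {"priority": "CRITICAL", "operator": "ABOVE", "threshold": 10,
         "thresholdDuration": 600, "thresholdOccurrences": "ALL"},
        {"priority": "WARNING", "operator": "ABOVE", "threshold": 5,
         "thresholdDuration": 600, "thresholdOccurrences": "ALL"},
    ],
    "violationTimeLimitSeconds": 86400,
}


def check_condition(cond: Dict[str, Any]) -> List[str]:
    """Return a list of range problems, using the limits quoted in the file comments."""
    problems: List[str] = []
    for term in cond.get("terms", []):
        duration = term.get("thresholdDuration")
        if not isinstance(duration, int) or not 120 <= duration <= 3600:
            problems.append(
                f"{term.get('priority')}: thresholdDuration {duration!r} "
                "is outside 120 - 3600"
            )
    # 86400 is the default stated in the file comment.
    limit = cond.get("violationTimeLimitSeconds", 86400)
    if not isinstance(limit, int) or not 300 <= limit <= 2592000:
        problems.append(
            f"violationTimeLimitSeconds {limit!r} is outside 300 - 2592000"
        )
    return problems


if __name__ == "__main__":
    print(check_condition(CONDITION))
```

All three conditions in this commit share the same terms and time limit, so the same check applies to each file.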