Rule patterns work like traditional wildcards. Like metric names, they are
delimited by dots, and only * is supported. A single * must match exactly one
word.
a.b.c.d.e matches a.*.c.*.e
a.b.c.d.e does not match a.*
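The matching rule above can be sketched in a few lines. This is only an illustration of the described behavior, not Banshee's own implementation; the function name pattern_matches is hypothetical, and the sketch assumes that * always stands for a whole word rather than part of one.

```python
def pattern_matches(pattern: str, name: str) -> bool:
    """Return True if a dot-delimited metric name matches a wildcard pattern."""
    pattern_words = pattern.split(".")
    name_words = name.split(".")
    # Each * stands for exactly one word, so the word counts must agree.
    if len(pattern_words) != len(name_words):
        return False
    # Every pattern word is either a wildcard or an exact literal match.
    return all(p == "*" or p == n for p, n in zip(pattern_words, name_words))


print(pattern_matches("a.*.c.*.e", "a.b.c.d.e"))  # True
print(pattern_matches("a.*", "a.b.c.d.e"))        # False
```

Comparing word counts first is what makes a.* reject a.b.c.d.e: one * cannot absorb several words.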
[ ] On trend up AND Value >= ___
OR
[ ] On trend down AND Value <= ___
on trend up => Alert when the trend rises abnormally
on trend down => Alert when the trend drops abnormally
value >= X => Alert when the metric value >= X
value <= X => Alert when the metric value <= X
on trend up && value >= X => Alert when the trend rises abnormally to at least X
on trend down && value <= X => Alert when the trend drops abnormally to at most X
on trend up || on trend down => Alert when the trend rises or drops abnormally
(on trend up && value >= X) || on trend down => Alert when the trend rises to at least X or drops abnormally
And more.
Setting a fixed threshold to 0 means the rule ignores that threshold.
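The way the two checkboxes and two thresholds combine can be sketched as follows. This is a minimal sketch under stated assumptions, not Banshee's actual code: the Rule fields and the should_alert function are hypothetical names, and the rising and dropping flags stand in for whatever the trend analysis reports.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    # Hypothetical field names, chosen for this illustration only.
    trend_up: bool = False      # "On trend up" checkbox
    trend_down: bool = False    # "On trend down" checkbox
    threshold_max: float = 0.0  # X in "value >= X"; 0 means unused
    threshold_min: float = 0.0  # X in "value <= X"; 0 means unused


def _side(trend_enabled, trend_abnormal, threshold, threshold_hit):
    """Evaluate one side (up or down) of the rule."""
    has_threshold = threshold != 0
    if trend_enabled and has_threshold:
        return trend_abnormal and threshold_hit
    if trend_enabled:
        return trend_abnormal
    if has_threshold:
        return threshold_hit
    return False


def should_alert(rule, value, rising, dropping):
    """Combine the up side and the down side with OR, as in the form."""
    up = _side(rule.trend_up, rising, rule.threshold_max,
               value >= rule.threshold_max)
    down = _side(rule.trend_down, dropping, rule.threshold_min,
                 value <= rule.threshold_min)
    return up or down


rule = Rule(trend_up=True, threshold_max=10)
print(should_alert(rule, value=12, rising=True, dropping=False))  # True
print(should_alert(rule, value=5, rising=True, dropping=False))   # False
```

Treating 0 as "unused" mirrors the note above, with the side effect that a real threshold of exactly 0 cannot be expressed in this sketch.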
For a timer, we only care about its upward trend. The image below means:
alert when the time cost of note.add rises abnormally.
The words "count ps" mean "count per second".
For a count_ps, we care about both its upward and downward trends. The image
below means: alert when the number of calls of note.add rises or drops
abnormally.
For an errors counter, we care about its upward trend. The image below means: alert when the number of errors rises to at least 10.
We may want to use simple thresholds instead of dynamic trend analysis. The image below means: alert when the number of hard errors is greater than 10.
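A comment is required to create or edit a rule. Banshee supports dynamic variable matching in comments: as the example shows, $1 is replaced by the word matched by the first * in the pattern, $2 by the second, and so on: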
Rule Pattern: counter.*.*.error
Rule Comment: API $2 of Service $1 errors
Metric Name: counter.note.add.error
Alert Message: API add of Service note errors
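The substitution in the example above can be reproduced with a short sketch. It is an illustration of the behavior shown, not Banshee's implementation; render_comment is a hypothetical name, and leaving out-of-range variables such as $9 untouched is an assumption.

```python
import re


def render_comment(pattern: str, comment: str, name: str) -> str:
    """Fill $1, $2, ... in a rule comment with the words matched by each *."""
    pattern_words = pattern.split(".")
    name_words = name.split(".")
    if len(pattern_words) != len(name_words):
        raise ValueError("metric name does not match the rule pattern")
    # Collect the name words sitting at wildcard positions, in order.
    captured = [n for p, n in zip(pattern_words, name_words) if p == "*"]

    def substitute(match):
        index = int(match.group(1)) - 1
        if 0 <= index < len(captured):
            return captured[index]
        # Assumption: unknown variables are left as written.
        return match.group(0)

    return re.sub(r"\$(\d+)", substitute, comment)


print(render_comment("counter.*.*.error",
                     "API $2 of Service $1 errors",
                     "counter.note.add.error"))
# API add of Service note errors
```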
Metrics matching disabled rules are still accepted and analyzed by banshee, but they won't trigger alerts.
To disable a rule forever:
To disable a rule for a while:
Banshee won't trigger a detection for an idle metric, because only incoming datapoints drive the detection.
statsd won't forward datapoints for idle metrics to its backends if its deleteIdleStats option is set to true. But for some timer and count_ps metrics, the "null" values indicate accidents.
We can force banshee to track idle metrics by setting the rule option below. Banshee will then alert if any matched metric doesn't arrive for a certain time.
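The idea behind idle tracking can be sketched as a last-seen table that is checked against a timeout. This is a conceptual sketch only; the IdleTracker class, its method names, and the 300-second timeout are hypothetical and do not describe how Banshee stores or schedules these checks.

```python
class IdleTracker:
    """Remember when each metric was last seen and report the silent ones."""

    def __init__(self, timeout_seconds: int):
        self.timeout = timeout_seconds
        self.last_seen = {}  # metric name -> timestamp of latest datapoint

    def record(self, name: str, timestamp: int) -> None:
        """Call this for every incoming datapoint."""
        self.last_seen[name] = timestamp

    def idle_metrics(self, now: int) -> list:
        """Return metrics with no datapoint for longer than the timeout."""
        return sorted(name for name, seen in self.last_seen.items()
                      if now - seen > self.timeout)


tracker = IdleTracker(timeout_seconds=300)
tracker.record("timer.note.add", 1000)
tracker.record("count_ps.note.add", 1250)
print(tracker.idle_metrics(now=1400))  # ['timer.note.add']
```

Because incoming datapoints cannot drive this check, it has to run on a clock of its own, for example from a periodic job.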
Universal receivers receive alerts from all projects.
By default, the configured silent time range is [0,6], which means no project
alerts between 00:00 and 06:00.
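The default range can be read as an hour-of-day check, sketched below. The function name is_silent is hypothetical, and two points are assumptions rather than documented behavior: that the end hour is exclusive, and that a range whose start is later than its end wraps past midnight.

```python
def is_silent(hour: int, start: int = 0, end: int = 6) -> bool:
    """Return True if alerts are suppressed at the given hour (0-23)."""
    if start <= end:
        # Plain range within one day, e.g. [0,6] covers 00:00 to 06:00.
        return start <= hour < end
    # Assumption: a range such as [22,6] wraps around midnight.
    return hour >= start or hour < end


print(is_silent(3))   # True
print(is_silent(12))  # False
```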
Here is an example to customize this behavior: