I was running Cortex and hit an issue that caused my Alertmanager pods to crash unexpectedly. I was using the following Alertmanager configuration:
global:
  resolve_timeout: 5s
receivers:
  - name: example-email
    email_configs:
      - to: '[email protected]'
inhibit_rules:
  -
I forgot to fill in my inhibition rules, but I didn't realize that the empty list entry would cause a nil pointer dereference. The panic from the crash was:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x10a83b1]
goroutine 261 [running]:
golang.a2z.com/AWSPrometheusCortex/vendor/github.com/prometheus/alertmanager/inhibit.NewInhibitRule(0x0)
The error is coming from https://github.com/prometheus/alertmanager/blob/main/inhibit/inhibit.go#L53-L56:
for _, cr := range rs {
    r := NewInhibitRule(cr)
    ih.rules = append(ih.rules, r)
}
Is it intentional that nil list elements, in this case inhibition rules, are processed even though they are nil? If I add a check such as if cr == nil { continue } at the top of the loop, the Alertmanager pods no longer crash, because the nil inhibition rule is skipped.
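
For illustration, here is a minimal, self-contained sketch of that nil check. The types InhibitRuleConfig and InhibitRule and the helper buildRules are hypothetical stand-ins, not Alertmanager's actual definitions; the sketch only assumes that the constructor dereferences its argument, as the stack trace suggests.

```go
package main

import "fmt"

// InhibitRuleConfig is a hypothetical stand-in for the parsed config entry.
// An empty YAML list item ("-") is assumed to arrive here as a nil pointer.
type InhibitRuleConfig struct {
	Name string
}

// InhibitRule is a hypothetical stand-in for the runtime rule.
type InhibitRule struct {
	Name string
}

// NewInhibitRule dereferences cr, so calling it with nil panics,
// which mirrors the NewInhibitRule(0x0) frame in the stack trace.
func NewInhibitRule(cr *InhibitRuleConfig) *InhibitRule {
	return &InhibitRule{Name: cr.Name}
}

// buildRules (hypothetical helper) skips nil entries before constructing rules.
func buildRules(rs []*InhibitRuleConfig) []*InhibitRule {
	var rules []*InhibitRule
	for _, cr := range rs {
		if cr == nil {
			continue // skip the empty list entry instead of dereferencing it
		}
		rules = append(rules, NewInhibitRule(cr))
	}
	return rules
}

func main() {
	rs := []*InhibitRuleConfig{{Name: "a"}, nil, {Name: "b"}}
	fmt.Println(len(buildRules(rs)))
}
```

Skipping silently is only one option; rejecting the configuration with a validation error when a nil entry is found would surface the mistake earlier. The sketch shows only the skip.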