fix: silence alarms in CODE and handle missing data correctly #404

jacobwinch · 2021-04-09T11:01:18Z

What does this change?

This PR makes two improvements to our alarms:

Disables alarm actions in CODE by default. This prevents us from spamming teams with alerts about their CODE environments (in my experience these just generate unactionable noise, as CODE environments are expected to break sometimes).
Handles missing data correctly for lambda-error-percentage alarms. No data in this scenario is fine; it just means that the lambda has not been invoked at all.
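
As a rough illustration of the two behaviours above, here is a minimal sketch. It is not the @guardian/cdk implementation: `resolveAlarmSettings`, `AlarmOptions` and `AlarmSettings` are hypothetical names, and the choice of `notBreaching` is an assumption inferred from the description. `ActionsEnabled` and `TreatMissingData` are the CloudFormation property names on `AWS::CloudWatch::Alarm`, and `actionsEnabledInCode` is the opt-in prop mentioned in this PR.

```typescript
// Hypothetical sketch: names and structure are illustrative, not the library source.
type Stage = "CODE" | "PROD";

interface AlarmOptions {
  stage: Stage;
  // Opt-in prop named in this PR; assumed here to default to false.
  actionsEnabledInCode?: boolean;
}

// A subset of the AWS::CloudWatch::Alarm CloudFormation properties.
interface AlarmSettings {
  ActionsEnabled: boolean;
  TreatMissingData: "breaching" | "notBreaching" | "ignore" | "missing";
}

function resolveAlarmSettings({
  stage,
  actionsEnabledInCode = false,
}: AlarmOptions): AlarmSettings {
  return {
    // Actions always fire in PROD; in CODE they fire only when opted in.
    ActionsEnabled: stage === "PROD" || actionsEnabledInCode,
    // Assumption: missing data points count as healthy, because no data
    // means the lambda was not invoked at all.
    TreatMissingData: "notBreaching",
  };
}

console.log(resolveAlarmSettings({ stage: "CODE" }));
console.log(resolveAlarmSettings({ stage: "CODE", actionsEnabledInCode: true }));
console.log(resolveAlarmSettings({ stage: "PROD" }));
```

The alarm itself is still created in every stage; only whether its actions run differs.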

Does this change require changes to existing projects or CDK CLI?

No. Anyone who wants to keep (or temporarily test) alarm actions in their CODE environment can still do so by setting actionsEnabledInCode to true.

How to test

I've upgraded https://github.com/guardian/tag-janitor/blob/main/cdk/lib/cdk-stack.ts locally to use this change and confirmed that the change set looks sensible in AWS. I haven't tested actually deploying the change, as there is no CODE environment for this stack. I'll finish the upgrade and check the deployment in PROD after releasing this change.

How can we measure success?

We should receive fewer unactionable alarms.
Lambda-error-percentage alarms will be considered OK in scenarios where the lambda is not being invoked.
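
To make the second point concrete, here is a simplified, single-data-point sketch of why an uninvoked lambda produces no data and how the missing-data setting changes the resulting state. `errorPercentage` and `evaluate` are hypothetical helpers, the greater-than comparison is an assumption, and real CloudWatch evaluation works over multiple periods, so treat this as an approximation only.

```typescript
// Hypothetical, simplified model of a single alarm evaluation.
type AlarmState = "OK" | "ALARM" | "INSUFFICIENT_DATA";
type MissingDataMode = "missing" | "notBreaching";

// With zero invocations there is nothing to divide by, so no data point exists.
function errorPercentage(errors: number, invocations: number): number | undefined {
  if (invocations === 0) return undefined;
  return (errors / invocations) * 100;
}

function evaluate(
  value: number | undefined,
  thresholdPercent: number,
  mode: MissingDataMode
): AlarmState {
  if (value === undefined) {
    // "notBreaching" treats the gap as healthy; "missing" leaves the state undetermined.
    return mode === "notBreaching" ? "OK" : "INSUFFICIENT_DATA";
  }
  // Assumption: the alarm fires when the value is strictly above the threshold.
  return value > thresholdPercent ? "ALARM" : "OK";
}

// Lambda never invoked: no data point at all.
console.log(evaluate(errorPercentage(0, 0), 5, "missing"));
console.log(evaluate(errorPercentage(0, 0), 5, "notBreaching"));
// Lambda invoked 10 times with 2 errors: 20% is above a 5% threshold.
console.log(evaluate(errorPercentage(2, 10), 5, "notBreaching"));
```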

Have we considered potential risks?

There is a small risk that disabling these alarm actions in CODE (by default) will prevent us from noticing problems with alarm configuration. I think using the ActionsEnabled property minimises this risk as much as possible: the alarms are still created and still change state in CODE, and only their actions are suppressed.

akash1810

Excellent improvement 👍🏽

github-actions · 2021-04-09T12:27:10Z

🎉 This PR is included in version 8.0.1 🎉

The release is available on:

Your semantic-release bot 📦🚀

jacobwinch added 3 commits April 9, 2021 11:39

fix: disable alarm actions in CODE by default

a2ef8e6

fix: handle missing data correctly for Lambda error % alarms

9138fa1

chore: appease the linter

5df16ac

jacobwinch marked this pull request as ready for review April 9, 2021 12:20

jacobwinch requested a review from a team April 9, 2021 12:20

akash1810 approved these changes Apr 9, 2021


jacobwinch merged commit 9040502 into main Apr 9, 2021

jacobwinch deleted the jw-alarm-improvements branch April 9, 2021 12:24

github-actions bot added the released label Apr 9, 2021

This was referenced Apr 9, 2021

Upgrade to @guardian/cdk 8.0.1 guardian/tag-janitor#92

Merged

feat: Add optional alarm configuration to EC2 App pattern #462

Merged
