Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: silence alarms in CODE and handle missing data correctly #404

Merged
merged 3 commits into from
Apr 9, 2021

Conversation

jacobwinch
Copy link
Contributor

@jacobwinch jacobwinch commented Apr 9, 2021

What does this change?

This PR makes two improvements to our alarms:

  1. Disables alarm actions in CODE by default. This prevents us from spamming teams with alerts about their CODE environments (in my experience these just generate unactionable noise, as CODE environments are expected to break sometimes).
  2. Handles missing data correctly for lambda-error-percentage alarms. No data in this scenario is fine; it just means that the lambda has not been invoked at all.

Does this change require changes to existing projects or CDK CLI?

No. Any user who wants to keep (or temporarily test) alarm actions in their CODE environment can still do so if they wish by setting actionsEnabledInCode to true.

How to test

I've tried upgrading https://github.com/guardian/tag-janitor/blob/main/cdk/lib/cdk-stack.ts (locally) in order to use this change and confirmed that the changes set looks sensible in AWS. I haven't tested actually deploying the change as there is no CODE environment for this stack. I'll finish the upgrade and check the deployment in PROD after releasing this change.

How can we measure success?

  • We should receive fewer unactionable alarms
  • Lambda-error-percentage alarms will be considered OK in scenarios where the lambda is not being invoked

Have we considered potential risks?

There is a small risk that disabling these alarm actions in CODE (by default) will prevent us from noticing problems with alarm configuration. I think using the ActionsEnabled property allows us to minimise this risk as much as possible.

@jacobwinch jacobwinch marked this pull request as ready for review April 9, 2021 12:20
@jacobwinch jacobwinch requested a review from a team April 9, 2021 12:20
Copy link
Member

@akash1810 akash1810 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent improvement 👍🏽

@jacobwinch jacobwinch merged commit 9040502 into main Apr 9, 2021
@jacobwinch jacobwinch deleted the jw-alarm-improvements branch April 9, 2021 12:24
@github-actions
Copy link
Contributor

github-actions bot commented Apr 9, 2021

🎉 This PR is included in version 8.0.1 🎉

The release is available on:

Your semantic-release bot 📦🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants