feat(alerts): Add ratelimiting for metric alerts #58032
Codecov Report
@@ Coverage Diff @@
## master #58032 +/- ##
==========================================
+ Coverage 79.03% 79.06% +0.02%
==========================================
Files 5131 5138 +7
Lines 223537 223607 +70
Branches 37640 37650 +10
==========================================
+ Hits 176678 176784 +106
+ Misses 41188 41158 -30
+ Partials 5671 5665 -6
This feels like something that will need to be in the docs, since before this change an alert could fire every minute.
Given that this is a very sensitive change, could we consider putting it behind an options-backed feature flag? That way, if something goes wrong, we can simply turn it off. We could also test it internally first.
    [project.id for project in last_incident.projects.all()] if last_incident else []
)
minutes_since_last_incident = (
    (timezone.now() - last_incident.date_added).seconds / 60 if last_incident else None
nit: I slightly prefer using `timedelta` to raw second math
Could you elaborate on what you mean by that? I believe `.seconds` is a `timedelta` property.
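To illustrate the distinction in this thread: `.seconds` is indeed a `timedelta` attribute, but it holds only the seconds component of the duration, with whole days stored separately in `.days`. The sketch below uses a made-up gap of one day and five minutes (not a value from the PR) to show how that differs from `.total_seconds()`, and how comparing two `timedelta` objects directly avoids the conversion. It is an illustration of the standard library behavior, not the code that was merged.

```python
from datetime import timedelta

# Hypothetical gap between "now" and the last incident: 1 day and 5 minutes.
gap = timedelta(days=1, minutes=5)

# .seconds holds only the seconds component (0..86399); whole days live in .days.
minutes_from_seconds = gap.seconds / 60

# .total_seconds() covers the entire duration, days included.
minutes_from_total = gap.total_seconds() / 60

# Comparing timedeltas directly avoids the unit conversion altogether.
within_window = gap < timedelta(minutes=10)

print(minutes_from_seconds)  # 5.0
print(minutes_from_total)    # 1445.0
print(within_window)         # False
```

With `.seconds`, an incident created a little over a day ago would look like it was created five minutes ago.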
This is looking generally good to me. Since we need indexes here, I'd recommend making a separate PR for them and merging that PR first.
# If an incident was created for this rule, trigger type, and subscription
# within the last 10 minutes, don't make another one
last_incident: Incident | None = trigger.triggered_incidents.order_by("-date_added").first()
We might need to add an index on `IncidentTrigger` on `alert_rule_trigger, date_added` as well
Not sure I understand the use of an additional index here. I know it would've been helpful had we kept the original approach with `Incident.objects.filter(...)`, but since we use `trigger.triggered_incidents.order_by("-date_added").first()` to get `last_incident`, how does an additional index help us make this more efficient?
Ahh sorry, I misread the code and thought it was sorting on `date_added` in `IncidentTrigger`, but it's in `Incident`.
We might need an index on the `Incident` table though... `(id, date_added)`
Here's what this query will look like:
SELECT *
FROM "sentry_incident"
INNER JOIN "sentry_incidenttrigger" ON ("sentry_incident"."id" = "sentry_incidenttrigger"."incident_id")
WHERE "sentry_incidenttrigger"."alert_rule_trigger_id" = 1
ORDER BY "sentry_incident"."date_added" DESC
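As a rough way to experiment with this query shape, here is a sketch that runs it against an in-memory SQLite database. Several things are assumptions for illustration only: SQLite stands in for the production database, the tables are cut down to the columns the query touches, the index name `idx_incidenttrigger_trigger` is made up, and the sample rows are invented. A `LIMIT 1` is included because Django's `.first()` fetches only a single row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Simplified stand-ins for the real tables (only the columns the query uses).
    CREATE TABLE sentry_incident (
        id INTEGER PRIMARY KEY,
        date_added TEXT NOT NULL
    );
    CREATE TABLE sentry_incidenttrigger (
        id INTEGER PRIMARY KEY,
        incident_id INTEGER NOT NULL REFERENCES sentry_incident (id),
        alert_rule_trigger_id INTEGER NOT NULL
    );
    -- Hypothetical index so the join can filter by trigger id.
    CREATE INDEX idx_incidenttrigger_trigger
        ON sentry_incidenttrigger (alert_rule_trigger_id, incident_id);
    """
)

# Invented sample data: two incidents for trigger 1, one for trigger 2.
conn.executemany(
    "INSERT INTO sentry_incident (id, date_added) VALUES (?, ?)",
    [(1, "2023-10-01T00:00:00"), (2, "2023-10-01T00:05:00"), (3, "2023-10-01T00:20:00")],
)
conn.executemany(
    "INSERT INTO sentry_incidenttrigger (incident_id, alert_rule_trigger_id) VALUES (?, ?)",
    [(1, 1), (2, 1), (3, 2)],
)

LATEST_INCIDENT_SQL = """
SELECT sentry_incident.id, sentry_incident.date_added
FROM sentry_incident
INNER JOIN sentry_incidenttrigger
    ON sentry_incident.id = sentry_incidenttrigger.incident_id
WHERE sentry_incidenttrigger.alert_rule_trigger_id = ?
ORDER BY sentry_incident.date_added DESC
LIMIT 1
"""

# Most recent incident for trigger 1.
latest = conn.execute(LATEST_INCIDENT_SQL, (1,)).fetchone()

# The query plan shows which indexes SQLite chooses; output varies by version.
plan = conn.execute("EXPLAIN QUERY PLAN " + LATEST_INCIDENT_SQL, (1,)).fetchall()
```

The plan printed by SQLite will not match what the production database does, so this is only useful for reasoning about the query shape, not for deciding which index to ship.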
This PR adds a new index to `Incident` to make querying for the metric alert rate limiting feature more efficient (#58032).
Just a note to not merge this until the index is created.
This PR adds a new index to `IncidentTrigger` to make querying for the metric alert rate limiting feature more efficient #58032. [Query in question](https://github.com/getsentry/sentry/pull/58032/files#diff-d262481ba0aaae3473d14f62d3cbb554e099bda1093454b7935890139f3fe5a8R558-R563)
Rate limiting is added (under an options-backed feature flag) to ensure metric alerts don't fire more than once every 10 minutes. Action item from INC-454.
Closes #54730
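
The rate-limiting rule described above can be sketched as a small standalone function. This is a simplified illustration under stated assumptions, not the code in the PR: the function name `should_create_incident`, the `ratelimit_enabled` parameter (standing in for the options-backed feature flag), and the constant `RATE_LIMIT_WINDOW` are all hypothetical, and the treatment of the exact 10-minute boundary is a guess.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone

# Assumed window, taken from the "10 minutes" in the description above.
RATE_LIMIT_WINDOW = timedelta(minutes=10)


def should_create_incident(
    last_incident_added: datetime | None,
    now: datetime,
    ratelimit_enabled: bool = True,
) -> bool:
    """Return True if a new incident may be created for this trigger."""
    # With the (hypothetical) flag off, or no previous incident, never block.
    if not ratelimit_enabled or last_incident_added is None:
        return True
    # Compare timedeltas directly instead of converting to minutes.
    return now - last_incident_added >= RATE_LIMIT_WINDOW


now = datetime(2023, 10, 1, 12, 0, tzinfo=timezone.utc)
print(should_create_incident(None, now))                        # True
print(should_create_incident(now - timedelta(minutes=3), now))  # False
print(should_create_incident(now - timedelta(minutes=10), now)) # True
```

Passing `now` in as an argument, rather than calling the clock inside the function, keeps the check easy to test.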