feat: Alert implementation #198
Conversation
…ounce & alert behavior
}

// Create request
req, err := http.NewRequestWithContext(ctx, http.MethodPost, n.callbackURL, bytes.NewReader(body))
Didn't we say we'd publish the alert to an MQ instead? If we stick to a callback URL, we need some auth mechanism.
As I recall, we briefly discussed publishing alerts to an MQ, but I don't think we made a final call on that. With that said, the way the notifier is implemented, we can update the mechanism pretty easily.
I wonder if it makes sense to send the alert to the deliverymq and let the operator register their internal destinations somehow?
Let me know if you'd like us to send the alert to an MQ here. I think it's a bit trickier but totally doable too.
Anyway, I did implement the callback auth using an API key per the spec, please check it out!
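For reference, a minimal sketch of what the API-key-authenticated callback could look like on the notifier side; the field names (`callbackURL`, `apiKey`), the bearer-token header, and the `Alert` payload shape are assumptions for illustration, not necessarily what this PR ships:

```go
package alert

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"net/http"
)

// Alert is a placeholder payload; the real struct in internal/alert may differ.
type Alert struct {
	TenantID      string `json:"tenant_id"`
	DestinationID string `json:"destination_id"`
	Level         int    `json:"level"`
}

// httpNotifier is a hypothetical notifier; field names are assumptions.
type httpNotifier struct {
	client      *http.Client
	callbackURL string
	apiKey      string
}

// Notify posts the alert to the callback URL and authenticates with the API key.
func (n *httpNotifier) Notify(ctx context.Context, alert Alert) error {
	body, err := json.Marshal(alert)
	if err != nil {
		return err
	}

	// Create request
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, n.callbackURL, bytes.NewReader(body))
	if err != nil {
		return err
	}
	req.Header.Set("Content-Type", "application/json")
	// Assumed auth scheme: bearer-style API key header.
	req.Header.Set("Authorization", "Bearer "+n.apiKey)

	resp, err := n.client.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	if resp.StatusCode >= 300 {
		return fmt.Errorf("alert callback returned status %d", resp.StatusCode)
	}
	return nil
}
```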
initial comment
Partial implementation of #105; this PR only implements an `internal/alert` package, which contains the logic for the alert feature. I'll open another PR shortly where this package is used within the `deliverymq` flow, along with full e2e test cases.

Note: I just went over the spec and noticed the `ALERT_FAILURE_RATE` feature. Will get around to that next too.

Alert System Implementation
Implements a Redis-backed alert system for monitoring destination health and auto-disabling failing destinations.
Core Interfaces
- `AlertMonitor`: Main interface for handling delivery attempts
  - `HandleAttempt(ctx context.Context, attempt DeliveryAttempt) error`
- `AlertStore`: Redis persistence layer for failure counts and alert states
  - `IncrementAndGetAlertState(ctx context.Context, tenantID, destinationID string) (AlertState, error)`
  - `ResetAlertState(ctx context.Context, tenantID, destinationID string) error`
  - `UpdateLastAlert(ctx context.Context, tenantID, destinationID string, t time.Time, level int) error`
- `AlertEvaluator`: Determines when to trigger alerts based on thresholds
  - `GetAlertLevel(failures int64) int`
  - `ShouldAlert(failures int64, lastAlertTime time.Time, lastAlertLevel int) (level int, shouldAlert bool)`
- `AlertNotifier`: Sends HTTP alerts to the configured callback URL
  - `Notify(ctx context.Context, alert Alert) error`
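Put together, the interfaces above could be sketched in Go roughly as follows. The method signatures come from this description; the `DeliveryAttempt`, `AlertState`, and `Alert` struct fields, the `monitor` wiring, and the `HandleAttempt` flow are assumptions added to make the sketch self-contained (auto-disabling the destination is left out here):

```go
package alert

import (
	"context"
	"time"
)

// Assumed shapes; the real structs in internal/alert may differ.
type DeliveryAttempt struct {
	TenantID      string
	DestinationID string
	Success       bool
}

type AlertState struct {
	Failures       int64
	LastAlertTime  time.Time
	LastAlertLevel int
}

type Alert struct {
	TenantID      string
	DestinationID string
	Level         int
}

// AlertMonitor is the main entry point for handling delivery attempts.
type AlertMonitor interface {
	HandleAttempt(ctx context.Context, attempt DeliveryAttempt) error
}

// AlertStore is the Redis persistence layer for failure counts and alert states.
type AlertStore interface {
	IncrementAndGetAlertState(ctx context.Context, tenantID, destinationID string) (AlertState, error)
	ResetAlertState(ctx context.Context, tenantID, destinationID string) error
	UpdateLastAlert(ctx context.Context, tenantID, destinationID string, t time.Time, level int) error
}

// AlertEvaluator decides when to trigger alerts based on thresholds.
type AlertEvaluator interface {
	GetAlertLevel(failures int64) int
	ShouldAlert(failures int64, lastAlertTime time.Time, lastAlertLevel int) (level int, shouldAlert bool)
}

// AlertNotifier delivers alerts to the configured callback URL.
type AlertNotifier interface {
	Notify(ctx context.Context, alert Alert) error
}

// monitor is a hypothetical AlertMonitor implementation showing how the pieces
// might compose; the actual flow in the PR may differ.
type monitor struct {
	store     AlertStore
	evaluator AlertEvaluator
	notifier  AlertNotifier
}

func (m *monitor) HandleAttempt(ctx context.Context, attempt DeliveryAttempt) error {
	if attempt.Success {
		// A successful delivery clears the consecutive-failure state.
		return m.store.ResetAlertState(ctx, attempt.TenantID, attempt.DestinationID)
	}

	state, err := m.store.IncrementAndGetAlertState(ctx, attempt.TenantID, attempt.DestinationID)
	if err != nil {
		return err
	}

	level, shouldAlert := m.evaluator.ShouldAlert(state.Failures, state.LastAlertTime, state.LastAlertLevel)
	if !shouldAlert {
		return nil
	}

	if err := m.notifier.Notify(ctx, Alert{
		TenantID:      attempt.TenantID,
		DestinationID: attempt.DestinationID,
		Level:         level,
	}); err != nil {
		return err
	}
	// Record when and at what level we last alerted so the evaluator can debounce.
	return m.store.UpdateLastAlert(ctx, attempt.TenantID, attempt.DestinationID, time.Now(), level)
}
```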
Redis Schema
Key Structure:
- `alert:{tenant_id}:{destination_id}:failures`: Counter for consecutive failures
- `alert:{tenant_id}:{destination_id}:last_alert`: Hash storing:
  - `time`: Last alert timestamp
  - `level`: Last alert level (percentage)
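As a rough illustration of how an `AlertStore` might sit on top of this key structure (assuming the go-redis v9 client, the `AlertState` shape from the sketch above, and RFC 3339-encoded timestamps; none of these details are confirmed by the PR):

```go
package alert

import (
	"context"
	"fmt"
	"strconv"
	"time"

	"github.com/redis/go-redis/v9"
)

// redisAlertStore is a hypothetical AlertStore backed by the keys above.
type redisAlertStore struct {
	client *redis.Client
}

func failureKey(tenantID, destinationID string) string {
	return fmt.Sprintf("alert:%s:%s:failures", tenantID, destinationID)
}

func lastAlertKey(tenantID, destinationID string) string {
	return fmt.Sprintf("alert:%s:%s:last_alert", tenantID, destinationID)
}

// IncrementAndGetAlertState bumps the consecutive-failure counter atomically
// and reads back the last-alert hash.
func (s *redisAlertStore) IncrementAndGetAlertState(ctx context.Context, tenantID, destinationID string) (AlertState, error) {
	failures, err := s.client.Incr(ctx, failureKey(tenantID, destinationID)).Result()
	if err != nil {
		return AlertState{}, err
	}

	fields, err := s.client.HGetAll(ctx, lastAlertKey(tenantID, destinationID)).Result()
	if err != nil {
		return AlertState{}, err
	}

	state := AlertState{Failures: failures}
	if ts, ok := fields["time"]; ok {
		if t, parseErr := time.Parse(time.RFC3339, ts); parseErr == nil {
			state.LastAlertTime = t
		}
	}
	if lvl, ok := fields["level"]; ok {
		if l, convErr := strconv.Atoi(lvl); convErr == nil {
			state.LastAlertLevel = l
		}
	}
	return state, nil
}

// ResetAlertState clears both keys after a successful delivery.
func (s *redisAlertStore) ResetAlertState(ctx context.Context, tenantID, destinationID string) error {
	return s.client.Del(ctx, failureKey(tenantID, destinationID), lastAlertKey(tenantID, destinationID)).Err()
}

// UpdateLastAlert records when the last alert fired and at what level.
func (s *redisAlertStore) UpdateLastAlert(ctx context.Context, tenantID, destinationID string, t time.Time, level int) error {
	return s.client.HSet(ctx, lastAlertKey(tenantID, destinationID),
		"time", t.Format(time.RFC3339),
		"level", level,
	).Err()
}
```

In this sketch, `INCR` keeps the consecutive-failure count atomic without a read-modify-write round trip, and a plain `DEL` on success resets both the counter and the debounce state.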
Key Features