-
Notifications
You must be signed in to change notification settings - Fork 347
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Burst Detection Monoid #162
Comments
Neat paper. Is this different from fitting a piecewise-constant model to the rate at which the emails come? (I guess, to the parameter of a poisson process generating the emails?) I've had good luck with that kind of model in the past, but not sure how to implement it as a monoid... |
Actually that's an excellent point, Avi. I bet you can equate this algorithm to fitting piecewise linear with a regularization parameter (since the exponential model gives a constant rate of arrival between each transition). This probably can be done as a moniod in the same way that IO is a monad: you use the list Monoid along with two data types: BurstHistory, ArrivalTime. So you combine ArrivalTimes into a list, but when the list gets long enough, we fit a history to it and replace the fitted points with a history of rate changes. I think something like this could work, but you don't really want to keep the whole history of slope changes, instead you want to output the ones that can't change the future anymore, and only keep the last point for further fitting. Avi: have you seen (even approximate) methods for doing a streaming fit to the data to the piecewise linear fit? |
Re constant rate of arrival between each transition - that's why I said piecewise-constant, not piecewise-linear. Right? I've never seen a true streaming fit - it always involves buffering up a window, like you describe... once you have a window there are a million ways to do the fit.... this paper's cost function looks pretty plausible (in particular I like the asymmetry of it being expensive to change the rate upwards but free to change the rate downwards). Somewhat related and maybe interesting: http://www2.research.att.com/~sen/pub/nad-imc03.pdf keeps a Count-Min-Sketch for each time period, and then shows how to combine a series of historical CMS to produce a CMS of the moving average, or the Holt-Winters forecast, etc; then you can use that as the basis for alerting on the current CMS. (Actually they don't use min() as their estimator so it's not quite a CMS, they do something slightly fancier, but I think the data structure is identical). |
Here's a working link for that approach: https://www.cs.utexas.edu/~yzhang/papers/nad-imc03.pdf |
I'd like to see a monoid for detecting bursts in streams of data:
http://www.cs.cornell.edu/home/kleinber/bhs.pdf
/cc @johnynek
The text was updated successfully, but these errors were encountered: