-
-
Notifications
You must be signed in to change notification settings - Fork 37.8k
Added Time based SMA to Filter Sensor #13104
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 2 commits
da50319
90e1dcb
790d67a
212e341
3467fdd
2f767f9
8480db5
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -8,6 +8,7 @@ | |
| import statistics | ||
| from collections import deque, Counter | ||
| from numbers import Number | ||
| from datetime import datetime | ||
|
|
||
| import voluptuous as vol | ||
|
|
||
|
|
@@ -26,6 +27,7 @@ | |
| FILTER_NAME_LOWPASS = 'lowpass' | ||
| FILTER_NAME_OUTLIER = 'outlier' | ||
| FILTER_NAME_THROTTLE = 'throttle' | ||
| FILTER_NAME_TIME_SMA = 'time_sma' | ||
| FILTERS = Registry() | ||
|
|
||
| CONF_FILTERS = 'filters' | ||
|
|
@@ -44,24 +46,32 @@ | |
| ICON = 'mdi:chart-line-variant' | ||
|
|
||
| FILTER_SCHEMA = vol.Schema({ | ||
| vol.Optional(CONF_FILTER_WINDOW_SIZE, | ||
| default=DEFAULT_WINDOW_SIZE): vol.Coerce(int), | ||
| vol.Optional(CONF_FILTER_PRECISION, | ||
| default=DEFAULT_PRECISION): vol.Coerce(int), | ||
| }) | ||
|
|
||
| FILTER_OUTLIER_SCHEMA = FILTER_SCHEMA.extend({ | ||
| vol.Required(CONF_FILTER_NAME): FILTER_NAME_OUTLIER, | ||
| vol.Optional(CONF_FILTER_WINDOW_SIZE, | ||
| default=DEFAULT_WINDOW_SIZE): vol.Coerce(int), | ||
| vol.Optional(CONF_FILTER_RADIUS, | ||
| default=DEFAULT_FILTER_RADIUS): vol.Coerce(float), | ||
| }) | ||
|
|
||
| FILTER_LOWPASS_SCHEMA = FILTER_SCHEMA.extend({ | ||
| vol.Required(CONF_FILTER_NAME): FILTER_NAME_LOWPASS, | ||
| vol.Optional(CONF_FILTER_WINDOW_SIZE, | ||
| default=DEFAULT_WINDOW_SIZE): vol.Coerce(int), | ||
| vol.Optional(CONF_FILTER_TIME_CONSTANT, | ||
| default=DEFAULT_FILTER_TIME_CONSTANT): vol.Coerce(int), | ||
| }) | ||
|
|
||
| FILTER_TIME_SMA_SCHEMA = FILTER_SCHEMA.extend({ | ||
| vol.Required(CONF_FILTER_NAME): FILTER_NAME_TIME_SMA, | ||
| vol.Required(CONF_FILTER_WINDOW_SIZE): vol.All(cv.time_period, | ||
| cv.positive_timedelta) | ||
| }) | ||
|
|
||
| FILTER_THROTTLE_SCHEMA = FILTER_SCHEMA.extend({ | ||
| vol.Required(CONF_FILTER_NAME): FILTER_NAME_THROTTLE, | ||
| }) | ||
|
|
@@ -72,6 +82,7 @@ | |
| vol.Required(CONF_FILTERS): vol.All(cv.ensure_list, | ||
| [vol.Any(FILTER_OUTLIER_SCHEMA, | ||
| FILTER_LOWPASS_SCHEMA, | ||
| FILTER_TIME_SMA_SCHEMA, | ||
| FILTER_THROTTLE_SCHEMA)]) | ||
| }) | ||
|
|
||
|
|
@@ -277,6 +288,46 @@ def _filter_state(self, new_state): | |
| return filtered | ||
|
|
||
|
|
||
| @FILTERS.register(FILTER_NAME_TIME_SMA) | ||
| class TimeSMAFilter(Filter): | ||
| """Simple Moving Average (SMA) Filter. | ||
|
|
||
| The window_size is determined by time, and SMA is time weighted. | ||
| """ | ||
|
|
||
| def __init__(self, window_size, precision, entity): | ||
| """Initialize Filter.""" | ||
| super().__init__(FILTER_NAME_TIME_SMA, 0, precision, entity) | ||
| self._time_window = int(window_size.total_seconds()) | ||
| self.last_leak = None | ||
| self.queue = deque() | ||
|
|
||
| def _leak(self, now): | ||
| """Remove timeouted elements.""" | ||
| while self.queue: | ||
| timestamp, _ = self.queue[0] | ||
| if timestamp + self._time_window <= now: | ||
| self.last_leak = self.queue.popleft() | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is there any reason this is assigned to an instance variable instead of a local variable and returned at the end of the while loop ?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Please look at the previous comment #13104 (comment) Keeping this value assures me I always have some value to work with in the _filter_state() The filter always lags 1 state behind, so using the new state in these cases (when there are no previous states in the windows) would make the behaviour erratic. |
||
| else: | ||
| return | ||
|
|
||
| def _filter_state(self, new_state): | ||
| now = int(datetime.now().timestamp()) | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Use our date util package and UTC timestamps: from homeassistant.util import date as dt_util
dt_util.utcnow()That way you won't get hit by daylight savings etc. Any reason you're using timestamps instead of datetime objects? window size is already given to you in a timedelta. So you can do things like
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Time needs to be multiplied with an unknown type value, so I preferred to keep safe and make sure time is an int. |
||
|
|
||
| self._leak(now) | ||
| self.queue.append((now, float(new_state))) | ||
| moving_sum = 0 | ||
| start = now - self._time_window | ||
| _, prev_val = self.last_leak or (0, float(new_state)) | ||
|
|
||
| for timestamp, val in self.queue: | ||
| moving_sum += (timestamp-start)*prev_val | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I based the implementation in this paper: http://www.eckner.com/papers/Algorithms%20for%20Unevenly%20Spaced%20Time%20Series.pdf My implementation is that of SMA_last, what you propose is SMA_linear. Both are acceptable filters, yet different ones.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ok that is true; I don't see much value in this SMA_last method over the trapezoids though. The latter just seems quite a bit more accurate in most cases.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I will probably PR a second filter based on SMA_linear, but the use cases are different: SMA_last: "discrete observation values"
SMA_linear: "discretely-observed continuous-time stochastic processes"
|
||
| start, prev_val = timestamp, val | ||
| moving_sum += (now-start)*prev_val | ||
|
|
||
| return moving_sum/self._time_window | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What if we don't receive any value within I know this might be a bit of work to do but it's what I'd expect as a user. See usages of
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I disagree, If I don't get any values it just means that it hasn't changed.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The problem here is that the time has changed and time is also an input into this filter. Imagine that our current value is based off 2 states that happened during the window. If we wait long enough that 1 state drops out of our window, wouldn't you expect to base the value off just the other state?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @balloob that is exactly what it will happen. Even if no points are left in the time window, the filter will report based on the last value that exited the windows (_, prev_val = self.last_leak )
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @dgomes I think you're misunderstanding Paulus' comment (or I'm not understanding it...) If there's only one point within the time window, as a user I wouldn't expect it to use the last value outside the time window for any computation. It should just discard anything that's outside the time window. In other words: only every consider the data points that are inside the time frame for computation. See this gif for an example (the blue points are the data points, the blue area is our time window and red line is the output): Couple of things going on here:
(The computation of the values while both data points are inside the time window is probably not right in my gif, but that's not the focus of this comment.)
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @OttoWinter that is not the filter I'm implementing here. Sure, that filter can be implemented, but it's not the one in this PR. Please share how you are creating those graphs 😄
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @dgomes I see that you report only on the last value but you only update the filter value when a new state comes in right?
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @dgomes Ok, sorry I will shut up about other filters 😅 Yeah those graphics were done with (ab)using GeoGebra, but I would describe it as the single most user-unfriendly maths tools ever written IMHO (but it's sadly what's taught here in Austria 🇦🇹). Seriously, it's incredibly hard to use if you're starting out. (incidentally, because of its poor usability the rest of my class will probably also have to re-take a recent maths exam 😥) btw, if for some reason you're looking for a nice tool (in python, yay) for creating nice presentations, have a look at manim. For example, it has allowed me to create these fancy visualizations about graph theory: https://youtu.be/t2Gb1YA-HcU 😺
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @balloob right, this filter always lags 1 state behind. The concept is that the raw sensor has maintained the previous status until the new state has arrived. E.g.: Think about a water pump that publishes its rate (1 to 5). If the pump is at speed 5 and comes down to speed 4 (new state) the SMA is calculated with the period of time in which the pump was at 5 (we weight in the time the pump was at 5), 4 is of no use in the SMA since the period of time is 0 (it just got to 4). @OttoWinter please contribute more filters :) It's just that I'm prioritising filters I can use in my setup. There are plenty of important filters missing which I hope some else can contribute 😄
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Okay I'm going to shut up talking about filters as I have clearly not a clue what I am talking about 👍 |
||
|
|
||
|
|
||
| @FILTERS.register(FILTER_NAME_THROTTLE) | ||
| class ThrottleFilter(Filter): | ||
| """Throttle Filter. | ||
|
|
||



There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about this name, it's not very descriptive. Maybe
time_weighted_averageinstead?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That name is too long :(
The actual name would be SMA_last as per the paper the filter is based upon:
http://www.eckner.com/papers/Algorithms%20for%20Unevenly%20Spaced%20Time%20Series.pdf
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really think it's too long. I mean yes it's longer than most other filter names but
time_smais just incredibly hard to understand and the more descriptive name outweighs how long it is for me. And we shouldn't force our users to read scientific papers, ideally the filter would just work.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Furthermore it's not a weighted average. Just refer to wikipedia for the differences:
https://en.wikipedia.org/wiki/Moving_average
I agree users should not need to read scientific papers to profit from filters, but that was another PR ;)
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From another comment on this PR I understand that the "last" part of the official name is actually an important implementation detail of this algorithm. I would expect that to be part of the name unless we plan on making it first/last etc configurable for this filter later on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think putting the last/first/average part into a
type:option very clean. If we add separate filters for each filter type we would eventually just pollute the filter list.And since these filters are so similar it would also make sense semantically. Can of course also go into a follow-up PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As described in the other comment, a second PR with the "linear" implementation is possible, although I have other filters in the priority list first. Maybe someone else can have a go 😄
I propose to link the paper in the documentation and code "type: last" as the default for the time being.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like adding a type: last to the options for this filter 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
type is reserved word, can I use variant :) ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean with reserved? We're talking about the config format right? I think that this makes sense:
I am a bit confused why a name can be too long? Assume your users, like me, have no clue about filters. I would prefer less abbreviations or having to read papers to get what I want.
time_simple_moving_averagesounds about right?