Added a default timeout to Tradfri observations #18497
alex3305 wants to merge 2 commits into home-assistant:dev from alex3305:tradfri-observe-reset
By default, the IKEA Tradfri devices were configured so that the observation of each individual device never timed out. These observations are used to check the current state of the device. However, in my experience, with an infinite observation there is a fair chance that devices stop responding from the UI. This seems to be caused by a race condition somewhere in the async code of Home Assistant or pytradfri, or possibly even in the underlying aiocoap library. Setting states through automations still seems to work. After debugging for a couple of hours without finding the root cause, I figured a workaround was the least I could contribute. This at least partially solves #9822 and #14386 and is almost identical to the proposal of @max-te, but with a less frequent observation timeout.
|
Apparently Travis failed because I forgot to commit a variable. I fixed that with an amended commit, but unfortunately Travis did not pick it up. Is there any way to trigger Travis again? Also, the code that Travis fails on seems to be unrelated.
|
It's a shame there are some oddities with the gateway and CoAP. When the connection dies, the observation should end and be restarted automatically. While I have never experienced this issue on my network, with my gateway, I understand that others do have issues with holding persistent connections to the gateway. There have been talks about adding support for a heartbeat in aiocoap in the future, but I'm not sure where that ended up.
|
This differs from my workaround in that you don't call something like |
|
@lwis I've not tested this extensively yet. I was hoping to see multiple people testing this. The @max-te In Small edit @lwis I saw that you wrote most of the |
|
At the time I was under the impression that |
Amends #18497 with an additional call to `loop.call_later`. After a little more research on this PR, @max-te and I saw that `err_callback` wasn't called when the set duration timed out. Maybe this can be fixed in pytradfri, or, as @lwis suggested, there probably should be a heartbeat in pytradfri to prevent this kind of behaviour. Although this workaround works, it can also cause a bit of a stack leak with Tradfri devices. But since the timeout is set to 1 hour by default, this shouldn't be much of an issue.
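The re-arm pattern described in this commit message can be illustrated with a small, self-contained asyncio sketch. This is not the code from the PR: `FakeDevice`, `_observe`, `start_observe` and `stop` are hypothetical stand-ins, and the real observe command sent through pytradfri is replaced by a plain `asyncio.sleep`. The only real API used is `loop.call_later` and `loop.create_task`. Keeping the returned timer handle makes it possible to cancel the pending re-arm, which is one way to limit the leak mentioned above.

```python
import asyncio

# Hypothetical value for the demo; the PR uses 3600 seconds.
OBSERVE_TIMEOUT = 0.2


class FakeDevice:
    """Stand-in for a Tradfri entity (assumption, not pytradfri API)."""

    def __init__(self, loop, timeout):
        self._loop = loop
        self._timeout = timeout
        self._handle = None      # pending call_later timer, if any
        self._tasks = set()      # keep references so tasks are not collected
        self.observe_count = 0

    async def _observe(self):
        # Placeholder for the real observe command with a finite duration.
        self.observe_count += 1
        await asyncio.sleep(self._timeout)

    def start_observe(self):
        task = self._loop.create_task(self._observe())
        self._tasks.add(task)
        task.add_done_callback(self._tasks.discard)
        # Schedule the next observation shortly before this one expires,
        # because the error callback cannot be relied on to fire.
        self._handle = self._loop.call_later(
            self._timeout * 0.9, self.start_observe)

    def stop(self):
        # Cancel the pending re-arm so no further observations are started.
        if self._handle is not None:
            self._handle.cancel()
            self._handle = None


async def demo():
    loop = asyncio.get_running_loop()
    device = FakeDevice(loop, OBSERVE_TIMEOUT)
    device.start_observe()
    await asyncio.sleep(OBSERVE_TIMEOUT * 2.5)
    device.stop()
    return device.observe_count
```

Running `asyncio.run(demo())` starts the observation several times during the demo window; after `stop()` no new observation is scheduled.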
-     duration=0)
+     duration=DEFAULT_OBSERVE_TIMEOUT)
      self.hass.async_create_task(self._api(cmd))
+     self.hass.loop.call_later(DEFAULT_OBSERVE_TIMEOUT - 1,
|
Can you make this configurable with the default set to 0?
|
@lwis I cannot figure out how to get the configurable value working. With all the async tasks being passed around and the separate classes, it's quite hard to wrap my head around it. Can you give me any pointers?
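One possible shape for a configurable timeout is to read the value once during setup and hand it to each entity through its constructor, so it never has to travel through the async callbacks. The sketch below is plain Python under stated assumptions: `CONF_OBSERVE_TIMEOUT`, `get_observe_timeout`, `TradfriEntitySketch` and `setup_entities` are hypothetical names for illustration, not Home Assistant or pytradfri API, and the config is modelled as an ordinary dict.

```python
# Hypothetical option name; not an existing Home Assistant setting.
CONF_OBSERVE_TIMEOUT = 'observe_timeout'
# 0 keeps the previous behaviour: the observation never times out.
DEFAULT_OBSERVE_TIMEOUT = 0


def get_observe_timeout(config):
    """Return the timeout in seconds from a config dict, defaulting to 0."""
    timeout = int(config.get(CONF_OBSERVE_TIMEOUT, DEFAULT_OBSERVE_TIMEOUT))
    if timeout < 0:
        raise ValueError('observe timeout must be 0 or a positive integer')
    return timeout


class TradfriEntitySketch:
    """Stand-in entity that stores the timeout it was constructed with."""

    def __init__(self, name, observe_timeout):
        self.name = name
        self.observe_timeout = observe_timeout


def setup_entities(config, names):
    """Read the option once and pass it to every entity."""
    timeout = get_observe_timeout(config)
    return [TradfriEntitySketch(name, timeout) for name in names]
```

For example, `setup_entities({'observe_timeout': 3600}, ['lamp'])` gives an entity with a one-hour timeout, while an empty config leaves the timeout at 0.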
  KEY_API = 'tradfri_api'
  CONF_ALLOW_TRADFRI_GROUPS = 'allow_tradfri_groups'
  DEFAULT_ALLOW_TRADFRI_GROUPS = False
+ DEFAULT_OBSERVE_TIMEOUT = 3600  # Set default timeout to 1 hour in seconds
The name suggests that it is a default for an option. Call it `TIMEOUT_OBSERVE`.
I'm looking into it, along with @lwis's suggestion to make it configurable.
But thanks for the suggestion.
|
I will close this PR, because I want to wait for home-assistant-libs/pytradfri#208 to be merged and possibly released. That will at least make Tradfri more stable regarding updates and observations.