Suppress roborock failures under some unavailability threshold #158673
We aggressively refresh the local channel of Roborock devices (every 30 seconds), and there is a known issue where devices go unavailable around 3am every day for about 1 minute, which causes log spam during a non-critical background refresh. Instead, we will suppress refresh failures until a minimum unavailability threshold has passed.
Hey there @Lash-L, mind taking a look at this pull request as it has been labeled with an integration?
Pull request overview
This PR addresses a known issue where Roborock devices temporarily go offline around 3am for approximately 1 minute, causing log spam during routine background refreshes. The solution implements an unavailability threshold that suppresses reporting update failures until 2 minutes of consecutive failures have occurred.
Key changes:
- Added time-based failure suppression logic to avoid marking devices as unavailable during brief outages
- Improved exception handling by changing `HomeAssistantError` to `UpdateFailed` in `update_map()` for consistency
- Added comprehensive test coverage for the unavailability threshold behavior
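The suppression logic summarized in the key changes can be sketched roughly as follows. This is a minimal standalone illustration, not the code in `coordinator.py`: the `FailureSuppressor` class, its `refresh` method, and the `UNAVAILABILITY_THRESHOLD` name are hypothetical, and `UpdateFailed` here is a local stand-in for the Home Assistant exception. Only the 2-minute value is taken from the summary.

```python
from datetime import datetime, timedelta

# Hypothetical constant name; the 2-minute value comes from the review summary.
UNAVAILABILITY_THRESHOLD = timedelta(minutes=2)


class UpdateFailed(Exception):
    """Local stand-in for Home Assistant's UpdateFailed exception."""


class FailureSuppressor:
    """Hide refresh failures until the device has been failing long enough."""

    def __init__(self, now: datetime) -> None:
        self._last_success = now  # time of the most recent successful refresh
        self._last_data = None    # data returned by that refresh

    def refresh(self, fetch, now: datetime):
        """Call fetch(); on failure, reuse the old data while under the threshold."""
        try:
            data = fetch()
        except UpdateFailed:
            if now - self._last_success < UNAVAILABILITY_THRESHOLD:
                # Brief outage: keep the previous data and stay "available".
                return self._last_data
            # Outage has lasted too long: let the failure surface.
            raise
        self._last_success = now
        self._last_data = data
        return data
```

A failing refresh 90 seconds after the last success returns the previous data, while one at 150 seconds re-raises `UpdateFailed`. Whether the real coordinator compares with `<` or `<=` at exactly 2 minutes is not stated in the summary.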
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| homeassistant/components/roborock/coordinator.py | Implements unavailability threshold by tracking last successful update time and suppressing failures under 2 minutes; changes update_map() to raise UpdateFailed instead of HomeAssistantError for consistency |
| tests/components/roborock/test_init.py | Adds comprehensive test that verifies entities remain available during short failure periods (90 seconds), become unavailable after exceeding threshold (4.5 minutes total), and recover when updates resume |
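To see why a 90-second outage stays hidden while a longer one does not, the polling timeline can be simulated. This sketch does not reproduce the test in `test_init.py`. It assumes the last successful refresh was at t=0, that every 30-second poll during the outage fails, and that a failure is surfaced once at least 120 seconds have elapsed. The function and constant names are hypothetical.

```python
POLL_INTERVAL_S = 30  # refresh cadence from the PR description
THRESHOLD_S = 120     # 2-minute threshold from the review summary


def reported_failures(outage_s: int) -> list[int]:
    """Return poll times (seconds since the last success) whose failure is surfaced."""
    # Polls at 30, 60, ... up to and including outage_s all fail.
    polls = range(POLL_INTERVAL_S, outage_s + 1, POLL_INTERVAL_S)
    # Only failures at or past the threshold are reported (assumed comparison).
    return [t for t in polls if t >= THRESHOLD_S]


short = reported_failures(90)   # three failed polls, none reported
long = reported_failures(270)   # failures reported from the 120 s poll onward
```

Under these assumptions, the roughly 1-minute nightly outage produces only two or three failed polls, all below the threshold, so nothing is logged.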
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Should this be picked for a patch?
See the PR description; I was expecting to wait on this, but maybe I'm being too conservative.
Oh, sorry, I missed the description 😓 Sorry
Proposed change
Suppress roborock failures under some unavailability threshold to handle the 3am unavailability issue.
Other members: Very happy to have review/approval, but I would request that you let me merge this myself. Also, I would like to target this for 2025.1.x, and not a patch release, since it is a long-standing issue.