Replace asyncio.wait with asyncio.gather since wait ignores exceptions #33380
balloob merged 13 commits into home-assistant:dev
  if tasks:
-     await asyncio.wait(tasks)
+     await asyncio.gather(*tasks)
We don't want this, because we don't want a failure in the setup of one platform to cancel the others. The same goes for the change to reset. We still want to log the errors.
In all proposed changes here, it's ok to just log the errors and continue.
That makes sense, but it brings up a few questions and suggestions:
- If we don't want this specific exception to be caught, perhaps we should just log instead of throwing it? Is there anywhere that we do catch it and do something with it?
- If my change passed all the unit tests but still breaks functionality (we don't want the setup of one platform to cancel the others), perhaps we should add a unit test that covers this scenario. I don't know enough about the startup process of hass to write it, but it sounds like a simple test that would prevent others from making my mistake :)
- In any case, even if we cancel this exception, it is probably better to change wait(tasks) to gather(*tasks) so the exceptions don't disappear, and to catch them in the calling method. I tried looking, and it seems like _async_set_up_integrations in bootstrap.py may be the right place, but I'm not sure.
What do you think is the best behavior? Maybe something like:
- During startup, we log the exceptions, as we don't want to fail the other components.
- When an integration is loaded dynamically, leave the exception so it fails the load.
As mentioned, I am pretty ignorant about the startup process, so hopefully this makes some sense...
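For readers following along, the difference between wait and gather discussed above can be sketched in isolation. This is a minimal standalone example, not code from the PR; the coroutine names `ok` and `boom` are made up for illustration.

```python
import asyncio


async def ok():
    return "ok"


async def boom():
    raise ValueError("setup failed")


async def with_wait():
    tasks = [asyncio.ensure_future(ok()), asyncio.ensure_future(boom())]
    await asyncio.wait(tasks)
    # No exception is raised by wait itself; it stays stored on the task
    # until someone asks for it explicitly.
    return [task.exception() for task in tasks]


async def with_gather():
    try:
        await asyncio.gather(ok(), boom())
    except ValueError as err:
        # gather re-raises the first exception in the awaiting caller.
        return err
    return None


wait_result = asyncio.run(with_wait())
gather_result = asyncio.run(with_gather())
```

With wait, the caller only learns about the failure by inspecting each task; with gather (and the default `return_exceptions=False`), the first exception surfaces at the `await`.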
- We should log it in this method where we change from wait to gather. Gather can return exceptions.
- Yeah that's bad. I'll add one.
- Yeah
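The "gather can return exceptions" point refers to the `return_exceptions=True` argument. A rough sketch of logging in the same method that calls gather might look like the following; `setup_platform` and `setup_all` are hypothetical stand-ins, not Home Assistant functions.

```python
import asyncio
import logging

_LOGGER = logging.getLogger(__name__)


async def setup_platform(name, fail=False):
    # Hypothetical platform setup; fails on request to simulate an error.
    if fail:
        raise RuntimeError(f"{name} failed")
    return name


async def setup_all(platforms):
    names = list(platforms)
    # With return_exceptions=True, exceptions come back as results,
    # so one failing platform does not stop the others from finishing.
    results = await asyncio.gather(
        *(setup_platform(name, platforms[name]) for name in names),
        return_exceptions=True,
    )
    loaded = []
    for name, result in zip(names, results):
        if isinstance(result, Exception):
            _LOGGER.error(
                "Error setting up %s",
                name,
                exc_info=(type(result), result, result.__traceback__),
            )
        else:
            loaded.append(name)
    return loaded


loaded = asyncio.run(setup_all({"light": False, "switch": True, "sensor": False}))
```

Here the failing platform is logged and skipped while the remaining ones are still reported as loaded.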
OK, I just added the changes to _async_set_up_integrations.
I assume CORE_INTEGRATIONS don't need to be checked since, if they fail, we fail anyway, right?
Also, the same issue exists with async_add_entities with duplicate IDs: an exception gets thrown and never gets caught. This is what fails the mqtt tests.
Was actually just looking at this to add the tests (and got distracted, obviously), but here is a quick update:
I agree. Will make the change tomorrow.
OK. Made the change in entity_platform so it doesn't throw an exception on a duplicate unique_id and simply logs and returns.
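The "log and return" behavior for a duplicate unique_id could look roughly like this. It is a simplified sketch: `TinyPlatform` is a hypothetical class, not Home Assistant's entity platform, and the boolean return value is an assumption added so the outcome is easy to observe.

```python
import logging

_LOGGER = logging.getLogger(__name__)


class TinyPlatform:
    """Hypothetical stand-in for an entity platform (illustration only)."""

    def __init__(self):
        self.entities = {}

    def add_entity(self, unique_id, entity):
        if unique_id in self.entities:
            # Log and return instead of raising, so the caller is not
            # left with an exception that nobody retrieves.
            _LOGGER.error("Duplicate unique_id %s, ignoring %s", unique_id, entity)
            return False
        self.entities[unique_id] = entity
        return True


platform = TinyPlatform()
first = platform.add_entity("abc", "light.one")
second = platform.add_entity("abc", "light.two")
```

The second call is rejected and logged, and the first entity stays registered.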
fix for test_entity_platform so it expects the exception
…with log and return reverted test_entity_platform to its original since now there is no exception thrown
…lly fail in the CI
    )
else:
    _LOGGER.error(
        "Error setting up integration %s - returned False", domain
The entity platform is already dealing with return value of False
ok, so just remove the log in that case?
await asyncio.wait(futures.values())
errors = [domain for domain in domains if futures[domain].exception()]
for domain in errors:
    _LOGGER.error(
For this one, we have set up async_setup_component in such a way that we already wrap every piece of code that we call from an integration with try…except. So if any of this raises, it's truly unexpected, and we should make sure we pass the exception as exc_info so we can print a stack trace.
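As an illustration of that suggestion, the pattern of waiting on a dict of futures and then logging each stored exception with a stack trace can be sketched as below. `fake_setup` and `set_up` are invented names for this example; only the wait/exception()/exc_info calls are standard library APIs.

```python
import asyncio
import logging

_LOGGER = logging.getLogger(__name__)


async def fake_setup(domain):
    # Hypothetical setup coroutine that fails for one domain.
    if domain == "broken":
        raise RuntimeError("unexpected failure")
    return True


async def set_up(domains):
    futures = {domain: asyncio.ensure_future(fake_setup(domain)) for domain in domains}
    await asyncio.wait(list(futures.values()))
    failed = []
    for domain, future in futures.items():
        exception = future.exception()
        if exception is None:
            continue
        failed.append(domain)
        # Passing the exception as exc_info makes the logger print its traceback.
        _LOGGER.error(
            "Error setting up integration %s - received exception",
            domain,
            exc_info=(type(exception), exception, exception.__traceback__),
        )
    return failed


failed = asyncio.run(set_up(["light", "broken"]))
```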
    "Error setting up integration %s - received exception",
    domain,
    futures[domain].exception(),
    **kwargs,  # type: ignore
Why do it like this and not use exc_info=(type(exception), exception, exception.__traceback__)?
I just copied it from the default exception handler in core.py, but you are correct. I made the change. Do you want me to change the one in core.py as well?
BTW, if I understand the Python behavior correctly, the one in core.py can also be replaced by exc_info=True, since it runs while the exception is being handled, but I'm not sure, as I am new to Python.
Yes, if we're within an except block, we should let the logger get it itself.
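The two exc_info forms discussed in this thread can be compared side by side. This is a self-contained sketch using only the standard logging module; the logger name `exc_info_demo` and the `fail` helper are made up for the example.

```python
import io
import logging

stream = io.StringIO()
logger = logging.getLogger("exc_info_demo")
logger.addHandler(logging.StreamHandler(stream))
logger.propagate = False


def fail():
    raise ValueError("boom")


# Inside an except block, exc_info=True lets the logger fetch the
# exception that is currently being handled.
try:
    fail()
except ValueError:
    logger.error("inside except block", exc_info=True)

# Outside an except block there is no active exception, so the
# exception object has to be passed explicitly.
saved = None
try:
    fail()
except ValueError as err:
    saved = err

logger.error(
    "outside except block",
    exc_info=(type(saved), saved, saved.__traceback__),
)

output = stream.getvalue()
```

Both calls end up printing a traceback; the explicit tuple is only needed when the exception is retrieved later, for example from a finished future.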
Thanks! One thing that is still puzzling me is that some tests (at least some of the dyson tests and the localfile test) fail on the CI if I remove them from the ignore list, yet they run fine locally, both with pytest and tox.
On CI we run things in parallel with
Ok. I find it failing quite predictably in the CI on specific tests, and I wasn't able to get them to fail locally. I opened an issue (#33504) to try to understand, because it happens consistently. Is anything else different? In general, the tests run independently of each other, as each one creates a new hass fixture, right?
CI VMs can be slow, so time differences can be greater, which can matter when playing with time, e.g. not patching time correctly and firing time changed events.
fix for test_entity_platform so it expects the exception
Breaking change
Not a breaking change
Proposed change
Fix for the helpers/test_entity_platform exceptions
Replaced asyncio.wait with asyncio.gather, since wait doesn't propagate exceptions to the waiting function. With gather in place, a duplicate unique_id triggers an exception that the test now catches.
Not my code, but just a suggestion