Skip to content

Replace asyncio.wait with asyncio.gather since wait ignores exceptions#33380

Merged
balloob merged 13 commits intohome-assistant:devfrom
ziv1234:fix_2
Apr 1, 2020
Merged

Replace asyncio.wait with asyncio.gather since wait ignores exceptions#33380
balloob merged 13 commits intohome-assistant:devfrom
ziv1234:fix_2

Conversation

@ziv1234
Copy link
Copy Markdown
Contributor

@ziv1234 ziv1234 commented Mar 28, 2020

fix for test_entity_platform so it expects the exception

Breaking change

Not a breaking change

Proposed change

Fix for the helpers/test_entity_platform exceptions
Replaced asyncio.wait with asyncio.gather since wait doesn't propagate the exceptions to the waiting function. Once there, a duplicate unique_id triggers an exception that the test now catches.
Not my code, but just a suggestion

Type of change

  • Dependency upgrade
  • Bugfix (non-breaking change which fixes an issue)
  • New integration (thank you!)
  • New feature (which adds functionality to an existing integration)
  • Breaking change (fix/feature causing existing functionality to break)
  • Code quality improvements to existing code or addition of tests

Example entry for configuration.yaml:

# Example configuration.yaml

Additional information

  • This PR fixes or closes issue: fixes #
  • This PR is related to issue:
  • Link to documentation pull request:

Checklist

  • The code change is tested and works locally.
  • Local tests pass. Your PR cannot be merged unless tests pass
  • There is no commented out code in this PR.
  • I have followed the development checklist
  • The code has been formatted using Black (black --fast homeassistant tests)
  • Tests have been added to verify that the new code works.

If user exposed functionality or configuration variables are added/changed:

If the code communicates with devices, web services, or third-party tools:

  • The manifest file has all fields filled out correctly.
    Updated and included derived files by running: python3 -m script.hassfest.
  • New or updated dependencies have been added to requirements_all.txt.
    Updated by running python3 -m script.gen_requirements_all.
  • Untested files have been added to .coveragerc.

The integration reached or maintains the following Integration Quality Scale:

  • No score or internal
  • 🥈 Silver
  • 🥇 Gold
  • 🏆 Platinum


if tasks:
await asyncio.wait(tasks)
await asyncio.gather(*tasks)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't want this, because we don't want setup of 1 platform cancel the others. The same with the change for reset. We still want to log them.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In all proposed changes here, it's ok to just log the errors and continue.

Copy link
Copy Markdown
Contributor Author

@ziv1234 ziv1234 Mar 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that makes sense, but brings up a few questions / suggestions:

  1. If we don't want this specific exception to be caught, perhaps instead of throwing the exception, just logging? Is there anywhere that we do catch it and do something with it?
  2. If my change passed all the unit tests, but there is functionality that it breaks (we don't want setup of 1 platform cancel the others), perhaps we should add a unit tests that tests this scenario. I don't know enough about the startup process of hass to create this, but sounds like a simple test to prevent others from making my mistake :)
  3. In any case, even if we cancel this exception, probably better to change the wait(tasks) to gather(*tasks) so the exceptions don't disappear and catch them in the calling method. I tried looking and it seems like _async_set_up_integrations in bootstrap.py may be the right place, but not sure

What do you think is the best behavior? Maybe something like:

  • during startup, we log the exceptions as we don't want to fail the other components
  • when an integration is loaded dynamically, leave the exception so it would fail the load
    As mentioned, I am pretty ignorant about the startup process, so hopefully this makes some sense...

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. We should log it in this method where we change from wait to gather. Gather can return exceptions.
  2. Yeah that's bad. I'll add one.
  3. Yeah

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. just added the changes to _async_set_up_integrations
I assume CORE_INTEGRATIONS don't need to be checked as if they fail, we fail anyway, right?

@MartinHjelmare MartinHjelmare changed the title replace asyncio.wait with asyncio.gather since wait ignores exceptions Replace asyncio.wait with asyncio.gather since wait ignores exceptions Mar 29, 2020
@ziv1234
Copy link
Copy Markdown
Contributor Author

ziv1234 commented Mar 30, 2020

also, the same issue exists with async_add_entities with duplicate IDs. an exception gets thrown and never gets caught. This is the thing that fails the mqtt tests.
again, you guys know what the right behavior for the system, but if we don't want to propagate the exception, probably better to replace the exception with an error log.
Let me know what behavior is best and i can make the change.

@balloob
Copy link
Copy Markdown
Member

balloob commented Mar 30, 2020

Was actually just looking at this to add the tests (and got distracted obviously), but here is a quick update:

EntityPlatform should not raise HomeAssistantError on malformed or duplicate entity ID but instead should print an error and ignore entity.

@ziv1234
Copy link
Copy Markdown
Contributor Author

ziv1234 commented Mar 31, 2020

Was actually just looking at this to add the tests (and got distracted obviously), but here is a quick update:

EntityPlatform should not raise HomeAssistantError on malformed or duplicate entity ID but instead should print an error and ignore entity.

I agree. Will make the change tomorrow

@ziv1234
Copy link
Copy Markdown
Contributor Author

ziv1234 commented Mar 31, 2020

ok. made the change in entity_platform so it doesn't throw exception on duplicate unique_id and simply logs and returns.
I edited the list of ignored_exceptions, but the problem is that there are several tests that fail in the CI and succeed locally. Do you know why this can happen?
Hopefully I finally got the list right, but concerned about tests passing locally and not the CI as I try to not push a commit that fails tests...

Comment thread homeassistant/bootstrap.py Outdated
Comment thread homeassistant/helpers/entity_platform.py
Comment thread homeassistant/bootstrap.py Outdated
)
else:
_LOGGER.error(
"Error setting up integration %s - returned False", domain
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The entity platform is already dealing with return value of False

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, so just remove the log in that case?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

await asyncio.wait(futures.values())
errors = [domain for domain in domains if futures[domain].exception()]
for domain in errors:
_LOGGER.error(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this one, we have set up async_setup_component in such a way that we already wrap every piece of code that we call from an integration with try…catch. So if any of this raises, it's truly unexpected. So we should make sure we pass the exception as exc_info so we can print a stack trace

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. done

Comment thread homeassistant/bootstrap.py Outdated
"Error setting up integration %s - received exception",
domain,
futures[domain].exception(),
**kwargs, # type: ignore
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do it like this and not do exc_info=(type(exception), exception, exception.__traceback__) ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i just copied it from the default exception handler in core.py, but you are correct. i made the change. do you want me to change the one in core.py as well?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, if i understand the python behavior correctly, the one in core.py can also be replaced by exc_info=True, as it happens during the context of the exception, but not sure as i am new to python

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if we're within a catch block, we should let the logger get it themselves.

@balloob balloob merged commit 4dbbf93 into home-assistant:dev Apr 1, 2020
@ziv1234
Copy link
Copy Markdown
Contributor Author

ziv1234 commented Apr 1, 2020

thanks! one thing that is still puzzling me is that there are some tests (at least some of dyson and the localfile test) that fail on the CI if i remove them from the ignore list and they run fine locally, both with pytest and tox.
Now that this has been merged, i can recreate in a separate "dummy" PR and open an issue, since it would be great if the local tests would simulate the CI as well as possible

@balloob
Copy link
Copy Markdown
Member

balloob commented Apr 1, 2020

On CI we run things in parallel with -n auto and based on the number of cores, it means the order of tests can be different, which might impact it.

@ziv1234
Copy link
Copy Markdown
Contributor Author

ziv1234 commented Apr 1, 2020

Ok. I find it failing quite predictably in the CI on specific tests and i wasn't able to get them to fail locally. I opened an issue (#33504) to try and understand because it happens consistently.

Anything else that is different? In general, the tests are usually running independent of each other as each one creates a new hass fixture, right?

@MartinHjelmare
Copy link
Copy Markdown
Member

CI VMs can be slow so time difference can be greater which can matter when playing with time, eg not patching time correctly and firing time changed events.

@lock lock Bot locked and limited conversation to collaborators Apr 2, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

cla-signed core small-pr PRs with less than 30 lines.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants