Skip to content

Make typing checks more strict#14429

Merged
andrey-git merged 13 commits intohome-assistant:devfrom
andrey-git:typing
Jul 13, 2018
Merged

Make typing checks more strict#14429
andrey-git merged 13 commits intohome-assistant:devfrom
andrey-git:typing

Conversation

@andrey-git
Copy link
Copy Markdown
Contributor

@andrey-git andrey-git commented May 13, 2018

Description:

Make typing checks more strict: add --strict-optional flag that forbids implicit None return type. This flag will become default in the next version of mypy (0.600)

Add homeassistant/util/ to checked dirs.

Checklist:

  • The code change is tested and works locally.
  • Local tests pass with tox. Your PR cannot be merged unless tests pass

@andrey-git andrey-git requested a review from a team as a code owner May 13, 2018 05:42
@homeassistant homeassistant added cla-signed core small-pr PRs with less than 30 lines. labels May 13, 2018
Copy link
Copy Markdown
Member

@MartinHjelmare MartinHjelmare left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. One comment.

# Typing imports
# Typing imports that create a circular dependency
# pylint: disable=using-constant-test,unused-import
if False:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@fabaff fabaff mentioned this pull request May 13, 2018
2 tasks
return yaml.load(conf_file, Loader=SafeLineLoader) or OrderedDict()
except yaml.YAMLError as exc:
_LOGGER.error(exc)
_LOGGER.error(str(exc))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exceptions have a __str__ magic method so this shouldn't be needed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mypy complains that it is expecting str and got an Exception. Rather than add # type: ignore I think an explicit cast is more readable.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think mypy is wrong in this case. I'd prefer the ignore, but you decide, 🤷‍♂️.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I am leaning towards str



def load_json(filename: str, default: Union[List, Dict] = _UNDEFINED) \
def load_json(filename: str, default: Union[List, Dict, int] = _UNDEFINED) \
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it better to use _UNDEFINED than None?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to None

return "1 %s" % unit
elif number > 1:
return "%d %ss" % (number, unit)
return "%d %ss" % (number, unit)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use new style string formatting instead. https://pyformat.info/#number

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Copy Markdown
Member

@pvizeli pvizeli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Return None is a PEP 8 convention and checked by pylint. Strange that is not type conform.

try:
secrets = load_yaml(secret_path)
if not isinstance(secrets, dict):
raise HomeAssistantError
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need a test

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@andrey-git
Copy link
Copy Markdown
Contributor Author

Addressed all comments.

async def wrapper() -> Any:
"""Wrapper coroutine around a callback."""
return target(*args)
task = self.loop.create_task(wrapper())
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't wrap it into a slow coroutine.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any suggestions how to properly do that?

  1. If the caller knows that the target is a @callback and is not afraid of exceptions - they can call it directly.

  2. If the caller knows that the target is a @callback, they are afraid of exceptions and don't care about return value - they can call loop.call_soon manually (i.e. the previous implementation)

  3. If the caller knows that the target is a @callback, they are afraid of exceptions and they want the return value - the old implementation didn't provide that, the new one does.

  4. If the caller doesn't know what target is and they need the return value - same as (3)

  5. If the caller doesn't know what target is and they don't need the return value - I agree that the old implementation is better, but it would be also better for coro not to be wrapped in task for such a case.

Generally having a single function with different return types is bad.

Copy link
Copy Markdown
Member

@pvizeli pvizeli May 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally having a single function with different return types is bad.

For typing, yes but not for python itself

This change is very bad for the core and extend the memory footprint. We should hold the core slim and don't blow it up for a not exists type checking. Python is not strict and I hope that will be change in 4.x like c++ but actually I see no reason to make this bad change. In that case, I dislike this.

they can call loop.call_soon manually

And no, user should use our internal job handler or events stuff and not use the raw asyncio functions.

And the real solution is to remove @callback and make all that coreroutine. But it is good like it is now.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I would say that it's kinda a miracle that not returning anything has never caused any issues. The majority of the callbacks in use are actually being used inside helpers/event.py. There we call hass.async_run_job which will run callbacks right away and enqueue the other types of functions.

Let's measure how much overhead this adds with our benchmark module.

hass --script benchmark async_million_state_changed_helper

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On dev

› hass --script benchmark async_million_state_changed_helper
Using event loop: asyncio.unix_events
Benchmark async_million_state_changed_helper done in 4.237771602987777s
Benchmark async_million_state_changed_helper done in 4.250693870999385s
Benchmark async_million_state_changed_helper done in 4.256614304991672s
Benchmark async_million_state_changed_helper done in 4.24649005298852s

With this PR:

› hass --script benchmark async_million_state_changed_helper
Using event loop: asyncio.unix_events
Benchmark async_million_state_changed_helper done in 5.726596449007047s
Benchmark async_million_state_changed_helper done in 5.9516269309970085s
Benchmark async_million_state_changed_helper done in 5.672646453007474s
Benchmark async_million_state_changed_helper done in 5.7938863370072795s

That's just a quick check of running it a couple of times. That's a significant slowdown.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(this converted both the listener inside the benchmark as the function defined inside async_track_state_change to async functions)

Copy link
Copy Markdown
Member

@balloob balloob May 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interestingly, if I just convert the function inside track_state_change to be async, I get back to ~5.5 seconds per run

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a proper solution is to make 5 new (type-safe) functions for all the usecases I mentioned above, then convert the callers and then delete the functions that are not actually used

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with that approach.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. That is python and not type safe. To make the code complex for nothing is not realy cool. If that is a c++ project, I would agree but not with python...

@andrey-git andrey-git changed the title Make typing checks more strict [WIP] Make typing checks more strict May 27, 2018
@ghost ghost assigned andrey-git Jul 9, 2018
@ghost ghost added the in progress label Jul 9, 2018
@andrey-git andrey-git changed the title [WIP] Make typing checks more strict Make typing checks more strict Jul 10, 2018
@andrey-git
Copy link
Copy Markdown
Contributor Author

I created hass.async_create_task that accepts coroutine and coroutinefunction only and returns scheduled task.

Fixed core callers to use the new function or async_add_executor_job whatever is appropriate.

@pvizeli
Copy link
Copy Markdown
Member

pvizeli commented Jul 10, 2018

So we have a hass.async_create_task and a hass.async_add_job. If we want stay correct, because the only different is that one can callbacks and the other can't, you should rename hass.async_add_job to hass.async_add_callback and support only callbacks.

async_add_executor_job is okay, I do this also in other project and help to read the code.

Anyway, that give a bad mixup. The benefit of this function was a simple caller function for developers, they don't know what he do, and the core make the correct one. As I replace the thread poll and create this function with magic of python, I never want to this splitup. Now it's a bad and wired situation. And the benefit from all? We are typesafe in a none type safe language 👍 contratulation.

All language they I known can handle a function like hass.async_add_job but python should now not able anymore to do this like other language while some one written a PEB...

@balloob
Copy link
Copy Markdown
Member

balloob commented Jul 11, 2018

asyncio itself does not offer a good system to keep track of what is coming from where. We often get reports from random stack traces that lack context and instead point at the task running code inside the event loop. Which task spawned the task that just blew up? And what event caused any of these tasks to start to begin with? (support for this will get significantly better in Python 3.7 with context vars)

By having created a catch all function, we initially made it even worse knowing what came from where or know what is happening by reading the code. It was needed to make the transition to asyncio from a sync world easier (or so I thought with my limited asyncio experience).

We want to be able to track jobs so that we can block until all work is done during startup, shutdown and tests. I think that adding async_create_task is a good idea. However, it should have the same API as loop.create_task() with as added functionality the tracking of jobs.

This is a great article about asyncio and correctness of spawned tasks that I read recently. It made me realize that switching to specific tasks is a good thing. I think that

  • we can replace ~90% of calls to async_add_job with async_add_executor_job.
  • From the leftover 10%, some will be async_create_task, where we want to spawn a task outside of our current context.
  • The final bit of functions is inside the event helpers where we don't know what type of listener function is coming in. I think that we can start enforcing some rules where the sync helpers only accept sync methods and the async helpers async methods or callbacks. This will also allow us to remove async_run_job. All work to deal with @callback can live inside the event helpers.
  • Now we can remove async_add_job.

About typing: I am a fan of adding types to the core. Todays benefit is checking if code is correct and better autocomplete in editors. However, in the future I expect it to help compilers to be able to better optimize our code and allow it to run faster.

@andrey-git
Copy link
Copy Markdown
Contributor Author

@balloob So what are the action items for this PR? I would like to finish making core type safe before refactoring async_add_job callers (if at all)

@balloob
Copy link
Copy Markdown
Member

balloob commented Jul 11, 2018

Let's make async_create_task only accept a single argument: a coroutine. (like the event loop API it wraps). After that I think this PR is ok to go.

async_add_job is called btw 662 times in our code. Let's get that to 0 ! 👍

@MartinHjelmare
Copy link
Copy Markdown
Member

We should start with updating the developer docs.


target: target to call.
"""
if asyncio.iscoroutine(target):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can drop this check.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

raise ValueError(
"async_create_task can be called on coroutine only.")

# If a task is scheduled
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

# If we're tracking tasks or one can drop the comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

else:
data = await self.hass.async_add_executor_job(
json.load_json, self.path, None)
json.load_json, self.path, [])
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not technically the same. If we would store an empty list or dictionary in JSON, we would now treat it as no config file being available.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to {} which is the default value.

Any config is expected to have version attribute in the line below.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point

def utc_from_timestamp(timestamp: float) -> dt.datetime:
"""Return a UTC time from a timestamp."""
return dt.datetime.utcfromtimestamp(timestamp).replace(tzinfo=UTC)
return UTC.localize(dt.datetime.utcfromtimestamp(timestamp))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the same? Is a non timezone aware date that is being localized treated as having been UTC time? Or would it treat it as local time?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is the implementation of UTC.localize

    def localize(self, dt, is_dst=False):
        '''Convert naive time to local time'''
        if dt.tzinfo is not None:
            raise ValueError('Not naive datetime (tzinfo is already set)')
        return dt.replace(tzinfo=self)

_LOGGER.exception('JSON file reading failed: %s', filename)
raise HomeAssistantError(error)
return {} if default is _UNDEFINED else default
return {} if default is None else default
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This not the same. By removing the sentinel value, it is no longer possible to have None be the default value.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There were no uses of None as default value except the place I changed in storage.py

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was no use yet, as the storage helper is very new. I don't think that we should change this.

Copy link
Copy Markdown
Contributor Author

@andrey-git andrey-git Jul 12, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The storage helper was the lone use. I meant that no one else used None for load_json

By using sentinel value the function becomes not type-safe, as it accepts object and thus could return object if one were to pass it as default value.

Another option is to have {} as the default value and ignore that warning that generates with a comment that we are not changing it, nor do we return it as-is for the caller to modify.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't feel too strongly I guess.

Any place that was using util.json or util.yaml needs to be rewritten to use the storage helper anyway.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, you could also make the sentinal value an empty dict, that way you can keep it all as dict. Comparison is still done with is so will work the same.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will generate a warning that the default value is mutable.

Lets keep it None

@andrey-git andrey-git merged commit c2fe0d0 into home-assistant:dev Jul 13, 2018
@ghost ghost removed the in progress label Jul 13, 2018
@andrey-git andrey-git deleted the typing branch July 13, 2018 10:24
@balloob balloob mentioned this pull request Jul 13, 2018
awarecan pushed a commit to awarecan/home-assistant that referenced this pull request Jul 16, 2018
## Description:

Make typing checks more strict: add `--strict-optional` flag that forbids implicit None return type. This flag will become default in the next version of mypy (0.600)

Add `homeassistant/util/` to checked dirs.

## Checklist:
  - [x] The code change is tested and works locally.
  - [x] Local tests pass with `tox`. **Your PR cannot be merged unless tests pass**
@balloob balloob mentioned this pull request Jul 20, 2018
michaeldavie pushed a commit to michaeldavie/home-assistant that referenced this pull request Jul 31, 2018
## Description:

Make typing checks more strict: add `--strict-optional` flag that forbids implicit None return type. This flag will become default in the next version of mypy (0.600)

Add `homeassistant/util/` to checked dirs.

## Checklist:
  - [x] The code change is tested and works locally.
  - [x] Local tests pass with `tox`. **Your PR cannot be merged unless tests pass**
girlpunk pushed a commit to girlpunk/home-assistant that referenced this pull request Sep 4, 2018
## Description:

Make typing checks more strict: add `--strict-optional` flag that forbids implicit None return type. This flag will become default in the next version of mypy (0.600)

Add `homeassistant/util/` to checked dirs.

## Checklist:
  - [x] The code change is tested and works locally.
  - [x] Local tests pass with `tox`. **Your PR cannot be merged unless tests pass**
@home-assistant home-assistant locked and limited conversation to collaborators Dec 10, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

cla-signed core small-pr PRs with less than 30 lines.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants