-
-
Notifications
You must be signed in to change notification settings - Fork 18.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TYP: use overload to refine return type of set_axis #40197
Conversation
@MarcoGorelli I haven't looked at what you did, but did you take a look at the pandas type stubs created by Microsoft. See https://github.com/microsoft/python-type-stubs/blob/main/pandas/core/frame.pyi and https://github.com/microsoft/python-type-stubs/blob/main/pandas/core/generic.pyi and maybe that helps? If those ideas don't work, then I can take a closer look. |
Thanks @Dr-Irv , that's really useful! OK, they've gone down the "lots of overloads" approach, which is probably better |
Updating with the import pandas as pd
reveal_type(pd.DataFrame([1,2,3]).set_axis([2,3,4], axis=0, inplace=True))
reveal_type(pd.DataFrame([1,2,3]).set_axis([2,3,4], axis=0, inplace=False))
reveal_type(pd.DataFrame([1,2,3]).set_axis([2,3,4], inplace=True))
reveal_type(pd.DataFrame([1,2,3]).set_axis([2,3,4], inplace=False))
reveal_type(pd.DataFrame([1,2,3]).set_axis([2,3,4], axis=0))
reveal_type(pd.DataFrame([1,2,3]).set_axis([2,3,4])) $ mypy t.py
t.py:5: note: Revealed type is 'None'
t.py:6: note: Revealed type is 'pandas.core.frame.DataFrame'
t.py:7: note: Revealed type is 'None'
t.py:8: note: Revealed type is 'pandas.core.frame.DataFrame'
t.py:9: note: Revealed type is 'pandas.core.frame.DataFrame'
t.py:10: note: Revealed type is 'pandas.core.frame.DataFrame'
t.py:11: note: Revealed type is 'Union[pandas.core.frame.DataFrame, None]'
t.py:12: note: Revealed type is 'Union[pandas.core.frame.DataFrame, None]' EDITMore thorough: # t.py
import pandas as pd
inplace: bool
reveal_type(pd.DataFrame([1,2,3]).reset_index(level=0))
reveal_type(pd.DataFrame([1,2,3]).reset_index(inplace=False))
reveal_type(pd.DataFrame([1,2,3]).reset_index(drop=True, inplace=True))
reveal_type(pd.DataFrame([1,2,3]).reset_index(level=0, inplace=True))
reveal_type(pd.DataFrame([1,2,3]).reset_index(level=0, drop=False, inplace=True))
reveal_type(pd.DataFrame([1,2,3]).reset_index(level=0, inplace=True, col_fill='a'))
reveal_type(pd.DataFrame([1,2,3]).reset_index(level=0, col_fill='a'))
reveal_type(pd.DataFrame([1,2,3]).reset_index(inplace=True))
reveal_type(pd.DataFrame([1,2,3]).reset_index(inplace=inplace))
reveal_type(pd.DataFrame([1,2,3]).reset_index(drop=True, inplace=inplace))
reveal_type(pd.DataFrame([1,2,3]).reset_index(level=0, inplace=inplace))
reveal_type(pd.DataFrame([1,2,3]).reset_index(level=0, drop=False, inplace=inplace))
reveal_type(pd.DataFrame([1,2,3]).reset_index(level=0, inplace=inplace, col_fill='a'))
reveal_type(pd.DataFrame([1,2,3]).reset_index(inplace=inplace)) output: $ mypy t.py
t.py:5: note: Revealed type is 'pandas.core.frame.DataFrame'
t.py:6: note: Revealed type is 'pandas.core.frame.DataFrame'
t.py:7: note: Revealed type is 'None'
t.py:8: note: Revealed type is 'None'
t.py:9: note: Revealed type is 'None'
t.py:10: note: Revealed type is 'None'
t.py:11: note: Revealed type is 'pandas.core.frame.DataFrame'
t.py:12: note: Revealed type is 'None'
t.py:13: note: Revealed type is 'Union[pandas.core.frame.DataFrame, None]'
t.py:14: note: Revealed type is 'Union[pandas.core.frame.DataFrame, None]'
t.py:15: note: Revealed type is 'Union[pandas.core.frame.DataFrame, None]'
t.py:16: note: Revealed type is 'Union[pandas.core.frame.DataFrame, None]'
t.py:17: note: Revealed type is 'Union[pandas.core.frame.DataFrame, None]'
t.py:18: note: Revealed type is 'Union[pandas.core.frame.DataFrame, None]' |
This frustration is compounded with the methods being in frame.py and series.py in order to have specific docstrings maybe we could add the methods 'dynamically' as attributes calling the decorator functions instead of using them as function decorators to get the specific docstrings for the DataFrame and Series methods and then I think mypy would fall back to generic.py for the type definitions. Thanks @MarcoGorelli for working on this. can you also update to reinstate the function signature overloads and lose the duplicated overload (as discussed on other PR) edit: may need to use |
Hey @simonjayhawkins - I've reinstated the function signature types, not sure which overload you think is duplicated though (doesn't the discussion in the other PR show that in fact that overload is necessary?) here the last overload is even more necessary because in
|
sure. that comment is now outdated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @MarcoGorelli There doesn't seem much enthusiasm for #40202 so not sure how we are going to lock down these improvements or simplify the review process for overloads.
so lets get this in as is. merge when ready pending green
Sure, thanks @simonjayhawkins ! |
TYP: use overload to refine return type of set_axis (pandas-dev#40197)
* try typing set_axis * fixup overloads * remove return type from base definition * searching in the sun for another overload * bool * allow defaults in bool case * type non-overloaded arguments Co-authored-by: Simon Hawkins <[email protected]>
Trying to use overload so that mypy knows that the return type of
.set_axis
is depending on the value ofinplace
.This seems to work - if I have
then I get
which seems correct.
However, running
mypy pandas
throws some errors, which I've ignored (similarly to how this was done in #37137, though this time there's an extra error) and put the errors as comments. @simonjayhawkins @Dr-Irv sorry to ask for your help here, but do you know how to address this?Reading through python/mypy#6580 , my understanding is that we would need to provide exponentially many overloads for this to work without errors. However, by placing the
: Literal[False] = ...
overload first, then ties will be broken with that one and so the correct return type will be inferred (see the abovet.py
experiment)