feat: add Series|Expr.is_finite method #1341
return self._from_native_series(
    np.isfinite(self._native_series) & ~self._native_series.isna()
)
This is an opinionated choice: NA is treated as not finite.
🤔 Not sure, wouldn't we want to preserve null values?
The behavior differs depending on the pandas backend dtype. Let me come back with an example.
Hmmm, actually, for classical pandas types we wouldn't have the option of returning a nullable boolean (if we want to preserve the dtype backend).
🤔 Going to think about this a little longer.
These would be the outputs:
data = [float("nan"), float("inf"), 2.0, None]
s = pd.Series(data)
np.isfinite(s)
0 False
1 False
2 True
3 False
dtype: bool
np.isfinite(s.convert_dtypes(dtype_backend="numpy_nullable"))
0 <NA>
1 False
2 True
3 <NA>
dtype: boolean
np.isfinite(s.convert_dtypes(dtype_backend="pyarrow"))
0 False
1 False
2 True
3 False
dtype: bool
While for polars:
pl.Series(data).is_finite()
shape: (4,)
Series: '' [bool]
[
false
false
true
null
]
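The two candidate semantics discussed in this thread can be sketched in plain Python, independent of any dataframe library. This is only an illustration: both function names are hypothetical, `None` stands in for a null value, and the first function mimics the Polars output shown above while the second mimics the choice made in this PR.

```python
import math


def is_finite_propagating(values):
    # Nulls (None) stay null; NaN and infinities map to False.
    return [None if v is None else math.isfinite(v) for v in values]


def is_finite_null_as_false(values):
    # Nulls, NaN and infinities all map to False.
    return [v is not None and math.isfinite(v) for v in values]


data = [float("nan"), float("inf"), 2.0, None]
propagating = is_finite_propagating(data)
null_as_false = is_finite_null_as_false(data)
print(propagating)
print(null_as_false)
```

The only position where the two results differ is the null entry, which is exactly the point under discussion.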
What type of PR is this? (check all applicable)
Related issues
Series|Expr.is_finite #1297
Checklist
If you have comments or can explain your changes, please do so below.
As mentioned in the issue itself, pandas and dask treat NaNs and nulls as the same. Even worse, for the non-nullable backend np.isfinite returns False for them, while for nullable backends it returns <NA>. I made the opinionated choice to be consistent across the different pandas backends and always return False for nulls and NaNs. I hope the warning in the docstring is enough.
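A minimal sketch of that choice, assuming a classical (NumPy-backed) float Series. The helper name is hypothetical; the expression mirrors the one in the diff, applied to a plain pandas Series instead of the library's internal wrapper.

```python
import numpy as np
import pandas as pd


def is_finite_null_as_false(s: pd.Series) -> pd.Series:
    # Hypothetical helper: combine np.isfinite with a not-null mask so that
    # NaN, null and +/-inf all end up as False.
    return np.isfinite(s) & ~s.isna()


data = [float("nan"), float("inf"), 2.0, None]
result = is_finite_null_as_false(pd.Series(data))
print(result.tolist())
```

On this backend the `~s.isna()` mask does not change the result, since `np.isfinite` already returns False for NaN. The mask is there for the nullable backends, where `np.isfinite` alone would yield `<NA>`.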