-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Accept lambda
for other
param
#8256
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a nice addition
@@ -1076,7 +1076,8 @@ def where(self, cond: Any, other: Any = dtypes.NA, drop: bool = False) -> Self: | |||
If a callable, it must expect this object as its only parameter. | |||
other : scalar, DataArray or Dataset, optional | |||
Value to use for locations in this object where ``cond`` is False. | |||
By default, these locations filled with NA. | |||
By default, these locations are filled with NA. If a callable, it must |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could start improving the static typing by adding the callable.
Maybe something like Any | Callable[Self, Self]
for cond and other.
Only question I have right now: is it allowed to return a Dataset as well? Or e.g. a np.ndarray? And then the return type of where is wrong all together ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should restrict the return type of the callable to what we allow passing directly. In other words, scalar
, DataArray
, and Dataset
(if the type spec in the docstring is up-to-date).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the code it seems a bit inconsistent what this function accepts for cond
(other
seems to be fine actually).
For drop=False
(default) we accept anything Dataset compatible but for drop=True
only DataArrays and Datasets (I think).
So we should definitely look into getting this inline, but probably that's out of scope for this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But yes, the type should be something along T_Compatible | Callable[Self, T_Compatible]
But then we probably need to check what to do with non DataArrays or Datasets.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a go at this, it gets very complicated!
Because where
is a single method on DataWithCoords
there's some mismatch with methods such as .any
, which return unions of types.
T_Compatible
(which doesn't exist yet) probably needs to be generic over da
/ ds
. But I'm not sure if that's even possible.
Overall this arguably highlights the fissures of having two approaches. So I'm not sure whether we should:
- have more on
DataWithCoords
so they're in the same hierarchy (and then someif isinstance
checks, like we have in.where
) - commit to the approach of having methods implemented separately
- keep the status quo — mostly implemented separately with occasional unification (arguably somewhat arbitrary?)
No description provided.