Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Accept lambda for other param #8256

Merged
merged 3 commits into from
Sep 30, 2023
Merged

Accept lambda for other param #8256

merged 3 commits into from
Sep 30, 2023

Conversation

max-sixty
Copy link
Collaborator

No description provided.

Copy link
Collaborator

@headtr1ck headtr1ck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a nice addition

@@ -1076,7 +1076,8 @@ def where(self, cond: Any, other: Any = dtypes.NA, drop: bool = False) -> Self:
If a callable, it must expect this object as its only parameter.
other : scalar, DataArray or Dataset, optional
Value to use for locations in this object where ``cond`` is False.
By default, these locations filled with NA.
By default, these locations are filled with NA. If a callable, it must
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could start improving the static typing by adding the callable.
Maybe something like Any | Callable[Self, Self] for cond and other.

Only question I have right now: is it allowed to return a Dataset as well? Or e.g. a np.ndarray? And then the return type of where is wrong all together ...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should restrict the return type of the callable to what we allow passing directly. In other words, scalar, DataArray, and Dataset (if the type spec in the docstring is up-to-date).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the code it seems a bit inconsistent what this function accepts for cond (other seems to be fine actually).
For drop=False (default) we accept anything Dataset compatible but for drop=True only DataArrays and Datasets (I think).

So we should definitely look into getting this inline, but probably that's out of scope for this PR.

Copy link
Collaborator

@headtr1ck headtr1ck Sep 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But yes, the type should be something along T_Compatible | Callable[Self, T_Compatible]

But then we probably need to check what to do with non DataArrays or Datasets.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a go at this, it gets very complicated!

Because where is a single method on DataWithCoords there's some mismatch with methods such as .any, which return unions of types.

T_Compatible (which doesn't exist yet) probably needs to be generic over da / ds. But I'm not sure if that's even possible.

Overall this arguably highlights the fissures of having two approaches. So I'm not sure whether we should:

  1. have more on DataWithCoords so they're in the same hierarchy (and then some if isinstance checks, like we have in .where)
  2. commit to the approach of having methods implemented separately
  3. keep the status quo — mostly implemented separately with occasional unification (arguably somewhat arbitrary?)

@max-sixty max-sixty enabled auto-merge (squash) September 30, 2023 18:27
@max-sixty max-sixty merged commit f8ab40c into pydata:main Sep 30, 2023
@max-sixty max-sixty deleted the where-other branch October 14, 2023 07:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants