-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Skip mean over empty axis #5207
Conversation
Avoids changing the datatype if the data does not have the requested axis.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR @dschwoerer !
I think the result is right.
It's implemented as a special-case, which isn't ideal. But I'm not sure how to abstract the rule that applies to mean
but not to std
— something like "does f([x]) == x
". I guess it should apply to .sum
too.
For other reviewers — here's an example of the std
case (which this PR doesn't change, correctly):
In [5]: ds
Out[5]:
<xarray.Dataset>
Dimensions: (pos: 3, time: 2)
Coordinates:
* pos (pos) int64 1 2 3
Dimensions without coordinates: time
Data variables:
data (pos, time) float64 1.0 2.0 3.0 4.0 5.0 6.0
var (pos) int64 2 3 4
In [6]: ds.std('time')
Out[6]:
<xarray.Dataset>
Dimensions: (pos: 3)
Coordinates:
* pos (pos) int64 1 2 3
Data variables:
data (pos) float64 0.5 0.5 0.5
var (pos) float64 0.0 0.0 0.0 # this should be 0.0, unlike `.mean`'s result
doc/whats-new.rst
Outdated
@@ -85,6 +85,10 @@ Breaking changes | |||
as positional, all others need to be passed are keyword arguments. This is part of the | |||
refactor to support external backends (:issue:`4309`, :pull:`4989`). | |||
By `Alessandro Amici <https://github.com/alexamici>`_. | |||
- :py:func:`mean` does not change the data if axis is None. This |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:py:func:mean
is actually a method (there should be other references in the whats-new which can be copied)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't found an example, so I have removed the explicit reference
xarray/tests/test_duck_array_ops.py
Outdated
ds["pos"] = [1, 2, 3] | ||
ds["data"] = ("pos", "time"), [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]] | ||
ds["var"] = "pos", [2, 3, 4] | ||
ds2 = ds.mean(dim="time") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Convention:
ds2 = ds.mean(dim="time") | |
result = ds.mean(dim="time") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not store the result but use it directly. Is that ok as well?
ds = Dataset() | ||
ds["pos"] = [1, 2, 3] | ||
ds["data"] = ("pos", "time"), [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]] | ||
ds["var"] = "pos", [2, 3, 4] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI you can construct the dataset in a single expression, but it's also fine as-is (Dataset(dict(pos=..., data=...))
)
* Better testing * Clarify comment * Handle other functions as well, like sum, min, max
I have now changed so that several wrapped functions preserve the data. It is more generic, and hopefully still readable. The flag might also be called |
Excellent! Thanks @dschwoerer . Any thoughts from others before we merge, particularly on the |
Merging — as ever, any additional feedback we can address in another PR. Thank you very much @dschwoerer , we're happy to have you as a contributor! |
Avoids changing the datatype if the data does not have the requested
axis.
pre-commit run --all-files
whats-new.rst