Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Plots get labels from pint arrays #5561

Merged
merged 38 commits into from
Jul 21, 2021
Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
Show all changes
38 commits
Select commit Hold shift + click to select a range
ec3040e
test labels come from pint units
TomNicholas Jul 2, 2021
6bb4298
values demotes pint arrays before returning
TomNicholas Jul 2, 2021
04d78c5
plot labels look for pint units first
TomNicholas Jul 2, 2021
6fee339
pre-commit
TomNicholas Jul 2, 2021
17c5755
added to_numpy() and as_numpy() methods
TomNicholas Jul 2, 2021
48ba107
remove special-casing of cupy arrays in .values in favour of using .t…
TomNicholas Jul 2, 2021
531385b
merged to_numpy() method in
TomNicholas Jul 2, 2021
9cb2f9b
.values -> .to_numpy()
TomNicholas Jul 2, 2021
ae6e931
lint
max-sixty Jul 2, 2021
dc24d3f
Fix mypy (I think?)
max-sixty Jul 2, 2021
6ce6b05
Merge branch 'main' of https://github.com/pydata/xarray into to_numpy
TomNicholas Jul 3, 2021
04d7b02
Merge branch 'to_numpy' of https://github.com/TomNicholas/xarray into…
TomNicholas Jul 3, 2021
ee34649
added Dataset.as_numpy()
TomNicholas Jul 3, 2021
552b322
improved docstrings
TomNicholas Jul 3, 2021
1215e69
add what's new
TomNicholas Jul 3, 2021
af8a1ee
add to API docs
TomNicholas Jul 3, 2021
e095bf0
linting
TomNicholas Jul 3, 2021
eb7d84d
fix failures by only importing pint when needed
TomNicholas Jul 7, 2021
4d43f17
merge fix for pint import errors
TomNicholas Jul 7, 2021
74c05e3
refactor pycompat into class
TomNicholas Jul 7, 2021
7e5e928
Merge pyompat refactor from branch 'to_numpy' into unit-free-values
TomNicholas Jul 7, 2021
3f85e21
pycompat import changes applied to plotting code
TomNicholas Jul 7, 2021
e397168
what's new
TomNicholas Jul 7, 2021
45245d0
compute instead of load
TomNicholas Jul 8, 2021
27fc4e5
added tests
TomNicholas Jul 8, 2021
3e8cb24
fixed sparse test
TomNicholas Jul 8, 2021
f9d6370
tests and fixes for ds.as_numpy()
TomNicholas Jul 9, 2021
50fdf4c
fix sparse tests
TomNicholas Jul 9, 2021
1c94a97
fix linting
TomNicholas Jul 9, 2021
2d07c0f
tests for Variable
TomNicholas Jul 9, 2021
9673cea
test IndexVariable too
TomNicholas Jul 9, 2021
0d624cc
use numpy.asarray to avoid a copy
TomNicholas Jul 12, 2021
2f1ff46
also convert coords
TomNicholas Jul 14, 2021
afd35e2
Merge branch 'main' into to_numpy
TomNicholas Jul 15, 2021
6d33b35
Force tests again after #5600
TomNicholas Jul 16, 2021
eae95f5
Merge branch 'main' into to_numpy
TomNicholas Jul 16, 2021
36f3bd9
Merge branch 'to_numpy' into unit-free-values
TomNicholas Jul 16, 2021
4c53790
merged main
TomNicholas Jul 21, 2021
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 10 additions & 0 deletions xarray/core/pycompat.py
Original file line number Diff line number Diff line change
Expand Up @@ -45,3 +45,13 @@ def is_duck_dask_array(x):
except ImportError: # pragma: no cover
cupy_version = LooseVersion("0.0.0")
cupy_array_type = ()

try:
# solely for isinstance checks
import pint

pint_version = LooseVersion(pint.__version__)
pint_array_type = (pint.Quantity,)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is the reason for the failures (see #4751 (comment)): we import pint here, but pint also tries to import xarray for type checking. Normally, the circular import would be an error but pytest silences that, apparently, resulting in a partially imported module.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think this explains the failures in #5568 too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this properly I see what you mean now. So is this PR just stuck then? Why can we not just check the name rather than the explicit type within xarray (for now at least)?

Copy link
Collaborator

@keewis keewis Jul 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could import pint in the functions that use pint_array_type for now (or add a _get_pint_array_type function to pycompat and call it wherever pint_array_type would be used), and I'll try to get pint to not import xarray (because ideally it should not care about what wraps it). Not sure if that's possible, though.

The best long term fix would be to figure out dask/dask#6635

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@keewis It would be good to hear your thoughts on that over on the Pint issue tracker. Right now, to maintain the type casting hierarchy, libraries with duck arrays have to either define a list of types they can wrap (and prohibit all others) or a list which are meant to wrap it (and allow wrapping any other). Pint went with the latter since it was the safest approach as a type high on the graph (https://pint.readthedocs.io/en/stable/numpy.html#Technical-Commentary), but I suppose that if that was problematic, it could be flipped around to a list of allowed types to wrap (with a breaking change), as Dask ended up doing (dask/dask#6393).

But, definitely, this ad hoc approach to the type casting hierarchy will continue to be problematic until dask/dask#6635 is resolved definitely.

Copy link
Member Author

@TomNicholas TomNicholas Jul 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be really nice to fix this properly (and I will bring it up in the xarray dev call now), but for this PR then your suggestion seems to work nicely on #5568 @keewis , thanks.

except ImportError: # pragma: no cover
pint_version = LooseVersion("0.0.0")
pint_array_type = ()
24 changes: 17 additions & 7 deletions xarray/core/variable.py
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,7 @@
dask_array_type,
integer_types,
is_duck_dask_array,
pint_array_type,
)
from .utils import (
NdimSizeLenMixin,
Expand Down Expand Up @@ -252,19 +253,28 @@ def _as_array_or_item(data):
Importantly, this function does not copy data if it is already an ndarray -
otherwise, it will not be possible to update Variable values in place.

This function mostly exists because 0-dimensional ndarrays with
dtype=datetime64 are broken :(
https://github.com/numpy/numpy/issues/4337
https://github.com/numpy/numpy/issues/7619

TODO: remove this (replace with np.asarray) once these issues are fixed
"""
data = data.get() if isinstance(data, cupy_array_type) else np.asarray(data)

# TODO replace this with an entrypoint to allow duck-array libraries to
# specify how they want to return their values?
if isinstance(data, cupy_array_type):
data = data.get()
elif isinstance(data, pint_array_type):
data = data.magnitude
else:
data = np.asarray(data)

# Exists because 0-dimensional ndarrays with
# dtype=datetime64 are broken :(
# https://github.com/numpy/numpy/issues/4337
# https://github.com/numpy/numpy/issues/7619
# TODO: remove this (replace with np.asarray) once these issues are fixed
if data.ndim == 0:
if data.dtype.kind == "M":
data = np.datetime64(data, "ns")
elif data.dtype.kind == "m":
data = np.timedelta64(data, "ns")

return data


Expand Down
18 changes: 13 additions & 5 deletions xarray/plot/utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@
import pandas as pd

from ..core.options import OPTIONS
from ..core.pycompat import pint_array_type
from ..core.utils import is_scalar

try:
Expand Down Expand Up @@ -474,12 +475,19 @@ def label_from_attrs(da, extra=""):
else:
name = ""

if da.attrs.get("units"):
units = " [{}]".format(da.attrs["units"])
elif da.attrs.get("unit"):
units = " [{}]".format(da.attrs["unit"])
def _get_units_from_attrs(da):
if da.attrs.get("units"):
units = " [{}]".format(da.attrs["units"])
elif da.attrs.get("unit"):
units = " [{}]".format(da.attrs["unit"])
else:
units = ""
return units

if isinstance(da.data, pint_array_type):
units = " [{}]".format(str(da.data.units))
else:
units = ""
units = _get_units_from_attrs(da)

return "\n".join(textwrap.wrap(name + extra + units, 30))

Expand Down
40 changes: 39 additions & 1 deletion xarray/tests/test_units.py
Original file line number Diff line number Diff line change
Expand Up @@ -5,10 +5,22 @@
import pandas as pd
import pytest

try:
import matplotlib.pyplot as plt
except ImportError:
pass

import xarray as xr
from xarray.core import dtypes, duck_array_ops

from . import assert_allclose, assert_duckarray_allclose, assert_equal, assert_identical
from . import (
assert_allclose,
assert_duckarray_allclose,
assert_equal,
assert_identical,
requires_matplotlib,
)
from .test_plot import PlotTestCase
from .test_variable import _PAD_XR_NP_ARGS

pint = pytest.importorskip("pint")
Expand Down Expand Up @@ -5564,3 +5576,29 @@ def test_merge(self, variant, unit, error, dtype):

assert_units_equal(expected, actual)
assert_equal(expected, actual)


@requires_matplotlib
class TestPlots(PlotTestCase):
def test_units_in_line_plot_labels(self):
arr = np.linspace(1, 10, 3) * unit_registry.Pa
# TODO make coord a Quantity once unit-aware indexes supported
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't it be possible to specify a non-dimensional coordinate with x="x_coord"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point was that I want this test to eventually test labels on both the x and y axes, but at the moment pint is only involved with the y axis, right?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's right. I meant to implement this by converting the dimension coordinate to a non-dimension coordinate:

x_coord = xr.DataArray(
    np.linspace(1, 3, 3) * unit_registry.m, dims="x"
)
da = xr.DataArray(data=arr, dims="x", coords={"x_coord": x_coord}, name="pressure")
da.plot.line(x="x_coord")

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I try that I still get a UnitStrippedWarning in .to_index_variable()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems like a bug. open an issue for it so we don't forget?

x_coord = xr.DataArray(
np.linspace(1, 3, 3), dims="x", attrs={"units": "meters"}
)
da = xr.DataArray(data=arr, dims="x", coords={"x": x_coord}, name="pressure")

da.plot.line()

ax = plt.gca()
assert ax.get_ylabel() == "pressure [pascal]"
assert ax.get_xlabel() == "x [meters]"

def test_units_in_2d_plot_labels(self):
arr = np.ones((2, 3)) * unit_registry.Pa
da = xr.DataArray(data=arr, dims=["x", "y"], name="pressure")

fig, (ax, cax) = plt.subplots(1, 2)
ax = da.plot.contourf(ax=ax, cbar_ax=cax, add_colorbar=True)

assert cax.get_ylabel() == "pressure [pascal]"