Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Interpolation with multiple mutlidimensional arrays sharing dims fails #4463

Closed
aulemahal opened this issue Sep 25, 2020 · 5 comments · Fixed by #9881
Closed

Interpolation with multiple mutlidimensional arrays sharing dims fails #4463

aulemahal opened this issue Sep 25, 2020 · 5 comments · Fixed by #9881

Comments

@aulemahal
Copy link
Contributor

What happened:
When trying to interpolate a N-D array with 2 other arrays sharing a common (new) dimension and with one (at least) being multidimensional fails. Kinda a complex edge case I agree. Here's a MWE:

da = xr.DataArray([[[1, 2, 3], [2, 3, 4]], [[1, 2, 3], [2, 3, 4]]], dims=('t', 'x', 'y'), coords={'x': [1, 2], 'y': [1, 2, 3], 't': [10, 12]})
dy = xr.DataArray([1.5, 2.5], dims=('u',), coords={'u': [45, 55]})
dx = xr.DataArray([[1.5, 1.5], [1.5, 1.5]], dims=('t', 'u'), coords={'u': [45, 55], 't': [10, 12]})

So we have da a 3D array with dims (t, x, y). We have dy, containing the values of y along new dimension u. And dx containing the values of x along both u and t. We want to interpolate with:

out = da.interp(y=dy, x=dx, method='linear')

As so to have a new array over dims t and u.

What you expected to happen:
I expected (with the dummy data I gave):

xr.DataArray([[2, 3], [2, 3]], dims=('t', 'u'), coords={'u': [45, 55], 't': [10, 12]})

But instead it fails with ValueError: axes don't match array.

Full traceback:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-19-b968c6f3dae9> in <module>
----> 1 a.interp(y=y, x=x, method='linear')

~/Python/xarray/xarray/core/dataarray.py in interp(self, coords, method, assume_sorted, kwargs, **coords_kwargs)
   1473                 "Given {}.".format(self.dtype)
   1474             )
-> 1475         ds = self._to_temp_dataset().interp(
   1476             coords,
   1477             method=method,

~/Python/xarray/xarray/core/dataset.py in interp(self, coords, method, assume_sorted, kwargs, **coords_kwargs)
   2691                     if k in var.dims
   2692                 }
-> 2693                 variables[name] = missing.interp(var, var_indexers, method, **kwargs)
   2694             elif all(d not in indexers for d in var.dims):
   2695                 # keep unrelated object array

~/Python/xarray/xarray/core/missing.py in interp(var, indexes_coords, method, **kwargs)
    652             else:
    653                 out_dims.add(d)
--> 654         result = result.transpose(*tuple(out_dims))
    655     return result
    656 

~/Python/xarray/xarray/core/variable.py in transpose(self, *dims)
   1395             return self.copy(deep=False)
   1396 
-> 1397         data = as_indexable(self._data).transpose(axes)
   1398         return type(self)(dims, data, self._attrs, self._encoding, fastpath=True)
   1399 

~/Python/xarray/xarray/core/indexing.py in transpose(self, order)
   1288 
   1289     def transpose(self, order):
-> 1290         return self.array.transpose(order)
   1291 
   1292     def __getitem__(self, key):

ValueError: axes don't match array

Anything else we need to know?:
It works if dx doesn't vary along t. I .e.: da.interp(y=dy, x=dx.isel(t=0, drop=True), method='linear') works.

Environment:

Output of xr.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.8.5 | packaged by conda-forge | (default, Jul 31 2020, 02:39:48)
[GCC 7.5.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-514.2.2.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_CA.UTF-8
LOCALE: en_CA.UTF-8
libhdf5: 1.10.5
libnetcdf: 4.7.4

xarray: 0.16.2.dev9+gc0399d3
pandas: 1.0.3
numpy: 1.18.4
scipy: 1.4.1
netCDF4: 1.5.3
pydap: None
h5netcdf: None
h5py: 2.10.0
Nio: None
zarr: None
cftime: 1.1.3
nc_time_axis: 1.2.0
PseudoNetCDF: None
rasterio: 1.1.4
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2.17.2
distributed: 2.23.0
matplotlib: 3.3.1
cartopy: 0.18.0
seaborn: None
numbagg: None
pint: 0.11
setuptools: 49.6.0.post20200814
pip: 20.2.2
conda: None
pytest: None
IPython: 7.17.0
sphinx: None

@fujiisoup
Copy link
Member

Hi @aulemahal

I think you want to interpolate along tas well as x, and y.
If so, you can do

In [8]: da.interp(t=dx['t'], y=dy, x=dx, method='linear')                       
Out[8]: 
<xarray.DataArray (t: 2, u: 2)>
array([[2., 3.],
       [2., 3.]])
Coordinates:
  * t        (t) int64 10 12
    y        (u) float64 1.5 2.5
    x        (t, u) float64 1.5 1.5 1.5 1.5
  * u        (u) int64 45 55

If not, this fails as dx['t'] and da['t'] do not match each other.
The error message can be improved.
A contribution is welcome ;)

@shoyer
Copy link
Member

shoyer commented Sep 26, 2020

I think this gives the correct result?

>>> da.interp(t=da['t'], y=dy, x=dx, method='linear')
<xarray.DataArray (t: 2, u: 2)>
array([[2., 3.],
       [2., 3.]])
Coordinates:
  * t        (t) int64 10 12
    y        (u) float64 1.5 2.5
    x        (t, u) float64 1.5 1.5 1.5 1.5
  * u        (u) int64 45 55

My general thought is that if an axis (like t in this case) is omitted, then should be equivalent to indexing with the existing coordinate. That is how normal indexing in xarray worked, so interpolation should work the same.

@aulemahal
Copy link
Contributor Author

Oh! I should have thought of this. The error message is indeed unclear...
I'll close the issue as the problem is solved, but I'll note that:

da.sel(y=dy, x=dx, method='nearest')

does work, without the need to explicitly pass the shared dimension t. This is why I expected that interp would work the same.

@dcherian
Copy link
Contributor

The error message is indeed unclear.

reopening so that we can fix this.

@shoyer
Copy link
Member

shoyer commented Sep 28, 2020

I'll note that:

da.sel(y=dy, x=dx, method='nearest')

does work, without the need to explicitly pass the shared dimension t. This is why I expected that interp would work the same.

I think interp should work in this case, too!

Let's keep this issue open to track that.

dcherian added a commit to dcherian/xarray that referenced this issue Dec 14, 2024
dcherian added a commit that referenced this issue Dec 19, 2024
* Don't eagerly compute dask arrays in localize

* Clean up test

* Clean up Variable handling

* Silence test warning

* Use apply_ufunc instead

* Add test for #4463

Closes #4463

* complete tests

* Add comments

* Clear up broadcasting

* typing

* try a different warning filter

* one more fix

* types + more duck_array_ops

* fixes

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Apply suggestions from code review

Co-authored-by: Michael Niklas  <[email protected]>

* Apply suggestions from code review

Co-authored-by: Illviljan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Apply suggestions from code review

Co-authored-by: Illviljan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix

* Revert "Apply suggestions from code review"

This reverts commit 1b9845d.

---------

Co-authored-by: Illviljan <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Michael Niklas <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants