Description
The current indexing causes problems when running dask operations on chunked arrays.
Chunking itself does not raise any error:
import xarray as xr
file = "tests/data/r3d_bump.slf"
ds = xr.open_dataset(file)
print(ds)
ds = ds.chunk({"time": -1, "node": 40})
def analyze_block(ds_block: xr.Dataset) -> xr.Dataset:
    """Operate on a single Dask chunk."""
    result = ds_block.mean(dim="node")
    return result
result = xr.map_blocks(analyze_block, ds)
result.compute()
Loading the underlying data, either through the `result.compute()` call above or through:
print(result.Z.values)
raises:
Traceback (most recent call last):
File "./test_dask.py", line 19, in <module>
result.compute()
File "./.venv/lib/python3.12/site-packages/xarray/core/dataset.py", line 791, in compute
return new.load(**kwargs)
^^^^^^^^^^^^^^^^^^
File "./.venv/lib/python3.12/site-packages/xarray/core/dataset.py", line 557, in load
evaluated_data: tuple[np.ndarray[Any, Any], ...] = chunkmanager.compute(
^^^^^^^^^^^^^^^^^^^^^
File "./.venv/lib/python3.12/site-packages/xarray/namedarray/daskmanager.py", line 85, in compute
return compute(*data, **kwargs) # type: ignore[no-untyped-call, no-any-return]
^^^^^^^^^^^^^^^^^^^^^^^^
File "./.venv/lib/python3.12/site-packages/dask/base.py", line 681, in compute
results = schedule(expr, keys, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "./.venv/lib/python3.12/site-packages/xarray/core/indexing.py", line 659, in __array__
return np.asarray(self.get_duck_array(), dtype=dtype, copy=copy)
^^^^^^^^^^^^^^^^^^^^^
File "./.venv/lib/python3.12/site-packages/xarray/core/indexing.py", line 664, in get_duck_array
return self.array.get_duck_array()
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "./.venv/lib/python3.12/site-packages/xarray/core/indexing.py", line 943, in get_duck_array
duck_array = self.array.get_duck_array()
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "./.venv/lib/python3.12/site-packages/xarray/core/indexing.py", line 897, in get_duck_array
return self.array.get_duck_array()
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "./.venv/lib/python3.12/site-packages/xarray/core/indexing.py", line 737, in get_duck_array
array = self.array[self.key]
~~~~~~~~~~^^^^^^^^^^
File "./xarray_selafin/xarray_backend.py", line 187, in __getitem__
return indexing.explicit_indexing_adapter(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "./.venv/lib/python3.12/site-packages/xarray/core/indexing.py", line 1129, in explicit_indexing_adapter
result = raw_indexing_method(raw_key.tuple)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "./xarray_selafin/xarray_backend.py", line 246, in _raw_indexing_method
temp = np.reshape(temp, (self.shape[1], self.shape[2])) # (nplan, nnode)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "./.venv/lib/python3.12/site-packages/numpy/_core/fromnumeric.py", line 324, in reshape
return _wrapfunc(a, 'reshape', shape, order=order)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "./.venv/lib/python3.12/site-packages/numpy/_core/fromnumeric.py", line 57, in _wrapfunc
return bound(*args, **kwds)
^^^^^^^^^^^^^^^^^^^^
ValueError: cannot reshape array of size 3 into shape (5,1452)
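This is because Dask may pass keys such as `(0,)`, `(slice(0, 1), slice(0, 50))`, or even `(slice(None), slice(0, 50))`.
In those cases the logic in `_raw_indexing_method` does not match the requested selection: as the traceback shows, it reshapes to the full `(self.shape[1], self.shape[2])`, i.e. `(nplan, nnode)`, even though only part of the array was read.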
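One possible direction, sketched here rather than taken from the xarray_selafin code, is to normalize every incoming key to one explicit range per axis before reading, and to reshape to the lengths of those ranges instead of the full array shape. The helper name `normalize_key` and the example shape `(10, 5, 1452)` are hypothetical (only `(5, 1452)` appears in the traceback), and the sketch assumes the key contains only integers and slices:

```python
import operator


def normalize_key(key, shape):
    """Expand a basic-indexing key to one range per axis (hypothetical helper).

    Returns (ranges, out_shape). `ranges` holds one `range` per axis of
    `shape`; `out_shape` is the shape of the result, where axes indexed by
    an integer are dropped, as NumPy does.
    """
    if not isinstance(key, tuple):
        key = (key,)
    if len(key) > len(shape):
        raise IndexError("too many indices for the given shape")
    # Pad missing trailing axes with full slices, so (0,) means (0, :, :).
    key = key + (slice(None),) * (len(shape) - len(key))

    ranges = []
    out_shape = []
    for k, n in zip(key, shape):
        if isinstance(k, slice):
            # slice.indices clamps start/stop/step to the axis length.
            r = range(*k.indices(n))
            out_shape.append(len(r))
        else:
            i = operator.index(k)
            if i < 0:
                i += n
            if not 0 <= i < n:
                raise IndexError("index out of bounds")
            # An integer selects one element and drops the axis.
            r = range(i, i + 1)
        ranges.append(r)
    return ranges, tuple(out_shape)


# Assumed (time, plan, node) shape, for illustration only.
ranges, out_shape = normalize_key((0,), (10, 5, 1452))
print([len(r) for r in ranges], out_shape)
```

With a helper of this kind, the read loop could iterate over `ranges` and the final reshape could use the per-axis lengths, so partial keys such as `(slice(None), slice(0, 50))` would no longer be forced into the full frame shape.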