Slicing in dask array effectively generates a task per contiguous subslice per chunk.
In the worst case, random indexing, this generates a slice/task for every row along the indexed dimension. Dask currently raises a `PerformanceWarning` when it detects this situation; see https://github.com/dask/dask/blob/b4b33caed8fc9cf77c9332442ab11cf00f90bb42/dask/array/slicing.py#L630-L641
Worst case example
```python
import dask.array as da
import numpy as np

x = da.random.random((10, 20), chunks=(10, 10))
idx = np.random.randint(0, x.shape[1], x.shape[1])
y = x[:, idx]
```
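To make the "task per contiguous subslice per chunk" statement concrete, here is a minimal pure-Python sketch. It is a toy model, not dask's actual slicing code: the helper `count_contiguous_runs` and the fixed `shuffled` index are hypothetical and exist only for illustration. It counts maximal runs of consecutive indices that stay within one chunk, which is roughly the number of slices the description above implies.

```python
def count_contiguous_runs(index, chunk_size):
    """Count maximal runs of consecutive, increasing indices within one chunk.

    Toy model only (an assumption for illustration, not dask's planner):
    each run stands in for one slice/task along the indexed dimension.
    """
    runs = 0
    prev = None
    for i in index:
        same_chunk = prev is not None and i // chunk_size == prev // chunk_size
        # A run continues only if we step to the next element of the same chunk.
        if not (same_chunk and i == prev + 1):
            runs += 1
        prev = i
    return runs


# An in-order index over 20 columns split into two chunks of 10.
ordered = list(range(20))
# A fixed permutation standing in for a random index (no two neighbours are consecutive).
shuffled = [7, 0, 13, 4, 19, 2, 11, 8, 16, 5, 1, 14, 9, 18, 3, 12, 6, 17, 10, 15]

ordered_runs = count_contiguous_runs(ordered, chunk_size=10)
shuffled_runs = count_contiguous_runs(shuffled, chunk_size=10)
print(ordered_runs, shuffled_runs)
```

Under this model the ordered index needs one run per chunk, while the permuted index degenerates to one run per element, which matches the worst case described above.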
This random access pattern is another shuffle pattern, and we should be able to offer an efficient solution for it using our P2P infrastructure.
see also pydata/xarray#9220
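One way to see why this is a shuffle pattern: every output position pulls one element from some input chunk, so the work can be grouped by (input chunk, output chunk) pair instead of by contiguous slice. The sketch below is only an illustration of that grouping under assumed names (`plan_transfers`, `in_chunk_size`, `out_chunk_size` are hypothetical); it does not use or describe dask's P2P API.

```python
from collections import defaultdict


def plan_transfers(index, in_chunk_size, out_chunk_size):
    """Group an index by (input chunk, output chunk).

    Returns a dict mapping (src_chunk, dst_chunk) to a list of
    (offset_in_src_chunk, offset_in_dst_chunk) pairs. Hypothetical helper
    for illustration, not part of dask.
    """
    transfers = defaultdict(list)
    for out_pos, in_pos in enumerate(index):
        src = in_pos // in_chunk_size
        dst = out_pos // out_chunk_size
        transfers[(src, dst)].append(
            (in_pos % in_chunk_size, out_pos % out_chunk_size)
        )
    return dict(transfers)


# A fixed permutation of 20 columns, standing in for a random index.
index = [7, 0, 13, 4, 19, 2, 11, 8, 16, 5, 1, 14, 9, 18, 3, 12, 6, 17, 10, 15]
transfers = plan_transfers(index, in_chunk_size=10, out_chunk_size=10)

for (src, dst), pairs in sorted(transfers.items()):
    print(f"input chunk {src} -> output chunk {dst}: {len(pairs)} elements")
```

With two input and two output chunks, the number of groups is bounded by the number of chunk pairs rather than by the length of the index, which is the property a shuffle-based approach would rely on.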
It might be necessary, or at least helpful, to deal with dask/dask#11234 first.
As a first step, I would like to understand how much of the P2P rechunk logic can be reused.