In #8733 (comment) and #9286 (comment) it's become fairly clear that the `ChunkManager` abstraction isn't quite right: it's "too greedy" as @dcherian said. #9286 will fix this by removing `.chunks` and `.rechunk` from the `ChunkManager`'s responsibilities, but the result will be a "ChunkManager" that doesn't explicitly handle chunks!
I think a better name to describe the new interface is a "ComputeManager", as it still handles the creation of lazily-computed parallel arrays, distribution of computation over parts of those arrays, and triggering the materialization of the arrays.
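For concreteness, a minimal sketch of what such an interface could look like, assuming it keeps the creation, computation-distribution, and materialization methods of today's `ChunkManagerEntrypoint` (`from_array`, `apply_gufunc`, `compute`) and drops the chunk-specific ones per #9286; the class name and signatures below are illustrative only, not a proposed final API:

```python
from abc import ABC, abstractmethod
from typing import Any, Callable


class ComputeManagerEntrypoint(ABC):
    """Hypothetical rename of ChunkManagerEntrypoint (sketch only)."""

    array_cls: type  # the duck-array type this manager recognises, e.g. cubed.Array

    @abstractmethod
    def from_array(self, data: Any, **kwargs: Any) -> Any:
        """Wrap an in-memory array as a lazily-computed parallel array."""
        ...

    @abstractmethod
    def apply_gufunc(
        self, func: Callable, signature: str, *args: Any, **kwargs: Any
    ) -> Any:
        """Distribute a generalized ufunc over pieces of the parallel array(s)."""
        ...

    @abstractmethod
    def compute(self, *data: Any, **kwargs: Any) -> tuple:
        """Trigger materialization, returning concrete (e.g. numpy) arrays."""
        ...
```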
JAX is also an interesting potential use case because there you don't have chunks, but you do still have to manage dividing computation up over multiple devices. See #9286 (comment)
Renaming `ChunkManagerEntrypoint` to `ComputeManagerEntrypoint` will be a breaking change, but:
a) this is a very advanced feature,
b) the docs for it have a fat "experimental" warning on them,
c) I'm only aware of 2 libraries using this outside of xarray itself: cubed (tagging @tomwhite) and @hmaarrfk's chunked data structure. The dask `ChunkManager` ships with xarray, so there is no breaking change there. (Users may have to `pip install` again to re-register entrypoints if upgrading a development version of xarray inside existing environments, though.)
I'm separating this out from #9286 because that PR shouldn't be a breaking change, and the follow-up that closes this issue will be the minimal possible breaking change (i.e. just renaming `ChunkManagerEntrypoint` -> `ComputeManagerEntrypoint`).
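For reference, here is a hedged sketch of what the downstream change might look like for a third-party library; the import path is where the base class currently lives in recent xarray (treat it as an assumption), the class name is made up for illustration, and nothing here is cubed's actual code:

```python
# Today: subclass the existing base class and register it under the
# "xarray.chunkmanagers" entry point group in the package's own metadata.
from xarray.namedarray.parallelcompat import ChunkManagerEntrypoint


class MyManager(ChunkManagerEntrypoint):
    """After the rename, only the base class here would change to
    ComputeManagerEntrypoint (plus a reinstall so the entry point is
    re-registered, as noted above)."""
```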
On a semi-related note, it would be nice to consider separating the concept of triggering computation from returning a computed result. Dask calls these `.persist()` and `.compute()`. In Tensorstore, the former is obtained via `.read()` and the latter via `.result()`.
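For illustration, the dask distinction in isolation (nothing here depends on xarray or the proposed rename):

```python
import dask.array as da

x = da.random.random((1000, 1000), chunks=(100, 100))

y = (x + 1).persist()  # triggers computation but still returns a lazy dask array
z = (x + 1).compute()  # blocks and returns the concrete numpy result
```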
I would personally appreciate a resolution to #9403 so that we can start to test our codebase with the numpy 2 series. I'm just trying to isolate the breaking changes on our side, and I feel like #9403 is the last thing holding us up! (Apart from tensorflow, that is.)