Generic lazy data handling. #2356
"""
Routines for lazy data handling.

To avoid replicating implementation-dependent test and conversion code.
I would rather this was not part of the module docstring. I understand the sentiment, but it doesn't seem like useful long-term documentation.
I would prefer:
Routines for lazy data handling.
Supporting the Cube's lazy_data interface.
or something similarly positive.
I see what you mean.
However, this module is not particularly tied to the cube.data interface as such.
Could we have something like "This module provides lazy array handling, independent of a specific implementation such as dask or biggus." @marqh ?
I'm not so keen on the specific namechecking of dask/biggus, but I think that makes it much clearer what we are actually talking about.
It seems to me that this PR would lead us down a route which is gonna cause problems later on. I think that if we want to be agnostic about the underlying implementation of the lazy data provider (i.e. Biggus or Dask), then I would suggest having a wrapper class that implements a documented interface, which we could initially derive from the current Biggus interface in addition to methods based on the functions in this PR (except please can we avoid attribute checking? Personally I find it ugly, but maybe that's just me). Most of the methods would be boilerplate that would just pass the operation down to the underlying provider (e.g. add), but for methods not available on the provider we could provide our own implementation.
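To make the wrapper-class suggestion concrete, here is a minimal sketch. Every name in it (`LazyArray`, `DeferredList`, `DeferredListArray`, `realise`, `total`) is hypothetical and is not taken from Iris, biggus or dask. `DeferredList` is a toy stand-in for a real provider, used only so the example runs on its own.

```python
from abc import ABC, abstractmethod


class LazyArray(ABC):
    """Hypothetical documented interface for a lazy data provider."""

    @abstractmethod
    def __add__(self, other):
        """Return a new lazy array; no computation happens here."""

    @abstractmethod
    def realise(self):
        """Compute and return the concrete values."""

    def total(self):
        # Our own implementation for an operation the provider may lack:
        # realise the data, then reduce it in plain Python.
        return sum(self.realise())


class DeferredList:
    """Toy provider (an assumption): wraps a zero-argument function."""

    def __init__(self, compute):
        self._compute = compute

    def __add__(self, other):
        # Build a new deferred computation rather than computing now.
        return DeferredList(
            lambda: [a + b for a, b in zip(self.compute(), other.compute())]
        )

    def compute(self):
        return self._compute()


class DeferredListArray(LazyArray):
    """Wrapper that passes operations down to the toy provider."""

    def __init__(self, provider):
        self._provider = provider

    def __add__(self, other):
        # Boilerplate delegation: the provider does the real work.
        return DeferredListArray(self._provider + other._provider)

    def realise(self):
        return self._provider.compute()
```

With this shape, calling code depends only on `LazyArray`, so supporting a different provider would mean writing another small subclass rather than adding attribute checks at each call site.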
I don't think that this PR makes a decision on this matter. I think there is a key design decision that we have to consider. I'm content to take these as is and have that discussion as part of a follow-on activity.
DO NOT MERGE
Still in development.
This is a starting point for replacing all of Iris' explicit use of biggus concepts with something more detached from the implementation.
(Which I think should be better than explicitly referencing dask everywhere instead.)
At present there is no recognition here that realisation with 'as_concrete_array' might need to distinguish whether the result should be a masked array (as the biggus API does).
For the cube.data property (getter), this is the existing Iris behaviour anyway.
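As a rough illustration of the masked/unmasked distinction, the sketch below adds an explicit `masked` keyword to a function named after the 'as_concrete_array' mentioned above. The signature, the `is_lazy_array` helper and the `FakeLazyArray` stand-in are all assumptions made for this example. The two realisation methods are modelled on the biggus-style `ndarray()` / `masked_array()` pair, and plain lists with `None` stand in for masked points so the example needs no third-party packages.

```python
class FakeLazyArray:
    """Stand-in for a biggus-style lazy array (not the real library)."""

    def __init__(self, values, mask):
        self._values = values
        self._mask = mask

    def ndarray(self):
        # Unmasked realisation: every stored value, mask ignored.
        return list(self._values)

    def masked_array(self):
        # Masked realisation: None marks a masked point in this toy model.
        return [None if m else v for v, m in zip(self._values, self._mask)]


def is_lazy_array(data):
    # Attribute check in the style this PR uses (hypothetical helper name).
    return hasattr(data, "ndarray") and hasattr(data, "masked_array")


def as_concrete_array(data, *, masked):
    """Realise lazy data; the caller must state whether a mask is wanted."""
    if not is_lazy_array(data):
        # Already concrete: hand it back unchanged.
        return data
    return data.masked_array() if masked else data.ndarray()
```

Making `masked` a required keyword leaves the choice of default open, which is the part the description above says has not yet been addressed.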