Skip to content

Conversation

@pp-mo
Copy link
Member

@pp-mo pp-mo commented Feb 10, 2017

DO NOT MERGE
Still in development..

This is a starting point for replacing all Iris' explicit use of biggus concepts with something more detached from the implementation.
( Which I think should be better than explicitly referencing dask everywhere instead. )

At present there is no recognition here that realisation with 'as_concrete_array' might need to make some distinction whether the result should be a masked array (as biggus API does).
For the cube.data property (getter), this is the existing Iris behaviour anyway.

"""
Routines for lazy data handling.

To avoid replicating implementation-dependent test and conversion code.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i would rather this was not part of the module doc string; i understand the sentiment, but it doesn't seem like useful long term documentation.

I would prefer

Routines for lazy data handling.

Supporting the Cube's lazy_data interface.

or something similarly positive

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you mean.
However, this module is not particularly tied to the cube.data interface as such.
Could we have something like "This module provides lazy array handling, independent of a specific implementation such as dask or biggus." @marqh ?
I'm not so keen on the specific namechecking of dask/biggus, but I think that makes it much clearer what we are actually talking about.

@djkirkham
Copy link
Contributor

It seems to me that this PR would lead us down a route which is gonna cause problems later on. I think that if we want to be agnostic about the underlying implementation of the lazy data provider (i.e. Biggus or Dask), then having the Cube._mydata attribute set to an instance of that provider weds us to the implementation. For example, currently if you do cube1 + cube2, that leads to calling cube1._mydata + cube2._mydata, so you have an implicit dependency on the type of _mydata having an __add__ method. Obviously it's highly like that the implementation will have that method, but there are surely other methods which won't be present in all implementations.

I would suggest having a wrapper class that implements a documented interface, which we could initially derive from the current Biggus interface in addition to methods based on the functions in this PR (except please can we avoid attribute checking? Personally I find it ugly, but maybe that's just me). Most of the methods would be boilerplate that would just pass the operation down to the underlying provider (e.g. add), but for methods not available on the provider we could provide our own implementation.

@marqh
Copy link
Member

marqh commented Feb 13, 2017

I would suggest having a wrapper class that implements a documented interface, which we could initially derive from the current Biggus interface in addition to methods based on the functions in this PR

I don't think that this PR makes a decision on this matter. I think there there is a key design decision that we have to consider
In my view, these helper functions do not take us down a particular path. Perhaps they help to highlight different paths we could follow

I'm content to take these as is and have that discussion as part of a follow on activity

@marqh marqh merged commit 1db8c04 into SciTools:dask Feb 13, 2017
@QuLogic QuLogic modified the milestone: dask Feb 13, 2017
pp-mo added a commit to pp-mo/iris that referenced this pull request Feb 14, 2017
pp-mo added a commit to pp-mo/iris that referenced this pull request Feb 14, 2017
marqh pushed a commit to marqh/iris that referenced this pull request Feb 18, 2017
marqh pushed a commit to marqh/iris that referenced this pull request Feb 23, 2017
marqh pushed a commit to marqh/iris that referenced this pull request Feb 24, 2017
bjlittle pushed a commit to bjlittle/iris that referenced this pull request May 31, 2017
@QuLogic QuLogic modified the milestones: dask, v2.0 Aug 2, 2017
@pp-mo pp-mo deleted the dask_abstract_lazy branch March 18, 2022 15:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants