Generic lazy data handling. #2356

pp-mo · 2017-02-10T13:38:53Z

DO NOT MERGE
Still in development.

This is a starting point for replacing all of Iris' explicit use of biggus concepts with something more detached from the implementation.
(Which I think should be better than explicitly referencing dask everywhere instead.)

At present, nothing here recognises that realisation with 'as_concrete_array' might need to distinguish whether the result should be a masked array (as the biggus API does).
For the cube.data property (getter), this is the existing Iris behaviour anyway.
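
One way such a masked/unmasked distinction could look is sketched below. This is not the code from this PR: the `masked` keyword, the attribute-based `is_lazy_data` test, and the convention that NaN marks a missing point are all assumptions made purely for illustration.

```python
import numpy as np
import numpy.ma as ma


def is_lazy_data(data):
    # Assumption: anything exposing a compute() method counts as lazy
    # (dask arrays have one); the PR's real test may differ.
    return hasattr(data, "compute")


def as_concrete_array(data, masked=False):
    # Realise lazy content first; real (in-memory) data passes straight through.
    if is_lazy_data(data):
        data = data.compute()
    if masked:
        # Assumption: NaN (or inf) stands for a missing point.
        return ma.masked_invalid(data)
    # Plain ndarray result; any existing mask is discarded.
    return np.asarray(data)
```

A caller that needs masked semantics would then ask for them explicitly, e.g. `as_concrete_array(lazy, masked=True)`, while the default keeps returning a plain array.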

marqh · 2017-02-13T11:05:59Z

lib/iris/_lazy_data.py

+"""
+Routines for lazy data handling.
+
+To avoid replicating implementation-dependent test and conversion code.


I would rather this was not part of the module docstring; I understand the sentiment, but it doesn't seem like useful long-term documentation.

I would prefer

"Routines for lazy data handling. Supporting the Cube's lazy_data interface."

or something similarly positive.

I see what you mean.
However, this module is not particularly tied to the cube.data interface as such.
Could we have something like "This module provides lazy array handling, independent of a specific implementation such as dask or biggus." @marqh ?
I'm not so keen on the specific namechecking of dask/biggus, but I think that makes it much clearer what we are actually talking about.

djkirkham · 2017-02-13T11:35:51Z

It seems to me that this PR would lead us down a route which is gonna cause problems later on. I think that if we want to be agnostic about the underlying implementation of the lazy data provider (i.e. Biggus or Dask), then having the Cube._mydata attribute set to an instance of that provider weds us to the implementation. For example, currently if you do cube1 + cube2, that leads to calling cube1._mydata + cube2._mydata, so you have an implicit dependency on the type of _mydata having an __add__ method. Obviously it's highly likely that the implementation will have that method, but there are surely other methods which won't be present in all implementations.

I would suggest having a wrapper class that implements a documented interface, which we could initially derive from the current Biggus interface in addition to methods based on the functions in this PR (except please can we avoid attribute checking? Personally I find it ugly, but maybe that's just me). Most of the methods would be boilerplate that would just pass the operation down to the underlying provider (e.g. add), but for methods not available on the provider we could provide our own implementation.
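
The wrapper idea above can be sketched roughly as follows. Everything here is hypothetical: `ToyLazyArray` is a stand-in backend (neither biggus nor dask), and `LazyData`, `realise` and `mean` are invented names, not an interface proposed in this PR.

```python
class ToyLazyArray:
    """Hypothetical backend: defers work until compute() is called."""

    def __init__(self, thunk):
        self._thunk = thunk  # zero-argument callable returning a list of numbers

    def __add__(self, other):
        # Build a new deferred computation; nothing is evaluated here.
        return ToyLazyArray(
            lambda: [a + b for a, b in zip(self._thunk(), other._thunk())]
        )

    def compute(self):
        return self._thunk()


class LazyData:
    """Hypothetical wrapper giving one documented interface over any backend."""

    def __init__(self, provider):
        self._provider = provider

    def __add__(self, other):
        # Boilerplate: pass the operation down to the underlying provider.
        if not isinstance(other, LazyData):
            return NotImplemented
        return LazyData(self._provider + other._provider)

    def realise(self):
        # The single place that knows how this backend is made concrete.
        return self._provider.compute()

    def mean(self):
        # The toy backend has no mean(), so the wrapper supplies its own
        # (this simple version realises the data to do so).
        values = self.realise()
        return sum(values) / len(values)


a = LazyData(ToyLazyArray(lambda: [1, 2, 3]))
b = LazyData(ToyLazyArray(lambda: [10, 20, 30]))
total = a + b            # still lazy: no addition has happened yet
print(total.realise())   # [11, 22, 33]
```

With this shape, code such as cube1 + cube2 would depend only on the wrapper's documented methods, and swapping the backend would mean changing the wrapper rather than every caller.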

marqh · 2017-02-13T13:21:31Z

I would suggest having a wrapper class that implements a documented interface, which we could initially derive from the current Biggus interface in addition to methods based on the functions in this PR

I don't think that this PR makes a decision on this matter. I think there is a key design decision that we have to consider.
In my view, these helper functions do not take us down a particular path. Perhaps they help to highlight different paths we could follow.

I'm content to take these as is and have that discussion as part of a follow-on activity.

Generic lazy data handling.

59a63fb

pp-mo added the Status: Work in Progress label Feb 10, 2017

This was referenced Feb 10, 2017

Abstract Iris lazy operations #2344

Closed

Dask abstract lazy cubedata #2365

Merged

marqh reviewed Feb 13, 2017

marqh merged commit 1db8c04 into SciTools:dask Feb 13, 2017

marqh removed the Status: Work in Progress label Feb 13, 2017

QuLogic modified the milestone: dask Feb 13, 2017

pp-mo added a commit to pp-mo/iris that referenced this pull request Feb 14, 2017

Generic lazy data handling. (SciTools#2356)

7d64ab9

pp-mo added a commit to pp-mo/iris that referenced this pull request Feb 14, 2017

Generic lazy data handling. (SciTools#2356)

b5fc329

marqh pushed a commit to marqh/iris that referenced this pull request Feb 18, 2017

Generic lazy data handling. (SciTools#2356)

36d7781

marqh pushed a commit to marqh/iris that referenced this pull request Feb 23, 2017

Generic lazy data handling. (SciTools#2356)

5e1fd03

marqh pushed a commit to marqh/iris that referenced this pull request Feb 24, 2017

Generic lazy data handling. (SciTools#2356)

880574d

bjlittle pushed a commit to bjlittle/iris that referenced this pull request May 31, 2017

Generic lazy data handling. (SciTools#2356)

d2e0849

QuLogic modified the milestones: dask, v2.0 Aug 2, 2017

pp-mo deleted the dask_abstract_lazy branch March 18, 2022 15:37
