Skip to content

Conversation

@pp-mo
Copy link
Member

@pp-mo pp-mo commented Mar 9, 2017

This is based on the idea of reinstating a common "realisation" call in iris._lazy_data
AKA "as_concrete_data"

NOTE: not merge-ready
This is at the proof-of-principle / what-might-this-look-like stage
Even if adopted, it needs review to see if testing is adequate (and odd test fails fixing)

This version proposes to support both filled and masked results in one routine.
The rationale is that

  • we need solutions for both cube/coord data (masked) and in file formats (filled)
  • we always need to convert a nanarray once realised, as we never use that form itself
  • we don't want to be always using a "standard" masked array form in file formats, as unnecessary use of masked arrays is probably inefficient

@pp-mo
Copy link
Member Author

pp-mo commented Mar 9, 2017

Offline discussion with @bjlittle revealed that this is now competing for usage with the former / possible-new-version "array_nans_to_masked".
We could wind up with "either one, rather than both".

This approach assumes that we actually never want to realise without doing some further conversion (though that may mean just checking it has no NaNs).

@pp-mo
Copy link
Member Author

pp-mo commented Mar 9, 2017

Couple of tests failing at present -- will address....

@pp-mo
Copy link
Member Author

pp-mo commented Mar 9, 2017

I wonder if we should consider an efficiency tweak here, to optionally skip the NaNs test altogether, if the caller "knows" that there can be no missing points.
The test is obviously a big of a drag when there are no NaNs, it has to scan the whole lot to make sure.

NOTE: I think that the use of np.any(np.isnan(x)) may be inherently less efficient than np.ma.is_masked, as you need to create a boolean array. There "might" be a "np.contains_some_nans(x)" call which would not create an auxiliary boolean array and also return True at the first NaN found, but no such thing exists AFAIK.

What to you think @bjlittle ?

@lbdreyer lbdreyer added the dask label Mar 9, 2017
@marqh
Copy link
Member

marqh commented Mar 9, 2017

I'm a little wary of this change, not against, just wary

This is based on the idea of reinstating a common "realisation" call in iris._lazy_data
AKA "as_concrete_data"

this call has a few paths through it, based on entry criteria. I am a little concerned that calling this correctly in places like the PP loader are made more error prone

I can see that the call to data in the cube has similar characteristics to this new function, but other places the replacement isn't so clear.

To help me understand please: is this change fixing identified functional problems or trying to maintain all current behaviour with a different code structure?

@DPeterK
Copy link
Member

DPeterK commented Mar 15, 2017

I like this. Having a common way of calling compute on a lazy array is a good thing, and it will help when we want to restrict how compute is used - for example when controlling the amount of system resource used by default.

@QuLogic QuLogic added this to the dask milestone Mar 15, 2017
@pp-mo
Copy link
Member Author

pp-mo commented Mar 16, 2017

Rebased onto "bjlittle:dask-netcdf-save", i.e. #2411.
( at 4f062a7 )
If that changes again, this will need a further (hopefully simple) rebase.

@lbdreyer
Copy link
Member

Is this ready to be reviewed? If so the first thing that immediately jumps out is that you don't have any tests for as_concrete_data

@lbdreyer
Copy link
Member

As @pp-mo is not available, I am going to see if I can rebase this onto iris/dask branch now that #2411 is merged in

@lbdreyer
Copy link
Member

lbdreyer commented Mar 20, 2017

As @pp-mo is not available, I am going to see if I can rebase this onto iris/dask branch now that #2411 is merged in

It doesn't look like I have permission?? The Allow edits from maintainers. check box possibly only refers to adding commits.

I've created #2447 to replace this PR.

@pp-mo pp-mo deleted the dask_concrete branch March 18, 2022 15:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants