Skip to content

Conversation

@pp-mo
Copy link
Member

@pp-mo pp-mo commented Jan 31, 2019

Fix the problem with landsea-masked PPFields, where the dask compute() calls PPDataProxy.__getitem__, which can then make a nested 'compute()' call for the landsea-mask data.

Closes #3237

@pp-mo
Copy link
Member Author

pp-mo commented Feb 1, 2019

Ping @bjlittle rebased !!

N.B. the commits are a bit of a mess : Please do not merge without squashing !!

@pp-mo
Copy link
Member Author

pp-mo commented Feb 4, 2019

Passes at last 🍾
Mergeable ? 😉

@pp-mo pp-mo changed the title Dask landsea masks tempfix Dask landsea masks bugfix Feb 5, 2019
@bjlittle bjlittle self-assigned this Feb 19, 2019
@bjlittle
Copy link
Member

@pp-mo Can you target the v2.2.x branch instead?

It would be good to bank this fix in a 2.2.1 release 😉

@bjlittle bjlittle added this to the v2.1.1 milestone Feb 19, 2019
@pp-mo pp-mo changed the base branch from master to v2.2.x February 19, 2019 17:26
@SciTools-assistant SciTools-assistant added the Blocked: CLA needed See https://scitools.org.uk. Submit the form at: https://scitools.org.uk/cla/v4/form label Feb 19, 2019
@pp-mo pp-mo changed the base branch from v2.2.x to master February 19, 2019 17:27
@SciTools-assistant SciTools-assistant removed the Blocked: CLA needed See https://scitools.org.uk. Submit the form at: https://scitools.org.uk/cla/v4/form label Feb 19, 2019
@pp-mo pp-mo changed the base branch from master to v2.2.x February 19, 2019 17:33
@SciTools-assistant SciTools-assistant added the Blocked: CLA needed See https://scitools.org.uk. Submit the form at: https://scitools.org.uk/cla/v4/form label Feb 19, 2019
@pp-mo pp-mo force-pushed the dask_landsea_masks_tempfix branch from b177ba7 to 8b03a2d Compare February 19, 2019 17:33
@pp-mo pp-mo force-pushed the dask_landsea_masks_tempfix branch from 8b03a2d to b853065 Compare February 19, 2019 17:45
@SciTools-assistant SciTools-assistant removed the Blocked: CLA needed See https://scitools.org.uk. Submit the form at: https://scitools.org.uk/cla/v4/form label Feb 19, 2019
@pp-mo
Copy link
Member Author

pp-mo commented Feb 20, 2019

Re-targetted against v2.2.x, for some reason it wouldn't rebase cleanly, so I skwished + cherry-picked it
( also backed out an unintended change to tests/test_pp_module )
So it had to be force-pushed, but there should be nothing you don't recognise from before.

But all passing so @bjlittle it's go !

@lbdreyer lbdreyer modified the milestones: v2.1.1, v2.2.1 Feb 22, 2019
@pp-mo pp-mo requested a review from bjlittle February 22, 2019 15:49
(field.lbuser[3] % 1000) == 30:
land_mask = field

apply_landmask = None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pp-mo Minor point - The PP loading logic is pretty heavy going (always has been). So to help the reader I think it would be make a stronger association that apply_landmask = None is the else of if (field.raw_lbpack // 10 % 10) == 2: ... would you mind doing that?

Copy link
Member

@bjlittle bjlittle May 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also change apply_landmask to land_mask_field, after all that's what it actually is. The use of apply_landmask subtly hints (to me) that it's a boolean, and its not - it's None or a field

# reference landmask field, so we can't yield them if they
# are encountered *before* the landmask.
# In that case, defer them, and process them all afterwards at
# the end of the file.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pp-mo Nice clarification, thanks!

continue

# Land compressed fields don't have a lbrow and lbnpt.
# Landmask-compressed fields don't have an lbrow and lbnpt.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pp-mo Minor point - Just to match the terminology of your previous comment, could you change this to Land-masked compressed fields..?

data_shape = (field.lbrow, field.lbnpt)
_create_field_data(field, data_shape, land_mask)
_create_field_data(field, (field.lbrow, field.lbnpt),
with_landmask_field=apply_landmask)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pp-mo Could we change this to be:

_create_field_data(field, (field.lbrow, field.lbnpt),
                   land_mask_field=land_mask_field)

field.lbrow, field.lbnpt = mask_shape
_create_field_data(field, (field.lbrow, field.lbnpt), land_mask)
_create_field_data(field, mask_shape,
with_landmask_field=land_mask)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pp-mo Again...

_create_field_data(field, mask_shape,
                   land_mask_field=land_mask)



def _create_field_data(field, data_shape, land_mask):
def _create_field_data(field, data_shape, with_landmask_field=None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pp-mo Change with_landmask_field to land_mask_field ... if you agree to the previous comment requests 😉

If 'with_landmask_field' is passed, it is another field : The landmask
field's data is used as a template for this field's data, determining its
size, shape and the locations of valid (non-missing) datapoints.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pp-mo Minor point - Could we slight reword to something along the lines of...

If 'land_mask_field' is passed (not None), then it is the associated landmask of the `field`. 
The landmask, which is also a field, is used as a template for the `field` to determine its 
size, shape and the locations of valid (non-missing) data-points.

loaded_bytes.dtype,
field.bmdi, land_mask)
field.bmdi,
with_landmask_field)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pp-mo Change to land_mask_field...

field.bmdi)
block_shape = data_shape if 0 not in data_shape else (1, 1)
field.data = as_lazy_data(proxy, chunks=block_shape)
if with_landmask_field is None:
Copy link
Member

@bjlittle bjlittle May 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pp-mo Change to with_landmask_field to land_mask_field throughout...

if n_values > 0:
# Note: data field can have excess values, but not fewer.
result[mask] = values[:n_values]
return result
Copy link
Member

@bjlittle bjlittle May 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pp-mo Nice 👍

The only part that seems to be missing compare to the functionality of _data_bytes_to_shaped_array is setting the fill_value of the result. Is that still relevant? I'm thinking yes....?

If so, then does the proxy.mdi require to be also passed into the delayed calc_array function?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sometime back when I was writing this, I had concluded that the fill-value was not respected by dask anyway, so this wasn't necessary.
I think I found that the da.stack operation does not respect fill values, so in most cases we fail to preserve them on load, i.e. the data arrays come out with default fill-values instead of the BMDI value.

But the suggestions feels appropriate + I think does no harm, so I will put it back in.

@bjlittle
Copy link
Member

@pp-mo Over to you... 😉

@pp-mo
Copy link
Member Author

pp-mo commented May 17, 2019

Thanks for taking this on @bjlittle . Brave 🏆 !
I've now tried to address all your comments, and a tiny bit more tidying besides.
Please re-review.

"""
Modifies a field's ``_data`` attribute either by:
* converting DeferredArrayBytes into a lazy array,
* converting a 'deferred array bytes' tuple into a lazy array,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pp-mo Yup 👍

(field.lbuser[3] // 1000) == 0 and \
(field.lbuser[3] % 1000) == 30:
land_mask = field
land_mask_field = field
Copy link
Member

@bjlittle bjlittle May 20, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pp-mo We're assuming that the land_mask_field is constant, right? i.e. multiple occurrences are simply duplicates

If that's the case, then we're good... and I'm really hoping that's the case here, otherwise it's going to get complicated quickly 😨

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem here, discussed to clarify. Always snaps the first land/sea mask.

@bjlittle
Copy link
Member

@pp-mo Nice one!
Only one last outstanding query, otherwise LGTM 👍

@bjlittle bjlittle merged commit aad43f7 into SciTools:v2.2.x May 20, 2019
@pp-mo
Copy link
Member Author

pp-mo commented May 20, 2019

Magic, thanks @bjlittle !
Huge relief 😅

lbdreyer pushed a commit to lbdreyer/iris that referenced this pull request May 28, 2019
* Fix landsea-mask data access for UM files.

* Review changes.

* Fixed some stale comments  + unused variables.
lbdreyer pushed a commit to lbdreyer/iris that referenced this pull request Jun 3, 2019
* Fix landsea-mask data access for UM files.

* Review changes.

* Fixed some stale comments  + unused variables.
lbdreyer pushed a commit to lbdreyer/iris that referenced this pull request Jun 3, 2019
* Fix landsea-mask data access for UM files.

* Review changes.

* Fixed some stale comments  + unused variables.
@pp-mo pp-mo mentioned this pull request Mar 3, 2021
@pp-mo pp-mo deleted the dask_landsea_masks_tempfix branch March 18, 2022 15:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants