Dask landsea masks bugfix #3255
26c5bd2 to
d174d3b
|
Ping @bjlittle rebased !! N.B. the commits are a bit of a mess : Please do not merge without squashing !! |
|
Passes at last 🍾 |
|
@pp-mo Can you target the It would be good to bank this fix in a |
b177ba7 to
8b03a2d
8b03a2d to
b853065
|
Re-targeted against v2.2.x. For some reason it wouldn't rebase cleanly, so I squashed + cherry-picked it. But all passing, so @bjlittle it's go! |
lib/iris/fileformats/pp.py
Outdated
| (field.lbuser[3] % 1000) == 30: | ||
| land_mask = field | ||
|
|
||
| apply_landmask = None |
@pp-mo Minor point - The PP loading logic is pretty heavy going (always has been). So, to help the reader, I think it would be better to make a stronger association that `apply_landmask = None` is the else of `if (field.raw_lbpack // 10 % 10) == 2:` ... would you mind doing that?
Also change apply_landmask to land_mask_field; after all, that's what it actually is. The use of apply_landmask subtly hints (to me) that it's a boolean, and it's not - it's None or a field.
| # reference landmask field, so we can't yield them if they | ||
| # are encountered *before* the landmask. | ||
| # In that case, defer them, and process them all afterwards at | ||
| # the end of the file. |
@pp-mo Nice clarification, thanks!
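For readers following the deferral logic described in that comment, here is a minimal pure-Python sketch of the idea. It is not the Iris implementation: `iter_fields`, `is_landmask` and `is_landmask_compressed` are hypothetical names, the fields are stand-in objects, and raising an error when no landmask is ever found is this sketch's own choice. Only the two tests on `lbuser[3]` and `raw_lbpack` are taken from the diff and review above.

```python
def is_landmask(field):
    # Test quoted in the diff: STASH section 0, item 30.
    return (field.lbuser[3] // 1000) == 0 and (field.lbuser[3] % 1000) == 30


def is_landmask_compressed(field):
    # Test quoted in the review: compression digit of the packing code is 2.
    return (field.raw_lbpack // 10 % 10) == 2


def iter_fields(fields):
    """Yield (field, land_mask_field) pairs; the second item is None or a field."""
    land_mask_field = None
    deferred = []
    for field in fields:
        if land_mask_field is None and is_landmask(field):
            # Keep the first landmask seen; later ones are not used as the reference.
            land_mask_field = field
        if is_landmask_compressed(field):
            if land_mask_field is None:
                # No reference landmask yet: defer until the end of the file.
                deferred.append(field)
                continue
            yield field, land_mask_field
        else:
            yield field, None
    if deferred and land_mask_field is None:
        # Assumption of this sketch only: treat a missing landmask as an error.
        raise ValueError("landmask-compressed fields found, but no landmask field")
    for field in deferred:
        yield field, land_mask_field
```

Passing `None` explicitly for uncompressed fields keeps the "None or a field" meaning of `land_mask_field` visible at every yield.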
lib/iris/fileformats/pp.py
Outdated
| continue | ||
|
|
||
| # Land compressed fields don't have a lbrow and lbnpt. | ||
| # Landmask-compressed fields don't have an lbrow and lbnpt. |
lib/iris/fileformats/pp.py
Outdated
| data_shape = (field.lbrow, field.lbnpt) | ||
| _create_field_data(field, data_shape, land_mask) | ||
| _create_field_data(field, (field.lbrow, field.lbnpt), | ||
| with_landmask_field=apply_landmask) |
@pp-mo Could we change this to be:
_create_field_data(field, (field.lbrow, field.lbnpt),
land_mask_field=land_mask_field)
lib/iris/fileformats/pp.py
Outdated
| field.lbrow, field.lbnpt = mask_shape | ||
| _create_field_data(field, (field.lbrow, field.lbnpt), land_mask) | ||
| _create_field_data(field, mask_shape, | ||
| with_landmask_field=land_mask) |
@pp-mo Again...
_create_field_data(field, mask_shape,
land_mask_field=land_mask)
lib/iris/fileformats/pp.py
Outdated
|
|
||
|
|
||
| def _create_field_data(field, data_shape, land_mask): | ||
| def _create_field_data(field, data_shape, with_landmask_field=None): |
@pp-mo Change with_landmask_field to land_mask_field ... if you agree to the previous comment requests 😉
lib/iris/fileformats/pp.py
Outdated
| If 'with_landmask_field' is passed, it is another field : The landmask | ||
| field's data is used as a template for this field's data, determining its | ||
| size, shape and the locations of valid (non-missing) datapoints. |
@pp-mo Minor point - Could we slightly reword to something along the lines of...
If 'land_mask_field' is passed (not None), then it is the associated landmask of the `field`.
The landmask, which is also a field, is used as a template for the `field` to determine its
size, shape and the locations of valid (non-missing) data-points.
lib/iris/fileformats/pp.py
Outdated
| loaded_bytes.dtype, | ||
| field.bmdi, land_mask) | ||
| field.bmdi, | ||
| with_landmask_field) |
@pp-mo Change to land_mask_field...
lib/iris/fileformats/pp.py
Outdated
| field.bmdi) | ||
| block_shape = data_shape if 0 not in data_shape else (1, 1) | ||
| field.data = as_lazy_data(proxy, chunks=block_shape) | ||
| if with_landmask_field is None: |
@pp-mo Change with_landmask_field to land_mask_field throughout...
| if n_values > 0: | ||
| # Note: data field can have excess values, but not fewer. | ||
| result[mask] = values[:n_values] | ||
| return result |
@pp-mo Nice 👍
The only part that seems to be missing compared to the functionality of _data_bytes_to_shaped_array is setting the fill_value of the result. Is that still relevant? I'm thinking yes....?
If so, does proxy.mdi also need to be passed into the delayed calc_array function?
Some time back, when I was writing this, I concluded that the fill-value was not respected by dask anyway, so this wasn't necessary.
I think I found that the da.stack operation does not respect fill values, so in most cases we fail to preserve them on load, i.e. the data arrays come out with default fill-values instead of the BMDI value.
But the suggestion feels appropriate + I think does no harm, so I will put it back in.
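To make the fill-value point concrete, here is a minimal NumPy sketch of expanding landmask-compressed values onto the full grid and setting the fill value on the result. `calc_array` is the name used in the review, but the signature below (including the `mdi` argument) is an assumption for illustration, not the code in the PR.

```python
import numpy as np


def calc_array(mask, values, mdi):
    """Scatter packed land-point values onto the grid given by a boolean mask."""
    mask = np.asarray(mask, dtype=bool)
    values = np.asarray(values)
    # Start with every point masked; only land points are filled in below.
    result = np.ma.masked_all(mask.shape, dtype=values.dtype)
    n_values = int(np.count_nonzero(mask))
    if n_values > 0:
        # Note: data field can have excess values, but not fewer.
        result[mask] = values[:n_values]
    # Record the missing-data indicator as the masked array's fill value.
    result.fill_value = mdi
    return result
```

Whether that fill value survives later dask operations is a separate question, as noted in the reply above; the sketch only shows where it would be set.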
|
@pp-mo Over to you... 😉 |
|
Thanks for taking this on @bjlittle . Brave 🏆 ! |
| """ | ||
| Modifies a field's ``_data`` attribute either by: | ||
| * converting DeferredArrayBytes into a lazy array, | ||
| * converting a 'deferred array bytes' tuple into a lazy array, |
@pp-mo Yup 👍
| (field.lbuser[3] // 1000) == 0 and \ | ||
| (field.lbuser[3] % 1000) == 30: | ||
| land_mask = field | ||
| land_mask_field = field |
@pp-mo We're assuming that the land_mask_field is constant, right? i.e. multiple occurrences are simply duplicates
If that's the case, then we're good... and I'm really hoping that's the case here, otherwise it's going to get complicated quickly 😨
No problem here, discussed to clarify. Always snaps the first land/sea mask.
|
@pp-mo Nice one! |
|
Magic, thanks @bjlittle ! |
* Fix landsea-mask data access for UM files.
* Review changes.
* Fixed some stale comments + unused variables.
Fix the problem with landsea-masked PPFields, where the dask compute() calls PPDataProxy.__getitem__, which can then make a nested 'compute()' call for the landsea-mask data.
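One way to avoid a nested `compute()` of this kind is to pass the lazy landmask into a delayed function, so that dask evaluates the mask and the packed values within a single graph. The sketch below illustrates that pattern only. `lazy_landmask_data`, `expand_onto_mask` and `fetch_values` are hypothetical names, sea points are set to NaN rather than masked to keep the example short, and none of this is claimed to match the Iris code.

```python
import dask
import dask.array as da
import numpy as np


def expand_onto_mask(mask, values):
    # By the time this runs, dask has already turned 'mask' into a real array.
    mask = np.asarray(mask, dtype=bool)
    result = np.full(mask.shape, np.nan)
    result[mask] = np.asarray(values)[: np.count_nonzero(mask)]
    return result


def lazy_landmask_data(lazy_mask, fetch_values, dtype):
    """Build a lazy array whose graph includes the landmask as a dependency."""
    # Reading the packed values is itself deferred.
    delayed_values = dask.delayed(fetch_values)()
    # Dask collections passed to a delayed call are computed as part of the
    # same graph, so nothing here calls compute() from inside a task.
    delayed_result = dask.delayed(expand_onto_mask)(lazy_mask, delayed_values)
    return da.from_delayed(delayed_result, shape=lazy_mask.shape, dtype=dtype)
```

A single `compute()` on the returned array then resolves both the mask and the data.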
Closes #3237