-
Notifications
You must be signed in to change notification settings - Fork 300
fix for MaskedConstant errors #2143
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
I'm not happy about these tests failing. They pass for me, as long as I remove the function 'tests.get_data_path(...)' and use the absolute path instead. However, now that I know that there are tests for util.new_axis that live here, I will add my tests to these files in this PR and close PR#2130. |
lib/iris/util.py
Outdated
| if scalar_coord is not None: | ||
| scalar_coord = src_cube.coord(scalar_coord) | ||
|
|
||
| if isinstance(src_cube.data, np.ma.core.MaskedConstant): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using src_cube.data makes this function load deferred data, which is why you saw the deferred data test fail. Can you formulate this check without loading the data?
| self.assertTrue(new_cube.has_lazy_data()) | ||
| self.assertEqual(new_cube.shape, (1,) + cube.shape) | ||
|
|
||
| def test_masked_unit_array(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like the same test twice in two places, why is it repeated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because I don't know where it is supposed to live. Thanks for your comments @ajdawson, I'll fix these tomorrow.
|
@ajdawson I don't think I can do this check without touching the data at all ever, but I can touch the data only if the cube is scalar, which is what I have tried to do here. |
|
Can I just rewind this a little. I'd like to understand the actual problem here, and the use case this change is catering for. Would you mind providing a bit more description of this? |
|
I think your current approach is incorrect, producing a masked array before the reshape won't fix this. I have an alternative implementation that passes existing tests + your new test, but I'd really like to get some more info about the actual problem first (see previous comment). |
|
I'll get the ball rolling with more detail about my concerns: What you are trying to do is convert a A I'd like to know more about the problems caused down the line by having a non-scalar |
|
@ajdawson I received a support request from someone highlighting that new_axis doesn't add a new axis to the mask along with the single-point array. In the examples he sent me, he was trying to add dimensions to the array and then transpose it. He pointed out that if he wanted to take a single point out of an existing cube, then add dimensions and transpose it, it didn't work because it was a Masked Constant. I am in hospital at the moment, so I'm afraid I can't access the example file that he sent me or even get any more information from him about his purposes, but I can do that on Tuesday if you like. |
|
OK, thanks for the extra information, I can now reproduce the case you are talking about. I'd like to think about it some more, but it seems like your approach is the right option in this case. I think all we need to do is tidy up a few details. We can revisit next week or whenever you are ready. |
|
@ajdawson Hi AJ, I am back at work now and eager to move forward with this issue. What are your thoughts? |
| self.assertTrue(new_cube.has_lazy_data()) | ||
| self.assertEqual(new_cube.shape, (1,) + cube.shape) | ||
|
|
||
| def test_masked_unit_array(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test should be a unit test (in the location you had it previously) not an integration test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ajdawson Why is this a unit test and not an integration test? I thought that because it was testing the behaviour of an iris function with a specific numpy data type, this would make the test not isolated to a single function. (I'm not being pedantic, I'm just trying to understand testing).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my view, this is testing a simple invocation of one function (a code unit) that deosn't rely on a whole stack or anything external, so it is a unit test. Note that the test above it for lazy data is loading some real data from file and using new_axis on it, which is relying on biggus, the iris load mechanism etc. and is integrating several components together and looks a bit like end-user code. You'll notice there is a corresponding unit test for the lazy data case, perhaps compare and contrast for an idea of what is a unit test and what is an integration test, although ultimately the line is fuzzy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you.
lib/iris/util.py
Outdated
| if scalar_coord is not None: | ||
| scalar_coord = src_cube.coord(scalar_coord) | ||
|
|
||
| if src_cube.shape is (): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My suggestion is to alter the actual slicing if we have a MaskedConstant rather than trying to correct for it beforehand. I tentatively suggest something like:
if src_cube.has_lazy_data():
new_cube = iris.cube.Cube(src_cube.lazy_data()[None])
else:
if isinstance(src_cube.data, ma.core.MaskedConstant):
new_data = ma.array([np.nan], mask=[True])
else:
new_data = src_cube.data[None]
new_cube = iris.cube.Cube(new_data)with the following caveat: Do we know if this does the right thing when we have lazy data? I haven't checked. I think whatever implementation we choose we should have a test that is the equivalent of your new test but for a lazy array.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ajdawson This all makes sense, I am happy to implement these changes.
I am not sure if I've tried this with lazy data. I assume if I write a test using a lazy array this would also just be a unit test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so yes
|
|
||
| def test_masked_unit_array(self): | ||
| cube = tests.stock.simple_3d_mask() | ||
| test_cube = cube[0][0][0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The accepted style for this is test_cube = cube[0, 0, 0]
lib/iris/util.py
Outdated
|
|
||
| if src_cube.shape is (): | ||
| if isinstance(src_cube.data, np.ma.core.MaskedConstant): | ||
| src_cube.data = np.ma.array(data=[0], mask=[1]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please use ma.array(..., we have an import numpy.ma as ma in this file so we don't need the leading np.. This should be true everywhere in iris.
lib/iris/util.py
Outdated
|
|
||
| if src_cube.shape is (): | ||
| if isinstance(src_cube.data, np.ma.core.MaskedConstant): | ||
| src_cube.data = np.ma.array(data=[0], mask=[1]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use np.nan as the data value here? We don;t know what it should be since a MaskedConstant has no value really, so not-a-number seems reasonable. I don't really like this part but I'm not sure if we have another choice...
ajdawson
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added my current thoughts on how to progress. Feel free to discuss these before committing to anything.
lib/iris/util.py
Outdated
| scalar_coord = src_cube.coord(scalar_coord) | ||
|
|
||
| if src_cube.shape is (): | ||
| if isinstance(src_cube.data, np.ma.core.MaskedConstant): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment for np.ma.core.MaskedConstant -> ma.core.MaskedConstant.
|
@ajdawson Does the lazy data test look OK to you? |
|
I think the test is OK, but it fails: https://travis-ci.org/SciTools/iris/jobs/161862951#L5040 |
|
Actually I think that might be my fault. I didn't check the dimensions of the lazy array before indexing into it to get a single point. Let me try again. |
|
@ajdawson OK, this has made me a bit confused. The test is failing because the lazy array doesn't have a mask on it, so when I try and check the dimensions of the mask it doesn't know what to check. I can add a mask to the array, but will it then still be lazy? Also, what happens if the single-point array we are passing to new_axis is both lazy and a Masked Constant? Is that a thing? If it is, then it will bypass the fix, right? |
Yes, I think that can happen. I feel like this should really be fixed in biggus, and it might not be possible to make a fix in Iris anyway. It seems to me that there's no way to know whether realising the data would result in a MaskedConstant without actually realising it, so there's no way to know when to apply the fix. I think biggus should be fixed so that |
|
@djkirkham You might be right. |
|
OK, if this is out of our direct control for deferred data I am happy for this fix to only apply to loaded data (i.e. as it currently is), with the condition that an issue is opened on biggus, and is picked up again in iris when possible. You'll need to remove the test for lazy data since it fails, and then I think this can be accepted. |
|
@ajdawson @djkirkham Either of you know why these tests are failing now? |
|
It's because there is a new version of cf_units which is being picked up. See #2144 which will fix these errors |
|
Actually, most of the errors are because the current version of cf_units has a bug. We're releasing a new version now which will fix all but one of the errors, and #2144 will fix the remaining one |
|
Thanks Dan |
|
@marqh Dan informs me that the tests for this PR should pass now that his own changes have been merged. Could you please re-run the Travis CI tests here? Thanks. |
|
Did you forget to rebase (and thus pull in required changes)? Tests are still failing. |
|
@ajdawson Yes I did. |
3de58da to
c73a031
Compare
|
@ajdawson OK, so I rebased onto upstream master, but actually Dan's fix was merged into v1.10.x so the tests are still failing. Have spoken to Pete about it, who is waiting to speak to Mark about it, and then maybe somebody might speak to me about it again... |
|
Err, OK! What usually needs to happen is a merge of the v1.10.x branch into master to pickup bug-fix changes from there. Once that is done you can rebase this onto upstream/master. IWhen you can eventually do this, you may as well squash these commits into a single commit while you are at it. |
c73a031 to
db1ddb8
Compare
|
@ajdawson Are you happy to merge? |
|
Yep! Thanks @corinnebosley |
|
@ajdawson Thank you! |
iris.util.new_axis cannot handle masked constants, so this is a check to turn masked constants into scalar masked arrays, at which point they can be handled correctly.
This fix is checked by a test in PR#2130: #2130