ENH: Lazy in-place Cube transpose #1983
Conversation
|
Any ideas anyone? Cheers |
We should increase the timeout for nose multiprocessing... |
|
So, can you describe how the inversion would proceed if I invert a 1d dimension coordinate that has a 1d Auxcoord describing it also? |
👍 ... or speed up/remove the offending re-projection tests. |
Certainly, it will fall over as I have called self.coord not self.coords (exception here)... Cheers Thanks both for your interest in this. |
|
This implementation is intentionally constrained. In future it will support multi-dimensional coordinates and multiple coordinates mapped to the same dimension (as/when driven by requirement). |
Thanks both, I hadn't noticed that it was a |
You may want to consider catching this exception and re-raising with a more appropriate error message? |
lib/iris/cube.py
Outdated
| if len(coord_dims) > 1: | ||
| msg = ('Currently multidimensional coordinate inversion is ' | ||
| 'not supported') | ||
| raise RuntimeError(msg) |
Is RuntimeError the best class of exception to be used here?
Perhaps CoordinateMultiDimError would be better? (https://github.com/SciTools/iris/blob/master/lib/iris/exceptions.py#L48)
Thanks @ajdawson, sure.
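To illustrate the suggestion above, here is a minimal sketch of raising a dedicated exception instead of RuntimeError. It does not import iris: the exception class is a local stand-in for the one linked in iris.exceptions (its base class here is chosen only for illustration), and `check_invertible` and the coordinate name `lat2d` are hypothetical.

```python
class CoordinateMultiDimError(ValueError):
    """Local stand-in for iris.exceptions.CoordinateMultiDimError (assumption)."""


def check_invertible(coord_name, coord_dims):
    """Hypothetical helper: reject coordinates spanning more than one dimension."""
    if len(coord_dims) > 1:
        msg = ('Inversion of multidimensional coordinates is not '
               'supported: {!r} spans dimensions {}.')
        raise CoordinateMultiDimError(msg.format(coord_name, tuple(coord_dims)))
    return tuple(coord_dims)


# A 1d coordinate passes straight through.
ok_dims = check_invertible('bar', (0,))

# A 2d coordinate raises the specific error, which callers can catch by type.
try:
    check_invertible('lat2d', (0, 1))
except CoordinateMultiDimError as err:
    caught_message = str(err)
```

A specific exception class lets calling code distinguish "not supported for this kind of coordinate" from unrelated runtime failures without parsing the message.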
@ajdawson I thought I already did? @cpelley: "Certainly, it will fall over as I have called self.coord not self.coords (exception here)..." Perhaps an example might help?
>>> print cube
thingness / (1) (bar: 3; foo: 4)
Dimension coordinates:
bar x -
foo - x
Auxiliary coordinates:
bing x -
>>> cube.invert_dims((0,))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/h05/cpelley/git/iris/lib/iris/cube.py", line 2801, in invert_dims
coord = self.coord(contains_dimension=dim)
File "/home/h05/cpelley/git/iris/lib/iris/cube.py", line 1439, in coord
raise iris.exceptions.CoordinateNotFoundError(msg)
iris.exceptions.CoordinateNotFoundError: 'Expected to find exactly 1 coordinate, but found 2. They were: bar, bing.' |
|
See above modified comment. |
I thought the exception was clear enough along with the traceback that says where it was raised (see above example). |
It doesn't tell you what actually went wrong though. The real problem is that in-place inversion doesn't support multiple coordinates spanning a single dimension. The end-user shouldn't have to know that you called the |
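A rough sketch of the catch-and-re-raise idea, written without iris so it runs on its own. The exception class is a stand-in for iris.exceptions.CoordinateNotFoundError, and `find_single_coord`, `invert_dim`, and the dictionary layout are hypothetical names used only for illustration, not the PR's actual code.

```python
class CoordinateNotFoundError(KeyError):
    """Local stand-in for iris.exceptions.CoordinateNotFoundError (assumption)."""


def find_single_coord(coords_by_dim, dim):
    """Hypothetical lookup that insists on exactly one coordinate per dimension."""
    names = coords_by_dim.get(dim, [])
    if len(names) != 1:
        msg = 'Expected to find exactly 1 coordinate, but found {}. They were: {}.'
        raise CoordinateNotFoundError(msg.format(len(names), ', '.join(names)))
    return names[0]


def invert_dim(coords_by_dim, dim):
    """Hypothetical entry point that translates the low-level lookup error."""
    try:
        return find_single_coord(coords_by_dim, dim)
    except CoordinateNotFoundError:
        # Describe the real limitation rather than the internal lookup failure.
        raise ValueError(
            'In-place inversion does not support multiple coordinates '
            'spanning a single dimension (dimension {}).'.format(dim))


# Mirrors the example cube: 'bar' and 'bing' both map to dimension 0.
coords = {0: ['bar', 'bing'], 1: ['foo']}
found = invert_dim(coords, 1)
```

Calling `invert_dim(coords, 0)` raises the ValueError, so the user sees the limitation of inversion itself rather than a message about a lookup they never made.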
lib/iris/cube.py
Outdated
| coord = self.coord(contains_dimension=dim) | ||
| coord_dims = self.coord_dims(coord) | ||
| if len(coord_dims) > 1: | ||
| msg = ('Currently multidimensional coordinate inversion is ' |
You shouldn't start the sentence with "Currently", try something along the lines of:
Inversion of multidimensional coordinates is not supported.
If the data is not lazy, that would create a copy (i.e. consume twice as much as necessary).
OK. I followed the following structural layout for util within our own project (with the location describing the context of what objects the utility works on): I bring this up only as I would guess that my adding yet another function to Would you be happy for me to follow my proposed structure change (making util a sub-package and putting |
|
I have re-written some of the I'll await a response to my last comment before moving Cheers both. |
What if that was no longer the case? |
Are you suggesting that indexing doesn't copy or that you are open to removing the need to copying data by indexing the cube? |
|
I think I'm tempted to limit this PR to the initial commit (ENH: Lazy cube data transpose - 00dae8) unless there are any objections? |
|
Not from me, we can revisit the inversion issue in a separate PR if required. |
|
I have reverted the cube_inverse idea so we are left with the simple transpose change (I'll put a separate PR up after this PR is merged dealing with the invert_dims). Cheers |
|
I don't see why you are modifying tests about regridding, this change has nothing to do with the regrid code as far as I can see. |
See comment. I had started the work already but complete moving of the offending tests would be too much work to justify for this PR. Shall I update the nosetests timeout and raise an issue on this test class for performance? (for when someone is doing work on the regridding) |
|
I don't know if you need to do anything about it in this PR, it has nothing to do with the changes you have introduced. Changes to these tests should probably be in a separate PR where we can review carefully and have some record of the aim and scope of the changes. |
I only did so because possible solutions were discussed on this ticket. Cheers |
| def test_inplace_data_transpose(self): | ||
| target_id = id(self.cube.lazy_data()) | ||
| self.cube.transpose((0, 1)) | ||
| self.assertEqual(id(self.cube.lazy_data().array), target_id) |
By using .array you're relying on a particular implementation of the transpose algorithm. I'd suggest checking the cube is still lazy with has_lazy_data and then resolving the numbers and checking they actually match.
Also, I'd do an actual transpose rather than deliberately doing a "null" transpose.
Putting those together:
data = np.arange(12).reshape(3, 4)
cube = Cube(biggus.NumpyArrayAdapter(data))
cube.transpose()
self.assertTrue(cube.has_lazy_data())
self.assertArrayEqual(data.T, cube.data)
Thanks @rhattersley.
|
Squash-away my friend! Squash-away! |
08df29a to fd04aa8
|
squashed :) Cheers |
My turn to remember the "what's new" entry! 😉 |
@cpelley In other words... the only thing this needs is a what's new entry and it's ready to merge. |
|
Thanks @rhattersley yes sorry I'm on it. |
|
whatsnew done. Cheers |
| @@ -0,0 +1 @@ | |||
| * The transpose method of a Cube now results in a transposed view of the original data rather than a transposed copy. | |||
I'm not sure this really captures the lazy-loading aspect...?
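For reference, the "view rather than copy" distinction in the entry can be checked with plain NumPy. This sketch covers only that distinction for real (in-memory) arrays; it does not cover the lazy-loading aspect raised here, and the variable names are illustrative.

```python
import numpy as np

data = np.arange(12).reshape(3, 4)

# Transposing yields a view: shape and strides change, the buffer does not.
view = data.transpose()

# An explicit copy allocates a second buffer of the same size.
copy = data.transpose().copy()

shares_view = np.shares_memory(data, view)
shares_copy = np.shares_memory(data, copy)

# Writing through the view is visible in the original array:
# element [0, 1] of the transpose is element [1, 0] of the source.
view[0, 1] = -1
```

Here `shares_view` is True and `shares_copy` is False, which is the memory saving the entry describes.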
|
I have updated the wording of the whatsnew. Cheers |
|
Can we get this into the v1.10 milestone? |
|
@cpelley realistically no - v1.10 is big enough already and we just need to get it out the door now rather than adding new functionality to the milestone that might hold it up. |
Can you please reconsider? This ticket has gone through a review with the only remaining thing flagged by @rhattersley being the whatsnew entry. This was completed over two weeks ago. |
|
I know I'm late to the party, but this change will mean that when transpose() is called on a cube without lazy data, it will then appear to have lazy data, i.e. cube.has_lazy_data() will return true. |
|
Thanks for your interest @djkirkham. This behaviour was expected: iris.tests.unit.cube.test_Cube.Test_transpose.test_lazy_data To modify the behaviour of
>>> arr = biggus.NumpyArrayAdapter(np.arange(4).reshape(2, 2))
>>> cube = iris.cube.Cube(arr)
>>> cube.has_lazy_data()
True
I should say that I don't have a problem with the above (but then I have taken the meaning of iris Is there a preference for doing some type checking in transpose? Cheers |
|
I'm admittedly a novice when it comes to Iris, but my understanding of the meaning of For me, the usefulness of
My preference would be to use |
|
Thanks @djkirkham I'll have a closer look and see what I can do. Cheers |
I don't see how the term "load" comes into it in the context of my example. Right now, I think I agree in principle with what you expect as the return value though. I have pushed a new commit which now does this. Thanks @djkirkham |
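One way to picture the behaviour agreed here (real data stays real after a transpose, lazy data stays lazy) is the sketch below. It is not the commit that was pushed: `LazyArray` and `MiniCube` are hypothetical stand-ins for a deferred array type and for the cube, with only the method name `has_lazy_data` borrowed from the discussion.

```python
import numpy as np


class LazyArray(object):
    """Hypothetical deferred array: wraps a callable that builds an ndarray."""

    def __init__(self, compute):
        self._compute = compute  # zero-argument callable returning an ndarray

    def transpose(self, axes=None):
        # Defer the transpose by wrapping the callable; nothing is computed yet.
        return LazyArray(lambda: self._compute().transpose(axes))

    def ndarray(self):
        return self._compute()


class MiniCube(object):
    """Hypothetical cube holding either a NumPy array or a LazyArray."""

    def __init__(self, data):
        self._data = data

    def has_lazy_data(self):
        return isinstance(self._data, LazyArray)

    def transpose(self, axes=None):
        # Delegate to whichever array type is held, so the laziness of the
        # data after the call matches its laziness before the call.
        self._data = self._data.transpose(axes)

    @property
    def data(self):
        # Touching .data realises lazy data, after which the cube is not lazy.
        if self.has_lazy_data():
            self._data = self._data.ndarray()
        return self._data


real_cube = MiniCube(np.arange(6).reshape(2, 3))
real_cube.transpose()

lazy_cube = MiniCube(LazyArray(lambda: np.arange(6).reshape(2, 3)))
lazy_cube.transpose()
```

After these calls `real_cube.has_lazy_data()` is False and `lazy_cube.has_lazy_data()` is True until `lazy_cube.data` is first accessed.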
|
Thanks @cpelley, the changes look good.
I would say that's just the way it's implemented, rather than the intended meaning of the method. Correct me if I'm wrong, but currently the only way I agree that it would be a good idea to have the intended meaning documented somewhere. |
|
Ping, the test failures are timeouts. Cheers |
|
ping |
|
Please let me know if there's anything I can do to help to get this in. Cheers |
this looks like a done deal to me, thanks @cpelley |
|
Thanks @marqh, this will make a significant difference to us. Appreciated. |
Using 1, also provides in-place dimension inversion (intentionally constrained implementation).- now moving to a new PR after the merge of this one.