ENH: Lazy in-place Cube transpose #1983

cpelley · 2016-04-25T14:01:09Z

Simply utilises the capabilities of biggus to provide a transposed version of the data rather than making a copy (necessary with the sizes of data we are working with).
~~Using 1, also provides in-place dimension inversion (intentionally constrained implementation).~~ - now moving to a new PR after the merge of this one.

cpelley · 2016-04-25T15:29:02Z

iris.tests.experimental.regrid.test_regrid_area_weighted_rectilinear_src_and_grid.TestAreaWeightedRegrid are failling on travis but I'm not seeing any issue locally :(

Any ideas anyone?

Chers

ajdawson · 2016-04-26T08:36:02Z

failling on travis but I'm not seeing any issue locally :( Any ideas anyone?

We should increase the timeout for nose multiprocessing...

ajdawson · 2016-04-26T08:36:51Z

So, can you describe how the inversion would proceed if I invert a 1d dimension coordinate that has a 1d Auxcoord describing it also?

rhattersley · 2016-04-26T08:58:52Z

We should increase the timeout for nose multiprocessing...

👍 ... or speed up/remove the offending re-projection tests.

cpelley · 2016-04-26T09:05:24Z

So, can you describe how the inversion would proceed if I invert a 1d dimension coordinate that has a 1d Auxcoord describing it also?

Certainly, it will fall over as I have called self.coord not self.coords (exception here).
I can add a test if you really want me to but I see no benefit to adding duplicate testing for Cube.coord.

Cheers

Thanks both for your interest on this.

cpelley · 2016-04-26T09:07:04Z

This implementation is intentionally constrained. In future it will support multi-dimensional coordinates and multiple coordinates mapped to the same dimension (as/when driven by requirement).

cpelley · 2016-04-26T09:11:01Z

We should increase the timeout for nose multiprocessing...

👍 ... or speed up/remove the offending re-projection tests.

Thanks both, I hadn't noticed that it was a TimedOutException raised. I'll look at how how easy they might be to migrate convert to true unittests.

ajdawson · 2016-04-26T09:11:21Z

This implementation is intentionally constrained. In future it will support multi-dimensional coordinates and multiple coordinates mapped to the same dimension (as/when driven by requirement).

~~I wanted to know what happens in the case I specified. Is an exception raised? Where is it raised from?~~ (comments appeared out-of order, I saw the quoted one before the other arrived)

Certainly, it will fall over as I have called self.coord not self.coords

You may want to consider catching this exception and re-raising with a more appropriate error message?

ajdawson · 2016-04-26T09:48:06Z

lib/iris/cube.py

+            if len(coord_dims) > 1:
+                msg = ('Currently multidimensional coordinate inversion is '
+                       'not supported')
+                raise RuntimeError(msg)


Is RuntimeError the best class of exception to be used here?

Perhaps CoordinateMultiDimError would be better? (https://github.com/SciTools/iris/blob/master/lib/iris/exceptions.py#L48)

Thanks @ajdawson, sure.

cpelley · 2016-04-26T09:56:00Z

I wanted to know what happens in the case I specified. Is an exception raised? Where is it raised from?

@ajdawson I thought I already did?

@cpelley: "Certainly, it will fall over as I have called self.coord not self.coords (exception here)..."

Perhaps an example might help?

>>> print cube
thingness / (1)                     (bar: 3; foo: 4)
     Dimension coordinates:
          bar                           x       -
          foo                           -       x
     Auxiliary coordinates:
          bing                          x       -
>>> cube.invert_dims((0,))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/h05/cpelley/git/iris/lib/iris/cube.py", line 2801, in invert_dims
    coord = self.coord(contains_dimension=dim)
  File "/home/h05/cpelley/git/iris/lib/iris/cube.py", line 1439, in coord
    raise iris.exceptions.CoordinateNotFoundError(msg)
iris.exceptions.CoordinateNotFoundError: 'Expected to find exactly 1 coordinate, but found 2. They were: bar, bing.'

ajdawson · 2016-04-26T09:56:44Z

See above modified comment.

cpelley · 2016-04-26T09:59:00Z

You may want to consider catching this exception and re-raising with a more appropriate error message?

I thought the exception was clear enough along with the traceback that says where it was raised (see above example).

ajdawson · 2016-04-26T10:37:52Z

I thought the exception was clear enough

It doesn't tell you what actually went wrong though. The real problem is that in-place inversion doesn't support multiple coordinates spanning a single dimension. The end-user shouldn't have to know that you called the coord method to determine this (or know anything about the functionality and exceptions raised by the coord method). You should raise an exception that describes the immediate problem, in this case something like:

Inversion of dimensions that are spanned by multiple coordinates is not currently supported.
Dimension {} is spanned by ...

ajdawson · 2016-04-26T10:38:57Z

lib/iris/cube.py

+            coord = self.coord(contains_dimension=dim)
+            coord_dims = self.coord_dims(coord)
+            if len(coord_dims) > 1:
+                msg = ('Currently multidimensional coordinate inversion is '


You shouldn't start the sentence with "Currently", try something along the lines of:

Inversion of multidimensional coordinates is not supported.

cpelley · 2016-04-26T14:56:10Z

What's stopping you from indexing your cube with a[::-1] slice?

If the data is not lazy, that would create a copy (i.e. consume twice as much as necessary).

...but invert_dims seems too niche to be on the Cube

OK. I followed the following structural layout for util within our own project (with the location describing the context of what objects the utility works on):
util.coords, util.ndarray, util.cube etc.

I bring this up only as I would guess that my adding yet another function to util will bring up the debate of the complexity and length of utils.

Would you be happy for me to put follow my proposed structure change (making util a sub-package and putting invert_dims within a cube module?
(i.e. iris/util/cube/invert_dims)

cpelley · 2016-04-26T15:05:07Z

I have re-written some of the TestAreaWeightedRegrid tests over in the unit testing area, however, those causing a TimedOutException probably need removal and completely new true unit tests written. I don't think this is trivial enough for this PR as it would require evaluation of new numbers for the area weighted regrid results by hand.

I'll await a response to my last comment before moving invert_dims out of the Cube class.

Cheers both.

rhattersley · 2016-04-26T15:05:57Z

What's stopping you from indexing your cube with a[::-1] slice?

If the data is not lazy, that would create a copy (i.e. consume twice as much as necessary).

What if that was no longer the case?

cpelley · 2016-04-26T15:48:47Z

@rhattersley: What if that was no longer the case?

Are you suggesting that indexing doesn't copy or that you are open to removing the need to copying data by indexing the cube?

cpelley · 2016-04-26T15:55:54Z

I think I'm tempted to limit this PR to the initial commit (ENH: Lazy cube data transpose - 00dae8) unless there are any objections?

ajdawson · 2016-04-26T16:53:23Z

Not from me, we can revisit the inversion issue in a separate PR if required.

cpelley · 2016-04-27T06:57:08Z

I have reverted the cube_inverse idea so we are left with the simple transpose change (I'll put a separate PR up after this PR is merged dealing with the invert_dims).

Cheers

ajdawson · 2016-04-27T07:48:41Z

I don't see why you are modifying tests about regridding, this change has nothing to do with the regrid code as far as I can see.

cpelley · 2016-04-27T08:38:02Z

I don't see why you are modifying tests about regridding, this change has nothing to do with the regrid code as far as I can see.

See comment. I had started the work already but complete moving of the offending tests would be too much work to justify for this PR. Shall I update the nosetests timeout and raise an issue on this test class for performance? (for when someone is doing work on the regridding)

ajdawson · 2016-04-27T08:43:27Z

I don't know if you need to do anything about it in this PR, it has nothing to do with the changes you have introduced. Changes to these tests should probably be in a separate PR where we can review carefully and have some record of the aim and scope of the changes.

cpelley · 2016-04-27T09:19:28Z

Changes to these tests should probably be in a separate PR...

I only did so because possible solutions were discussed on this ticket.
Removed these changes now and created a new PR (#1986 to continue the issue of test failure of the regrid on travis).
I'll squash when your happy.

Cheers

rhattersley · 2016-04-27T10:02:12Z

lib/iris/tests/unit/cube/test_Cube.py

+    def test_inplace_data_transpose(self):
+        target_id = id(self.cube.lazy_data())
+        self.cube.transpose((0, 1))
+        self.assertEqual(id(self.cube.lazy_data().array), target_id)


By using .array you're relying on a particular implementation of the transpose algorithm. I'd suggest checking the cube is still lazy with has_lazy_data and then resolving the numbers and checking they actually match.

Also, I'd do an actual transpose rather than deliberately doing a "null" transpose.

Putting those together:

data = np.arange(12).reshape(3, 4) cube = Cube(biggus.NumpyArrayAdapter(data)) cube.transpose() self.assertTrue(cube.has_lazy_data()) self.assertArrayEqual(data.T, cube.data)

Thanks @rhattersley.

rhattersley · 2016-05-09T12:50:49Z

Squash-away my friend! Squash-away!

cpelley · 2016-05-09T14:29:46Z

squashed :)

Cheers

rhattersley · 2016-05-09T15:16:28Z

ooh just remembered, don't we need to update what's new for this?

My turn to remember the "what's new" entry! 😉

rhattersley · 2016-05-12T07:23:18Z

ooh just remembered, don't we need to update what's new for this?

My turn to remember the "what's new" entry! 😉

@cpelley In other words... the only thing this needs is a what's new entry and it's ready to merge.

cpelley · 2016-05-12T13:06:38Z

Thanks @rhattersley yes sorry I'm on it.

cpelley · 2016-05-20T10:16:16Z

whatsnew done.

Cheers

rhattersley · 2016-05-20T14:06:40Z

docs/iris/src/whatsnew/contributions_1.10/newfeature_2016-May-20_lazy_cube_transpose.txt

@@ -0,0 +1 @@
+* The transpose method of a Cube now results in a transposed view of the original data rather than a transposed copy.


I'm not sure this really captures the lazy-loading aspect...?

cpelley · 2016-06-10T08:52:57Z

I have updated the wording of the whatsnew.

Cheers

cpelley · 2016-06-24T10:29:31Z

Can we get this into the v1.10 milestone?

DPeterK · 2016-06-27T11:28:32Z

@cpelley realistically no - v1.10 is big enough already and we just need to get it out the door now rather than adding new functionality to the milestone that might hold it up.

cpelley · 2016-06-27T11:36:58Z

@cpelley realistically no - v1.10 is big enough already and we just need to get it out the door now rather than adding new functionality to the milestone that might hold it up.

Can you please reconsider? This ticket has gone through a review with the only remaining thing flagged by @rhattersley being the whatsnew entry. This was completed over two weeks ago.

djkirkham · 2016-08-09T07:48:24Z

I know I'm late to the party, but this change will mean that when transpose() is called on a cube without lazy data, it will then appear to have lazy data, i.e. cube.has_lazy_data() will return true.

cpelley · 2016-08-09T09:53:30Z

Thanks for your interest @djkirkham. This behaviour was expected: iris.tests.unit.cube.test_Cube.Test_transpose.test_lazy_data

To modify the behaviour of has_lazy_data (i.e. change the implied meaning of this method) I think is outside the scope of this ticket.
Take for example the following simple example that based on your comment you might also have a problem with?

>>> arr = biggus.NumpyArrayAdapter(np.arange(4).reshape(2, 2))
>>> cube = iris.cube.Cube(arr)
>>> cube.has_lazy_data()
True

I should say that I don't have a problem with the above (but then I have taken the meaning of iris lazy_data to be the behaviour of has_lazy_data, which I think is perfectly reasonable especially given that there is no docstring. Initialising a cube with a biggus array is a very use case in case.

Is there a preference for doing some type checking in transpose?
This would be contrary to current iris behaviour in similar situations like shown in my example above.

Cheers

djkirkham · 2016-08-09T10:30:08Z

I'm admittedly a novice when it comes to Iris, but my understanding of the meaning of has_lazy_data() is "has the data been loaded from its initial data source?". Thus, I don't have a problem with your example, since the data hasn't been "loaded" from the data source, which happens to be a numpy array.

For me, the usefulness of has_lazy_data() comes from the ability to use it to check whether cube operations load in the data. With the current change that ability is lost with respect to transpose.

Is there a preference for doing some type checking in transpose?

My preference would be to use self.lazy_data() if self.has_lazy_data() returns True, and use self.data otherwise, although maybe that defeats the object of this PR.

cpelley · 2016-08-09T12:25:09Z

Thanks @djkirkham I'll have a closer look and see what I can do.

Cheers

cpelley · 2016-08-09T14:04:37Z

...Thus, I don't have a problem with your example, since the data hasn't been "loaded" from the data source, which happens to be a numpy array.

I don't see how the term "load" comes into it in the context of my example. Right now, has_lazy_data simply means "is biggus array". I suggest we add documentation to has_lazy_data to clarify that (new ticket).

I think I agree in principle with what you expect as the return value though. I have pushed a new commit which now does this.

Thanks @djkirkham

djkirkham · 2016-08-09T15:02:19Z

Thanks @cpelley, the changes look good.

Right now, has_lazy_data simply means "is biggus array".

I would say that's just the way it's implemented, rather than the intended meaning of the method. Correct me if I'm wrong, but currently the only way cube._my_data ever gets assigned a biggus array is if it's passed in by the user, and if the data in that array is ever loaded, cube._my_data gets assigned a NumPy array.

I agree that it would be a good idea to have the intended meaning documented somewhere.

cpelley · 2016-08-16T07:56:59Z

Ping, the test failures are timeouts.

Cheers

cpelley · 2016-08-23T13:59:15Z

ping

cpelley · 2016-11-29T11:46:32Z

Please let me know if there's anything I can do to help to get this in.
This would provide very useful capability for CMIP6 due to the sheer size of data involved.

Cheers

marqh · 2016-12-07T13:47:57Z

Please let me know if there's anything I can do to help to get this in.
This would provide very useful capability for CMIP6 due to the sheer size of data involved.

this looks like a done deal to me, thanks @cpelley

cpelley · 2016-12-08T09:17:10Z

Thanks @marqh, this will make a significant difference to us. Appreciated.

cpelley changed the title ~~ENH: Lazy cube data transpose~~ ENH: Cube in-place transform and dimension inversion Apr 25, 2016

cpelley changed the title ~~ENH: Cube in-place transform and dimension inversion~~ ENH: Cube in-place transpose and dimension inversion Apr 25, 2016

cpelley mentioned this pull request Apr 26, 2016

ENH: Added coordinate inversion functionality #1731

Closed

ajdawson reviewed Apr 26, 2016
View reviewed changes

rhattersley reviewed Apr 27, 2016
View reviewed changes

ENH: Lazy cube data transpose

fd04aa8

cpelley force-pushed the LAZY_CUBE_TRANSPOSE branch from 08df29a to fd04aa8 Compare May 9, 2016 14:29

DOC: Whatsnew

08de76f

rhattersley reviewed May 20, 2016
View reviewed changes

DOC: Updated whatsnew wording

c1ca711

MAINT: Changes based on review

5eef732

marqh merged commit 0f432c8 into SciTools:master Dec 7, 2016

QuLogic added this to the v1.12.x milestone Dec 8, 2016

cpelley deleted the LAZY_CUBE_TRANSPOSE branch December 8, 2016 09:17

		@@ -0,0 +1 @@
		* The transpose method of a Cube now results in a transposed view of the original data rather than a transposed copy.

ENH: Lazy in-place Cube transpose #1983

ENH: Lazy in-place Cube transpose #1983

Uh oh!

Conversation

cpelley commented Apr 25, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cpelley commented Apr 25, 2016

Uh oh!

ajdawson commented Apr 26, 2016

Uh oh!

ajdawson commented Apr 26, 2016

Uh oh!

rhattersley commented Apr 26, 2016

Uh oh!

cpelley commented Apr 26, 2016

Uh oh!

cpelley commented Apr 26, 2016

Uh oh!

cpelley commented Apr 26, 2016

Uh oh!

ajdawson commented Apr 26, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ajdawson Apr 26, 2016

Choose a reason for hiding this comment

Uh oh!

ajdawson Apr 26, 2016

Choose a reason for hiding this comment

Uh oh!

cpelley Apr 26, 2016

Choose a reason for hiding this comment

Uh oh!

cpelley commented Apr 26, 2016

Uh oh!

ajdawson commented Apr 26, 2016

Uh oh!

cpelley commented Apr 26, 2016

Uh oh!

ajdawson commented Apr 26, 2016

Uh oh!

ajdawson Apr 26, 2016

Choose a reason for hiding this comment

Uh oh!

cpelley commented Apr 26, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cpelley commented Apr 26, 2016

Uh oh!

rhattersley commented Apr 26, 2016

Uh oh!

cpelley commented Apr 26, 2016

Uh oh!

cpelley commented Apr 26, 2016

Uh oh!

ajdawson commented Apr 26, 2016

Uh oh!

cpelley commented Apr 27, 2016

Uh oh!

ajdawson commented Apr 27, 2016

Uh oh!

cpelley commented Apr 27, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ajdawson commented Apr 27, 2016

Uh oh!

cpelley commented Apr 27, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rhattersley Apr 27, 2016

Choose a reason for hiding this comment

Uh oh!

cpelley May 3, 2016

Choose a reason for hiding this comment

Uh oh!

rhattersley commented May 9, 2016

Uh oh!

cpelley commented May 9, 2016

Uh oh!

rhattersley commented May 9, 2016

cpelley commented Apr 25, 2016 •

edited

Loading

ajdawson commented Apr 26, 2016 •

edited

Loading

cpelley commented Apr 26, 2016 •

edited

Loading

cpelley commented Apr 27, 2016 •

edited

Loading

cpelley commented Apr 27, 2016 •

edited

Loading

cpelley commented Aug 9, 2016 •

edited

Loading

cpelley commented Nov 29, 2016 •

edited

Loading