
Conversation

pp-mo (Member) commented Nov 30, 2016

Provides a huge speedup to trajectory interpolation.

Sorry it's so complicated: the changes are staged in individual commits, which may help review.

Stop Press: now rebased, good to go
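For context, the core of such a vectorisation is replacing a per-point Python loop with a single bulk nearest-neighbour query. The following is a minimal illustrative sketch of that general idea using scipy.spatial.cKDTree, not the PR's actual code; the function names and the lat/lon-to-Cartesian conversion are my own:

import numpy as np
from scipy.spatial import cKDTree

def latlon_to_xyz(lats_deg, lons_deg):
    # Convert lat/lon (degrees) to points on the unit sphere; Euclidean
    # (chord) distance is monotonic in great-circle distance, so a
    # Euclidean nearest neighbour is also the spherical nearest point.
    lats, lons = np.deg2rad(lats_deg), np.deg2rad(lons_deg)
    return np.column_stack([np.cos(lats) * np.cos(lons),
                            np.cos(lats) * np.sin(lons),
                            np.sin(lats)])

def nearest_source_indices(src_lats, src_lons, tgt_lats, tgt_lons):
    # Build the search structure once over the source points ...
    tree = cKDTree(latlon_to_xyz(src_lats, src_lons))
    # ... then resolve *all* target points in one vectorised query,
    # instead of one Python-level iteration per sample point.
    _, indices = tree.query(latlon_to_xyz(tgt_lats, tgt_lons))
    return indices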

pp-mo (Member Author) commented Nov 30, 2016

Some performance testing results, using this code:

#
# Exercise the "UnstructuredNearest" regridding scheme.
# Regridding test gungho data to simple 'normal' grids.
#

from __future__ import print_function
import datetime

import iris
from iris.analysis import UnstructuredNearest
from iris.analysis.trajectory import interpolate as traj_interp
from iris.analysis.cartography import wrap_lons
import numpy as np
import iris.quickplot as qplt
import matplotlib.pyplot as plt

basic_grid_cube = iris.load_cube(iris.sample_data_path('E1.2098.pp'))
filepath = ('/net/home/h05/itpp/Iris/sprints/sprint_20161121_gungho#1/odds/'
            'theta_nodal_xios.nc')
#gungho_cube = iris.load('sample_data/12x12_mesh_5_layers/theta_nodal_xios.nc')[1]
gungho_cube = iris.load(filepath)[0][0,0]

# Fix broken time coord units.
for coord in gungho_cube.coords():
    if 'time' in coord.name():
        coord.units = 'days since 1970-01-01'
        coord.points = [0.0]
        coord.bounds = [[0.0, 1.0]]  # shape (1, 2): one point, two bounds

# Hack the 'broken' XY coord units (FOR NOW).
assert gungho_cube.coord(axis='x').name() == 'longitude'
assert gungho_cube.coord(axis='y').name() == 'latitude'
gungho_cube.coord(axis='x').units = 'radians'
gungho_cube.coord(axis='y').units = 'radians'
gungho_cube.coord(axis='x').convert_units('degrees')
gungho_cube.coord(axis='y').convert_units('degrees')
# Also fix these to be 'ordinary' degrees-based ones (just in case).


def test(sub_sample=1):
    grid_cube = basic_grid_cube[::sub_sample, ::sub_sample]


    # Get a scheme object (no args yet).
    unstruct_nn_scheme = UnstructuredNearest()

    # Perform the regrid + time it.
    start_datetime = datetime.datetime.now()
    result_cube = gungho_cube.regrid(grid_cube, unstruct_nn_scheme)

    # Report how long it took.
    end_datetime = datetime.datetime.now()
    elapsed_deltatime = end_datetime - start_datetime
    seconds = elapsed_deltatime.total_seconds()
    msg = 'Time taken : {:12.6f} seconds'
    print(msg.format(seconds))

    return result_cube


for ss in (32, 16, 8, 4, 2, 1):
    print()
    print('Subsample:', ss)
    result_cube = test(ss)
    print(result_cube)

    test_cube = result_cube
    vmin = np.min(test_cube.data)
    vmax = np.max(test_cube.data)
    cmap = plt.get_cmap('magma')
    qplt.pcolormesh(test_cube, vmin=vmin, vmax=vmax, cmap=cmap)
    plt.gca().gridlines(draw_labels=True)
    source_x_co = gungho_cube.coord(axis='x')
    target_x_co = basic_grid_cube.coord(axis='x')
    xx = source_x_co.points[:]
    # Wrap source longitudes into the range of the output plot.
    x_min, x_max = plt.gca().get_xlim()
    xx = wrap_lons(xx, x_min, target_x_co.units.modulus)
    yy = gungho_cube.coord(axis='y').points[:]
    # Plot ON TOP the original points as larger filled markers, with their
    # assigned colours (so appear in centre of each region).
    for i, (x, y) in enumerate(zip(xx, yy)):
        data_value = gungho_cube.data[i]
        color_fraction = (data_value - vmin) / (vmax - vmin)
        color_value = cmap(color_fraction)
        plt.plot(x, y, 'o', color=color_value)
        # plt.text(x, y, str(i), color='black', transform=crs_pc)


    plt.show()

Some timing results...

Branch "master"
16 :    0.809682, 0.802585
8  :    2.992907, 3.004774
4  :   11.647787, 11.414579
2  :   45.513825, 44.803909
1  :  178.349221


Branch "trajectory_vectorise"

16 :    0.037588, 0.037569, 0.036902
8 :     0.040384, 0.040101, 0.039650
4 :     0.050284, 0.050517, 0.051032
2 :     0.097882, 0.093747, 0.097210
1 :     0.381357, 0.283466, 0.391238, 0.273740, 0.273446


sub    shape        n-points    master [s]    vector [s]    master [µs/pt]    vector [µs/pt]
---    -----        --------    ----------    ----------    --------------    --------------
16     10 * 12        120        0.8061335     0.0373530      6717.779167       311.275
8      19 * 24        456        2.9988405     0.0400450      6576.404605        87.81798246
4      37 * 48       1776       11.531183      0.0506110      6492.783221        28.49718468
2      73 * 96       7008       45.158867      0.0962797      6443.902255        13.73854167
1     145 * 192     27840      178.349221      0.3206494      6406.222019        11.51757902
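For reference, the per-point columns are just elapsed seconds divided by the number of target points, converted to microseconds; a quick check against the first and last rows:

# Per-point cost = elapsed seconds / n-points, in microseconds.
for n_points, master_s, vector_s in [(120, 0.8061335, 0.0373530),
                                     (27840, 178.349221, 0.3206494)]:
    print(1e6 * master_s / n_points,   # -> 6717.78 and 6406.22 us/pt
          1e6 * vector_s / n_points)   # ->  311.28 and   11.52 us/pt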


Headline improvement factor:
the per-point cost drops from roughly 6400 µs to about 12 µs, i.e. 6400 / 12 ≈ 533.
More than 500 times speedup !!



Review comment on _nearest_neighbour_indices_ndcoords (Member):

-def _nearest_neighbour_indices_ndcoords(cube, sample_point, cache=None):
+def _nearest_neighbour_indices_ndcoords(cube, sample_points, cache=None):

This function is now quite clever. Would it be plausible to have a unit test which threw some known and soluble problems at this function and asserted that the expected answers were produced?
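A minimal sketch of such a test, not taken from this PR: it drives the behaviour through the public iris.analysis.trajectory.interpolate entry point rather than the private helper, on a tiny grid where the nearest-neighbour answers are known by construction:

import numpy as np
from iris.cube import Cube
from iris.coords import DimCoord
from iris.analysis.trajectory import interpolate

# A 3 x 4 cube on a simple lat/lon grid; data values equal their flat
# index, so each grid cell is uniquely identifiable.
cube = Cube(np.arange(12, dtype=np.float32).reshape(3, 4),
            long_name='test_data')
cube.add_dim_coord(DimCoord([0., 10., 20.], standard_name='latitude',
                            units='degrees'), 0)
cube.add_dim_coord(DimCoord([0., 10., 20., 30.], standard_name='longitude',
                            units='degrees'), 1)

# Sample slightly off-grid; 'nearest' should snap to the closest cells:
# (1, 9) -> cell (lat=0, lon=10) = 1.0, (19, 21) -> (lat=20, lon=20) = 10.0.
sample_points = [('latitude', [1.0, 19.0]), ('longitude', [9.0, 21.0])]
result = interpolate(cube, sample_points, method='nearest')
np.testing.assert_allclose(result.data, [1.0, 10.0])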

marqh (Member) commented Dec 1, 2016

This is all looking good to me. I just want to establish that we have sufficient pragmatic test coverage; then I'm happy to merge.

Please could we consider adding a bit more integration test coverage to
lib/iris/tests/integration/test_regridding.py:
https://github.com/corinnebosley/iris/blob/8a2f2d5da68639d630e5dcb46a1dd6b7f8017aeb/lib/iris/tests/integration/test_regridding.py#L100

At present only nearest-neighbour is tested. Also, I have seen an issue with accidental rotation; perhaps testing a few explicit values within the result array as well would help protect against this being missed, since the statistics could be very close in this case. See the sketch after the next paragraph.

I wonder whether a neat way to deliver this would be to prepare a separate pull request with a couple of extra integration tests and get that merged; then this PR can rebase against that, building our confidence that these optimisations are not changing behaviour.
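A sketch of the kind of explicit-value check meant above, with hypothetical expected numbers (real ones would be captured once from a trusted run): a rotated-but-otherwise-plausible result can leave global statistics almost unchanged, while spot values at asymmetric locations will move:

import numpy as np

# Hypothetical reference values, captured once from a trusted run.
SPOT_INDICES = ((3, 10, 40), (7, 85, 170))               # asymmetric (y, x) picks
EXPECTED_SPOT_VALUES = np.array([280.4, 291.7, 268.2])   # hypothetical

def check_not_rotated(result_cube):
    # Fancy-index three individual cells; an accidental rotation shifts
    # these even when mean/std of the whole array are nearly identical.
    actual = result_cube.data[SPOT_INDICES]
    np.testing.assert_allclose(actual, EXPECTED_SPOT_VALUES, rtol=1e-5)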

pp-mo mentioned this pull request Dec 5, 2016
pp-mo (Member Author) commented Dec 5, 2016

Stop press: DON'T MERGE YET.
First I'm going to submit another PR adding some more comprehensive testcases, as @marqh suggests,
so we can have better confidence that this refactor actually doesn't break anything.

Produced that enhanced testing framework now: #2258

A temporary additional PR on my repo, here, shows that the code here functions okay when rebased onto #2258.
We'll bring that work back here when #2258 is finished and accepted.

marqh (Member) commented Dec 6, 2016

@pp-mo

#2258 is merged. Please may you rebase, so that the tests are rerun, including the new ones?

Thank you

pp-mo force-pushed the trajectory_vectorise branch from 1a74df2 to cff042f, December 6, 2016 12:07
pp-mo (Member Author) commented Dec 6, 2016

> #2258 is merged. please may you rebase

Rebased, please re-review @marqh

marqh (Member) commented Dec 6, 2016

@pp-mo

> More than 500 times speedup !!

nice work :)

marqh merged commit 7f6ae9c into SciTools:master Dec 6, 2016
QuLogic added this to the v1.12.x milestone Dec 6, 2016
pp-mo deleted the trajectory_vectorise branch October 8, 2020 09:45