
Conversation

pp-mo (Member) commented Nov 30, 2016

Provides a huge speedup to trajectory interpolation.

Sorry it's so complicated: the changes are staged in individual commits, which may help review.

Stop Press: now rebased, good to go
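For context, the core of such a vectorisation is replacing a per-point Python loop with a single bulk nearest-neighbour query. The following is a minimal illustrative sketch of that general idea using scipy.spatial.cKDTree, not the PR's actual code; the function names and the lat/lon-to-Cartesian conversion are my own:

import numpy as np
from scipy.spatial import cKDTree

def latlon_to_xyz(lats_deg, lons_deg):
    # Convert lat/lon (degrees) to points on the unit sphere; Euclidean
    # (chord) distance is monotonic in great-circle distance, so a
    # Euclidean nearest neighbour is also the spherical nearest point.
    lats, lons = np.deg2rad(lats_deg), np.deg2rad(lons_deg)
    return np.column_stack([np.cos(lats) * np.cos(lons),
                            np.cos(lats) * np.sin(lons),
                            np.sin(lats)])

def nearest_source_indices(src_lats, src_lons, tgt_lats, tgt_lons):
    # Build the search structure once over the source points ...
    tree = cKDTree(latlon_to_xyz(src_lats, src_lons))
    # ... then resolve *all* target points in one vectorised query,
    # instead of one Python-level iteration per sample point.
    _, indices = tree.query(latlon_to_xyz(tgt_lats, tgt_lons))
    return indices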

pp-mo (Member Author) commented Nov 30, 2016

Some performance testing results, using this code:

#
# Exercise the "UnstructuredNearest" regridding scheme.
# Regridding test gungho data to simple 'normal' grids.
#

from __future__ import print_function
import datetime

import iris
from iris.analysis import UnstructuredNearest
from iris.analysis.trajectory import interpolate as traj_interp
from iris.analysis.cartography import wrap_lons
import numpy as np
import iris.quickplot as qplt
import matplotlib.pyplot as plt

basic_grid_cube = iris.load_cube(iris.sample_data_path('E1.2098.pp'))
filepath = ('/net/home/h05/itpp/Iris/sprints/sprint_20161121_gungho#1/odds/'
            'theta_nodal_xios.nc')
#gungho_cube = iris.load('sample_data/12x12_mesh_5_layers/theta_nodal_xios.nc')[1]
gungho_cube = iris.load(filepath)[0][0,0]

# Fix broken time coord units.
for coord in gungho_cube.coords():
    if 'time' in coord.name():
        coord.units = 'days since 1970-01-01'
        coord.points = [0.0]
        coord.bounds = [[0.0, 1.0]]  # shape (1, 2): one point, two bounds

# Hack the 'broken' XY coord units (FOR NOW).
assert gungho_cube.coord(axis='x').name() == 'longitude'
assert gungho_cube.coord(axis='y').name() == 'latitude'
gungho_cube.coord(axis='x').units = 'radians'
gungho_cube.coord(axis='y').units = 'radians'
gungho_cube.coord(axis='x').convert_units('degrees')
gungho_cube.coord(axis='y').convert_units('degrees')
# Also fix these to be 'ordinary' degrees-based ones (just in case).


def test(sub_sample=1):
    grid_cube = basic_grid_cube[::sub_sample, ::sub_sample]


    # Get a scheme object (no args yet).
    unstruct_nn_scheme = UnstructuredNearest()

    # Perform the regrid + time it.
    start_datetime = datetime.datetime.now()
    result_cube = gungho_cube.regrid(grid_cube, unstruct_nn_scheme)

    # Report how long it took.
    end_datetime = datetime.datetime.now()
    elapsed_deltatime = end_datetime - start_datetime
    seconds = elapsed_deltatime.total_seconds()
    msg = 'Time taken : {:12.6f} seconds'
    print(msg.format(seconds))

    return result_cube


for ss in (32, 16, 8, 4, 2, 1):
    print()
    print('Subsample:', ss)
    result_cube = test(ss)
    print(result_cube)

    test_cube = result_cube
    vmin = np.min(test_cube.data)
    vmax = np.max(test_cube.data)
    cmap = plt.get_cmap('magma')
    qplt.pcolormesh(test_cube, vmin=vmin, vmax=vmax, cmap=cmap)
    plt.gca().gridlines(draw_labels=True)
    source_x_co = gungho_cube.coord(axis='x')
    target_x_co = basic_grid_cube.coord(axis='x')
    xx = source_x_co.points[:]
    # Wrap source longitudes into the range of the output plot.
    x_min, x_max = plt.gca().get_xlim()
    xx = wrap_lons(xx, x_min, target_x_co.units.modulus)
    yy = gungho_cube.coord(axis='y').points[:]
    # Plot ON TOP the original points as larger filled markers, with their
    # assigned colours (so appear in centre of each region).
    for i, (x, y) in enumerate(zip(xx, yy)):
        data_value = gungho_cube.data[i]
        color_fraction = (data_value - vmin) / (vmax - vmin)
        color_value = cmap(color_fraction)
        plt.plot(x, y, 'o', color=color_value)
        # plt.text(x, y, str(i), color='black', transform=crs_pc)


    plt.show()

Some timing results...

Branch "master"
16 :    0.809682, 0.802585
8  :    2.992907, 3.004774
4  :   11.647787, 11.414579
2  :   45.513825, 44.803909
1  :  178.349221


Branch "trajectory_vectorise"

16 :    0.037588, 0.037569, 0.036902
8 :     0.040384, 0.040101, 0.039650
4 :     0.050284, 0.050517, 0.051032
2 :     0.097882, 0.093747, 0.097210
1 :     0.381357, 0.283466, 0.391238, 0.273740, 0.273446


sub    shape        n-points    master [s]    vector [s]    master [µs/pt]    vector [µs/pt]
---    -----        --------    ----------    ----------    --------------    --------------
16     10 * 12        120        0.8061335     0.0373530      6717.779167       311.275
8      19 * 24        456        2.9988405     0.0400450      6576.404605        87.81798246
4      37 * 48       1776       11.531183      0.0506110      6492.783221        28.49718468
2      73 * 96       7008       45.158867      0.0962797      6443.902255        13.73854167
1     145 * 192     27840      178.349221      0.3206494      6406.222019        11.51757902
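For reference, the per-point columns are just elapsed seconds divided by the number of target points, converted to microseconds; a quick check against the first and last rows:

# Per-point cost = elapsed seconds / n-points, in microseconds.
for n_points, master_s, vector_s in [(120, 0.8061335, 0.0373530),
                                     (27840, 178.349221, 0.3206494)]:
    print(1e6 * master_s / n_points,   # -> 6717.78 and 6406.22 us/pt
          1e6 * vector_s / n_points)   # ->  311.28 and   11.52 us/pt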


Headline improvement factor:
the per-point cost drops from roughly 6400 µs to about 12 µs, i.e. 6400 / 12 ≈ 533.
More than 500 times speedup !!



Review comment on _nearest_neighbour_indices_ndcoords (Member):

-def _nearest_neighbour_indices_ndcoords(cube, sample_point, cache=None):
+def _nearest_neighbour_indices_ndcoords(cube, sample_points, cache=None):

This function is now quite clever. Would it be plausible to have a unit test which threw some known and soluble problems at this function and asserted that the expected answers were produced?
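A minimal sketch of such a test, not taken from this PR: it drives the behaviour through the public iris.analysis.trajectory.interpolate entry point rather than the private helper, on a tiny grid where the nearest-neighbour answers are known by construction:

import numpy as np
from iris.cube import Cube
from iris.coords import DimCoord
from iris.analysis.trajectory import interpolate

# A 3 x 4 cube on a simple lat/lon grid; data values equal their flat
# index, so each grid cell is uniquely identifiable.
cube = Cube(np.arange(12, dtype=np.float32).reshape(3, 4),
            long_name='test_data')
cube.add_dim_coord(DimCoord([0., 10., 20.], standard_name='latitude',
                            units='degrees'), 0)
cube.add_dim_coord(DimCoord([0., 10., 20., 30.], standard_name='longitude',
                            units='degrees'), 1)

# Sample slightly off-grid; 'nearest' should snap to the closest cells:
# (1, 9) -> cell (lat=0, lon=10) = 1.0, (19, 21) -> (lat=20, lon=20) = 10.0.
sample_points = [('latitude', [1.0, 19.0]), ('longitude', [9.0, 21.0])]
result = interpolate(cube, sample_points, method='nearest')
np.testing.assert_allclose(result.data, [1.0, 10.0])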

marqh (Member) commented Dec 1, 2016

This is all looking good to me. I just want to establish that we have sufficient pragmatic test coverage; then I'm happy to merge.

Please could we consider adding a bit more integration test coverage to
lib/iris/tests/integration/test_regridding.py:
https://github.com/corinnebosley/iris/blob/8a2f2d5da68639d630e5dcb46a1dd6b7f8017aeb/lib/iris/tests/integration/test_regridding.py#L100

At present only nearest-neighbour is tested. Also, I have seen an issue with accidental rotation; perhaps testing a few explicit values within the result array as well would help protect against this being missed, since the statistics could be very close in this case. See the sketch after the next paragraph.

I wonder whether a neat way to deliver this would be to prepare a separate pull request with a couple of extra integration tests and get that merged; then this PR can rebase against that, building our confidence that these optimisations are not changing behaviour.
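A sketch of the kind of explicit-value check meant above, with hypothetical expected numbers (real ones would be captured once from a trusted run): a rotated-but-otherwise-plausible result can leave global statistics almost unchanged, while spot values at asymmetric locations will move:

import numpy as np

# Hypothetical reference values, captured once from a trusted run.
SPOT_INDICES = ((3, 10, 40), (7, 85, 170))               # asymmetric (y, x) picks
EXPECTED_SPOT_VALUES = np.array([280.4, 291.7, 268.2])   # hypothetical

def check_not_rotated(result_cube):
    # Fancy-index three individual cells; an accidental rotation shifts
    # these even when mean/std of the whole array are nearly identical.
    actual = result_cube.data[SPOT_INDICES]
    np.testing.assert_allclose(actual, EXPECTED_SPOT_VALUES, rtol=1e-5)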

pp-mo mentioned this pull request Dec 5, 2016
pp-mo (Member Author) commented Dec 5, 2016

Stop press: DON'T MERGE YET.
First I'm going to submit another PR adding some more comprehensive testcases, as @marqh suggests,
so we can have better confidence that this refactor actually doesn't break anything.

Produced that enhanced testing framework now: #2258

A temporary additional PR on my repo, here, shows that the code here functions okay when rebased onto #2258.
We'll bring that work back here when #2258 is finished and accepted.

marqh (Member) commented Dec 6, 2016

@pp-mo

#2258 is merged. Please may you rebase, so that the tests are rerun, including the new ones?

Thank you

pp-mo force-pushed the trajectory_vectorise branch from 1a74df2 to cff042f, December 6, 2016 12:07
pp-mo (Member Author) commented Dec 6, 2016

> #2258 is merged. please may you rebase

Rebased, please re-review @marqh

marqh (Member) commented Dec 6, 2016

@pp-mo

> More than 500 times speedup !!

nice work :)

marqh merged commit 7f6ae9c into SciTools:master Dec 6, 2016
QuLogic added this to the v1.12.x milestone Dec 6, 2016
pp-mo deleted the trajectory_vectorise branch October 8, 2020 09:45