Skip to content

Conversation

@weiji14
Copy link
Owner

@weiji14 weiji14 commented Nov 14, 2018

Transition to a tool better suited for interpolating a grid from points 🚚. PDAL has served us well, with clear visibility in its JSON pipeline files that also offer a lot of flexibility. Moving to Generic Mapping Tools (GMT) might mean that we lose that clear visibility (no JSON pipelines 😿), but it will give us better control over the interpolation method that was limited in PDAL's case. Specifically, the key change is to move from an Inverse Distance Weighted interpolation in PDAL's gdal writer to the more powerful spline interpolation functions offered by GMT (e.g. GMT surface)

TODO:

  • See if updating pdal binaries to 1.8.0 with python-pdal 2.1.0 works Nope
  • Codify the original PDAL pipeline in our Behavioural Driven Development (BDD) test framework (see Setup Behaviour Driven Development #51) (19784f8)
  • Create new pipeline JSON parser similar to PDAL's (59ab910)
    • Added pyproj library for coordinate transformations (c8a6d4a)
  • Add more datasets that were missed out before:
    • Bingham2018PIG - 2007t1.txt and 2007tr.txt (59ab910)
    • Shi2010CRESIS - 2017_Antarctica_Basler.csv (74c4e69, 7a48d6a)
  • Refactor the interpolation feature from PDAL to GMT using BDD.
    • Get gmt.surface from gmt-python (a84dfd6)
    • Grid xyz data using adjustable tension continuous curvature splines (i.e. surface) (ddc1da3)
    • Convert the output netcdf .nc grid to a geotiff .tif NetCDF is much smaller, 16MB for 20xx_Antarctica_DC8.nc compared to 725MB for 20xx_Antarctica_DC8.tif
    • Encode EPSG:3031 projection metadata into netcdf4 file somehow (TODO next time)
    • Make pretty plots!

References:

  • Holt, J. W., Blankenship, D. D., Morse, D. L., Young, D. A., Peters, M. E., Kempf, S. D., … Corr, H. F. J. (2006). New boundary conditions for the West Antarctic Ice Sheet: Subglacial topography of the Thwaites and Smith glacier catchments. Geophysical Research Letters, 33(9). https://doi.org/10.1029/2005GL025561
  • Lythe, M. B., & Vaughan, D. G. (2001). BEDMAP: A new ice thickness and subglacial topographic model of Antarctica. Journal of Geophysical Research: Solid Earth, 106(B6), 11335–11351. https://doi.org/10.1029/2000JB900449
  • Smith, W. H. F., & Wessel, P. (1990). Gridding with continuous curvature splines in tension. GEOPHYSICS, 55(3), 293–305. https://doi.org/10.1190/1.1442837
  • Wessel, P., Smith, W. H. F., Scharroo, R., Luis, J., & Wobbe, F. (2013). Generic Mapping Tools: Improved Version Released. Eos, Transactions American Geophysical Union, 94(45), 409–410. https://doi.org/10.1002/2013EO450001

@weiji14 weiji14 added the enhancement ✨ New feature or request label Nov 14, 2018
@weiji14 weiji14 added this to the v0.4.0 milestone Nov 14, 2018
@weiji14 weiji14 self-assigned this Nov 14, 2018
Getting the new version of pdal. See https://github.com/PDAL/PDAL/releases/tag/1.8.0 for full release notes. May want to give pdal-python another chance later.
@weiji14 weiji14 changed the title WIP Change from PDAL's IDW interpolation to a GMT based spline interpolation WIP Change from IDW to spline interpolation Nov 14, 2018
Making the parser into a function so that it can be (re)used inside behave later on. Also turned the pandas table print code into a lambda function.
Retrying pdal-python installation. Adding gcc since a compiler is needed, and using personal fork of pdal-python 2.1.0 with the D_GLIBCXX_USE_CXX11_ABI=0 flag added to setup.py to fix an ImportError issue.
Revert to origin branch which resolves some "undefined symbol: _ZNK4pdal..." issues.
Did not realize that my local copy of bed_WGS84_grid.txt was corrupted, so changing the hash. Also remove extra whitespace in the data_list.yml file. Patches #48.
First step in breaking down the oneliner pdal bash script into more chunks. Main addition is a processing_pipeline function that uses pdal-python currently, but is in serious need of refactoring to GMT soon. Also added a summary flag to _integration_test_ipynb that is helpful for work in progress behavioural testing.
Temporarily adding a @Skip and @wip tag to the new feature, because there is still a problem with `import pdal` inside the docker image. ImportError: home/jovyan/.conda/envs/deepbedmap/lib/./libgdal.so.20: undefined symbol: _ZN11xercesc_3_211InputSource11setEncodingEPKt
Out of the frying pan and into the fire. Remove pdal dependency, and go for straight GMT with python-gmt installed from git repo! Compilation still requires gcc, which I've decided to take from the full bionic buildpack-deps image instead of installing from conda as in 2a87b7d.
Proj4 based conversion.
@weiji14 weiji14 force-pushed the pdal_to_gmt branch 2 times, most recently from 40a4036 to fb762ed Compare November 19, 2018 08:23
New CRESIS csv file was uploaded a few days ago, and the unit test caught it! Good ol' unintended consequences of unit testing :D Fix our unit test by choosing another blank file, and add this new dataset to our data_list for good measure. Relates to #21.
Repurposing the pdal pipeline file for reading our data. Using pandas and pyproj to do the reading and reprojection work. Much longer code than pdal equivalent but feels faster, and it returns a nicely formatted pandas.DataFrame with xyz columns.

Also added 2007tx.json for 2007t1.txt and 2007tr.txt that were originally left out in #7.
Allowing finer grained creation of interpolated terrain surfaces. Using own personal fork of gmt-python with working gmt.surface! Requires pinning netcdf4 down to 1.4.1 from 1.4.2, don't ask.

Well actually, the netcdf4 pinning may have something to do with how xarray handles netcdf4 files in the backend. Smaller netcdf files stored in format: 'classic' can be opened either way, but bigger netcdf files stored in format: 'netCDF-4' will throw an error with python netCDF4 1.4.2 somehow.
All the CRESIS outputs had same x and y columns... SImple syntax error fix. Good thing the doctest made it obvious, starting to really appreciate these unit tests for untraditional reasons. Patches 59ab910.
Better topographic raster by using Spline Interpolation with Tension! Conversion of point data to grid data done via GMT 6 library (in development), see https://doi.org/10.1002/2013EO450001 for GMT 5. Preprocessing of points using GMT `blockmedian` which is more suitable for topography (instead of GMT `blockmean`), as detailed in the original BEDMAP1 paper by Lythe et al. 2001 p. 11340 at https://doi.org/10.1029/2000JB900449. Final individual 250m high resolution grids created using GMT `surface`'s interpolation with a continuous curvature spline, using a tension factor of 0.35 as suggested by Holt et. al 2006 in https://doi.org/10.1029/2005GL025561.

Left TODO is inserting projection information (metadata) into the netcdf4 grid outputs so that the files can be distributed. Also need to recreate plots, and maybe find a faster way to run `surface` (currently takes ~30min each for the CRESIS datasets).
Checks that the full processing pipeline works. Covers processing from raw text file to clean xyz to interpolation to output file opening. Modifies 19784f8.
Slightly more coverage and regenerate grid plots! Explicitly setting the mask flag '-M' in GMT `surface` to 2c (see https://gmt.soest.hawaii.edu/doc/latest/surface.html#m) meaning a 5x5 cell neighbourhood mask. This has an effect similar to (but not exactly like) the window_size option in pdal we used before (see https://pdal.io/stages/writers.gdal.html#window-size). Plots re-rendered to show new 2007tx.nc plus updated areas for 201x_Antarctica_Basler.nc (now includes 2017_Antarctica_Basler.csv after 7a48d6a).
Final part of #52! Increasing number of training image tiles from 2111 to 2480! Couple of changes for that to work. The grid spacing -I flag in GMT now uses 250+e instead of 250 so that output cell size is precisely 250 metres. Mask_cell_radius was increased from 2 to 3 (i.e. 7x7 pixel neighbourhood mask) to get more tiles along the Siple Coast area.

Tiling functions refactored for speed, fixing rounding errors, and linted using black. Note that we are still using the default gridline registration https://gmt.soest.hawaii.edu/doc/latest/GMT_Docs.html#gridline-registration when doing interpolation. Unit tests added for get_window_bounds and selective_tile functions!!
@weiji14 weiji14 changed the title WIP Change from IDW to spline interpolation Change from IDW to spline interpolation Nov 24, 2018
@weiji14 weiji14 merged commit ab0295e into master Nov 24, 2018
@weiji14 weiji14 deleted the pdal_to_gmt branch November 24, 2018 08:02
@weiji14
Copy link
Owner Author

weiji14 commented Nov 24, 2018

🎉

weiji14 added a commit that referenced this pull request Nov 27, 2018
Super Resolution Generative Adversarial Network retrained following new grids created at ab0295e of #52 in v0.4.0. Number of training tiles have increased from 2111 to 2480, and the model is now trained using keras 'fit' instead of 'train_on_batch' to handle the larger volume of data (by training in small batches). JSON model architecture is revealed! But model weights withheld for now.

Visual inspection of results actually looks more reasonable, although the tiles themselves still appears grainy. Probably looks better in 3D ;) Need to find a better metric instead of using PSNR, and continue to improve both the quality of the data inputs and the model itself.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement ✨ New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant