discuss these cell logic functions #1

mdsumner · 2022-08-31T13:05:14Z

I have R versions of the cell logic from raster/terra in vaster, and similar logic for tiling in grout, both unpolished:

https://github.com/hypertidy/vaster

https://github.com/hypertidy/grout

I think these make sense on their own, as grid logic independent of any file or data handling, and I'd like to see (or build myself) Python and other language versions. I'm also interested in contributing some of this to GDAL itself; there are at least a few cases where I would use it for features in the lib-apps I want, but that needs a broader review atm.

- what is the minimal or sensible set of functions?
- funs are of dim, extent, or a combination; should we also have objects that provide methods (or closures that record a dim+extent)?
- should funs be vectorized, i.e. accept multiple sets of dim+extent?
- what index conversions are needed for tiles-as-children rasters, netcdf vs gdal indexes, etc.?
- grid alignment: compare gdal projwin to raster snap in, out, near, and the ability for gdalwarp to act as RasterIO with a snap option
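
As a starting point for the "minimal set" question, here is a rough Python sketch of functions that depend on dimension only and on dimension plus extent. The function names, the 0-based row-major cell numbering from the top-left corner, and the `(xmin, xmax, ymin, ymax)` extent order are all assumptions made for illustration; they are not the vaster API.

```python
def row_col_from_cell(dim, cell):
    """Dimension only: 0-based cell index -> (row, col), row-major from top-left."""
    nrow, ncol = dim
    if not 0 <= cell < nrow * ncol:
        raise IndexError("cell outside grid")
    return divmod(cell, ncol)


def cell_from_row_col(dim, row, col):
    """Dimension only: (row, col) -> 0-based cell index."""
    nrow, ncol = dim
    if not (0 <= row < nrow and 0 <= col < ncol):
        raise IndexError("row/col outside grid")
    return row * ncol + col


def resolution(dim, extent):
    """Dimension + extent: cell size as (xres, yres)."""
    nrow, ncol = dim
    xmin, xmax, ymin, ymax = extent
    return (xmax - xmin) / ncol, (ymax - ymin) / nrow


def xy_from_cell(dim, extent, cell):
    """Dimension + extent: cell index -> cell-centre coordinate."""
    row, col = row_col_from_cell(dim, cell)
    xres, yres = resolution(dim, extent)
    xmin, _, _, ymax = extent
    return xmin + (col + 0.5) * xres, ymax - (row + 0.5) * yres


def cell_from_xy(dim, extent, x, y):
    """Dimension + extent: coordinate -> cell index, or None if outside the extent."""
    nrow, ncol = dim
    xmin, xmax, ymin, ymax = extent
    if not (xmin <= x <= xmax and ymin <= y <= ymax):
        return None
    xres, yres = resolution(dim, extent)
    # Points exactly on the right/bottom edge are assigned to the last column/row.
    col = min(int((x - xmin) / xres), ncol - 1)
    row = min(int((ymax - y) / yres), nrow - 1)
    return row * ncol + col


dim, extent = (2, 3), (0.0, 3.0, 0.0, 2.0)
print(xy_from_cell(dim, extent, 4))         # (1.5, 0.5)
print(cell_from_xy(dim, extent, 1.5, 0.5))  # 4
```

None of these touch a file or a data array, which is the point: a dim and an extent are enough.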

These are just brain dump ideas atm for things I've been doing in R and want more broadly in gdal and elsewhere 🙏

geoarrow/geoarrow#24 (comment)

paleolimbot · 2022-08-31T14:19:54Z

Keep dumping here! I had a bunch scraped out here, too: https://github.com/paleolimbot/grd/blob/master/R/cell.R

mdsumner · 2022-09-01T00:42:25Z

I didn't forget about those ... not entirely anyway! But, it's been very illuminating to strip down to just the bare essentials, and then see that some functions are only a function of dimension, some are of extent and dimension, and some compare extents (basically the snap stuff).
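
To make the "compare extents" category concrete, here is a minimal Python sketch of snapping an arbitrary extent to a grid alignment (origin + resolution) with `in`, `out`, and `near` modes. The function name, argument order, and the exact rounding rules are assumptions for illustration; this is not claimed to reproduce what raster or grd do at the edges.

```python
import math


def snap_extent(extent, origin, res, snap="out"):
    """Snap (xmin, xmax, ymin, ymax) to the grid lines defined by origin and res.

    snap="out"  grows the extent to the enclosing grid lines,
    snap="in"   shrinks it to the enclosed grid lines,
    snap="near" moves each edge to the nearest grid line.
    """
    xmin, xmax, ymin, ymax = extent
    x0, y0 = origin
    xres, yres = res

    def nearest(v):
        # Round half up explicitly; Python's round() rounds half to even.
        return math.floor(v + 0.5)

    if snap == "out":
        lower, upper = math.floor, math.ceil
    elif snap == "in":
        lower, upper = math.ceil, math.floor
    elif snap == "near":
        lower = upper = nearest
    else:
        raise ValueError("snap must be 'in', 'out' or 'near'")

    def to_grid(v, v0, r, fn):
        return v0 + fn((v - v0) / r) * r

    return (
        to_grid(xmin, x0, xres, lower),
        to_grid(xmax, x0, xres, upper),
        to_grid(ymin, y0, yres, lower),
        to_grid(ymax, y0, yres, upper),
    )


ext = (0.3, 3.6, 0.2, 2.9)
print(snap_extent(ext, (0.0, 0.0), (1.0, 1.0), "out"))  # (0.0, 4.0, 0.0, 3.0)
print(snap_extent(ext, (0.0, 0.0), (1.0, 1.0), "in"))   # (1.0, 3.0, 1.0, 2.0)
```

Note that this needs only an alignment, not a dimension, which matches the idea that snapping is a comparison between extents rather than a property of one raster.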

I see you have pretty serious snap options in grd ... I'm resistant to having an object that is also for data and vis for this functionality (a 0-dim array is a nice trick but makes me uncomfortable), which is why I didn't just run with grd ... but I'm also drawn to having an OOP solution. I guess at root there are functions for grid logic, and then there's a hierarchy of tools/objects, each of which variously:

- knows its raster-ness (extent + dimension)
- knows only its alignment (the origin + resolution) - on reflection I guess that's what a (shear == 0) geotransform is ... hmm
- knows only its dimension (a bare image, with a default extent - variously [0,1] or [0,dim] depending on context)
- knows the above stuff and is ready to bare-metal read/vis/stream from sources that have these properties
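
The relationship between these levels can be sketched in Python as conversions between extent + dimension and a GDAL-style six-element geotransform with zero shear terms. The function names and the `(xmin, xmax, ymin, ymax)` extent order are assumptions for illustration, and the reverse conversion assumes a north-up grid (negative y pixel size).

```python
def geotransform_from(dim, extent):
    """Extent + dimension -> GDAL-style geotransform (north-up, no shear)."""
    nrow, ncol = dim
    xmin, xmax, ymin, ymax = extent
    xres = (xmax - xmin) / ncol
    yres = (ymax - ymin) / nrow
    # (x of top-left corner, pixel width, shear, y of top-left corner, shear, -pixel height)
    return (xmin, xres, 0.0, ymax, 0.0, -yres)


def extent_from(dim, gt):
    """Dimension + geotransform -> extent; only valid when both shear terms are 0."""
    if gt[2] != 0 or gt[4] != 0:
        raise ValueError("sheared/rotated geotransform has no simple extent")
    nrow, ncol = dim
    xmin, ymax = gt[0], gt[3]
    # gt[5] is assumed negative (north-up), so this moves down from ymax.
    return (xmin, xmin + ncol * gt[1], ymax + nrow * gt[5], ymax)


def default_extent(dim, unit=False):
    """Dimension only: a bare image with a default extent, [0,1] or [0,dim]."""
    nrow, ncol = dim
    if unit:
        return (0.0, 1.0, 0.0, 1.0)
    return (0.0, float(ncol), 0.0, float(nrow))


dim = (180, 360)
gt = geotransform_from(dim, (-180.0, 180.0, -90.0, 90.0))
print(gt)                    # (-180.0, 1.0, 0.0, 90.0, 0.0, -1.0)
print(extent_from(dim, gt))  # (-180.0, 180.0, -90.0, 90.0)
```

The geotransform alone carries the alignment; it only becomes an extent once a dimension is attached.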

I'm fleshing out my interactions with this logic as I slowly become independent from raster. Recently I wrote raster::trim() from scratch, just to see what the logic is like, and as with many vis, extraction, and reprojection tasks for a given map, there's very often a back-and-forth: get enough data to find the "nearblack" margin, then apply that to a warp-streamed subset read. (That's a data-dependent task though, and perhaps better done by gdal with nearblack anyway. Some of these things I've been thinking of as GDAL API hooks that don't exist yet and that I could write.)

I'm very interested in getting this family of grid logic entirely independent of data. For things like polygon extractions from netcdf time series, what you really want is the 2D cell index of those polygons, which you then batch into netcdf chunks. The key idea here is that the indexing logic and query plan are entirely independent of the actual data source. I'm fleshing this out at a low level with a colleague in the climate model space, and he has very large workflows of interest, it's not just me and my tools ;)

mdsumner · 2022-09-01T00:44:58Z

and like, GDAL is crazy fast to rasterize polygons, as is {fasterize} - but I don't want a polygon-value burned tif as output, I want a table of cell index and polygon ID that I use for this plan-query batching - and for that I need index-converters from global cell (extent+dimension) to chunk cell (tiled arithmetic converts a global cell to a chunk-in-memory index).
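
A rough Python sketch of that global-cell to chunk-cell conversion, and of batching a (cell, polygon ID) table by chunk, might look like the following. The function names, the 0-based row-major numbering for both cells and chunks, and the handling of narrower edge chunks are assumptions for illustration; real netcdf or GDAL block layouts may index differently.

```python
def chunk_index_from_cell(dim, chunk_dim, cell):
    """Global 0-based cell -> (chunk id, 0-based cell index within that chunk)."""
    nrow, ncol = dim
    chunk_nrow, chunk_ncol = chunk_dim
    row, col = divmod(cell, ncol)
    chunk_row, row_in_chunk = divmod(row, chunk_nrow)
    chunk_col, col_in_chunk = divmod(col, chunk_ncol)
    chunks_across = -(-ncol // chunk_ncol)  # ceiling division
    chunk_id = chunk_row * chunks_across + chunk_col
    # Chunks on the right edge can be narrower than chunk_ncol.
    width = min(chunk_ncol, ncol - chunk_col * chunk_ncol)
    return chunk_id, row_in_chunk * width + col_in_chunk


def plan_by_chunk(dim, chunk_dim, cell_table):
    """Group (global cell, polygon id) pairs by chunk, without touching any data."""
    plan = {}
    for cell, polygon_id in cell_table:
        chunk_id, local = chunk_index_from_cell(dim, chunk_dim, cell)
        plan.setdefault(chunk_id, []).append((local, polygon_id))
    return plan


# A 4x6 grid stored in 2x4 chunks; the table pairs global cells with polygon ids.
table = [(0, "a"), (5, "a"), (22, "b"), (1, "b")]
print(plan_by_chunk((4, 6), (2, 4), table))
# {0: [(0, 'a'), (1, 'b')], 1: [(1, 'a')], 3: [(2, 'b')]}
```

The resulting plan says which chunks need to be read and which in-memory positions to pull from each, so every chunk is visited once regardless of how many polygons touch it.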

more thoughts than code atm, but I have a lot of these pieces around :)

mdsumner · 2022-09-04T07:17:39Z

at some point I'll fold in the logic for netcdf from tidync, and flesh out the translators I've been talking about, and then explore what's needed for a proper api vs just R funs

paleolimbot · 2022-09-05T02:37:18Z

I made a place for "cell logic" for you to get started! PR into https://github.com/paleolimbot/geoarrow-cpp/blob/main/src/geoarrow/index_math.hpp (and make sure to add tests into https://github.com/paleolimbot/geoarrow-cpp/blob/main/src/geoarrow/index_math_test.cc !). If you're interested, I'm happy to set up a meeting to set up your VSCode to get started 😄

mdsumner · 2022-09-05T08:22:20Z

👌

mdsumner · 2022-09-07T01:12:53Z

I definitely need the hand-holding! I think it would be valuable :)

paleolimbot · 2022-09-07T12:38:02Z

Let's do it! It's tough for me to meet outside 8am - 4pm America/Halifax because of the kids, or we can work through it via Twitter message. The gist of it is: open up geoarrow-cpp in VSCode, install the CMake extension, then open the "command palette" (Control-Shift-P) and choose CMake: Configure, then CMake: Build, then CMake: Run Tests.

mdsumner · 2022-09-14T03:31:56Z

related pydata/xarray#5081

mdsumner · 2022-09-18T05:13:43Z

just reading Danielle's blog with a couple of rasterization steps, we could use a sparse cell approach - not profound or anything but a clear example for some crossover discussion: https://blog.djnavarro.net/posts/2022-08-23_visualising-a-billion-rows/
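
As a small illustration of what a sparse cell approach could mean here (a sketch only, not what the blog post does), points can be binned to cell indexes and only the non-empty cells kept. The function name, the 0-based row-major cell numbering, and the `(xmin, xmax, ymin, ymax)` extent order are assumptions.

```python
from collections import Counter


def sparse_cell_counts(dim, extent, points):
    """Count points per cell, storing only cells that received at least one point."""
    nrow, ncol = dim
    xmin, xmax, ymin, ymax = extent
    xres = (xmax - xmin) / ncol
    yres = (ymax - ymin) / nrow
    counts = Counter()
    for x, y in points:
        if not (xmin <= x <= xmax and ymin <= y <= ymax):
            continue  # skip points outside the grid
        col = min(int((x - xmin) / xres), ncol - 1)
        row = min(int((ymax - y) / yres), nrow - 1)
        counts[row * ncol + col] += 1
    return counts


points = [(0.5, 999.5), (0.7, 999.2), (500.5, 499.5), (2000.0, 0.0)]
counts = sparse_cell_counts((1000, 1000), (0.0, 1000.0, 0.0, 1000.0), points)
print(dict(counts))  # {0: 2, 500500: 1}
```

A million-cell grid is described here by two entries; a dense array is only materialised if and when something needs to draw it.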

mdsumner · 2022-09-18T12:56:37Z

all the more reason for me to get these funs in here, I keep realising implications, and variations on the index conversions 🙏
