Reimplement data workflow with subdivision/indexing/cache #347

Open
twelch opened this issue Sep 8, 2024 · 0 comments

twelch commented Sep 8, 2024

Need:

Previously there was a workflow that subdivided vector data using PostGIS subdivision, published the data with a custom index using bundle-features, and used VectorDataSource to quickly fetch vector data by bounding box via that index, optionally merging the subdivided polygons back into larger ones. Fetched results were cached client-side to avoid repeat fetches.

The previous solution:

  • predates Flatgeobuf and cloud-optimized GeoTIFF, which have a built-in index and are now stable.
  • only worked for polygon data, and not for all polygons (there were still some error cases).
  • had a lot of manual scripted steps to manage it.
  • has suffered from bit rot: the bundle-features code no longer works, particularly the Slonik Postgres interface.

The problem:

  • global datasets like political boundaries and shoreline contain polygons too large to be fetched efficiently for many countries.

Immediate vector workaround solution:

  • Subdivide the dataset manually using whatever makes sense. It doesn't have to be PostGIS; perhaps https://github.com/jvail/spl.js/
  • Publish the subdivided flatgeobuf as normal (import:data/publish:data) and use getFeatures to fetch.
  • Union the result using Turf (likely much slower than VectorDataSource, which can stitch subdivided polygons back together quickly). Can we repurpose the old union code?
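The union step above could be sketched roughly as follows. This is only an illustration under assumptions not stated in the issue: that each subdivided piece carries a property naming the feature it was cut from (the hypothetical `parentId` in the usage note), and that the union implementation (Turf, or the old union code if it can be reused) is passed in as a function rather than called directly. `PolygonPiece`, `UnionFn`, `groupByParent`, and `mergePieces` are made-up names.

```typescript
// Minimal GeoJSON-like types so the sketch stands alone (hypothetical names).
type Position = number[];

interface PolygonPiece {
  type: "Feature";
  properties: { [key: string]: unknown };
  geometry: { type: "Polygon"; coordinates: Position[][] };
}

// Placeholder for whichever union implementation is chosen.
// Assumption: it accepts all pieces of one feature and returns a single feature.
type UnionFn = (pieces: PolygonPiece[]) => PolygonPiece;

// Group subdivided pieces by the ID of the feature they were cut from.
function groupByParent(
  pieces: PolygonPiece[],
  idProperty: string
): Map<string, PolygonPiece[]> {
  const groups = new Map<string, PolygonPiece[]>();
  for (const piece of pieces) {
    const key = String(piece.properties[idProperty]);
    const group = groups.get(key);
    if (group) {
      group.push(piece);
    } else {
      groups.set(key, [piece]);
    }
  }
  return groups;
}

// Merge each group back into one feature; features that arrive as a
// single piece are passed through without calling the union function.
function mergePieces(
  pieces: PolygonPiece[],
  idProperty: string,
  union: UnionFn
): PolygonPiece[] {
  const merged: PolygonPiece[] = [];
  for (const group of groupByParent(pieces, idProperty).values()) {
    merged.push(group.length === 1 ? group[0] : union(group));
  }
  return merged;
}
```

A call would look something like `mergePieces(fetchedPieces, "parentId", myUnion)`, where `fetchedPieces` is whatever getFeatures returned for the bounding box. Grouping first keeps the expensive union limited to features that were actually split.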

Full replacement solution would:

  • support both vector and raster
  • JS client - build on top of, or into, the Geoblaze and Flatgeobuf clients.
  • client-side cache - needs a unique key identifying the range of data fetched (byte range, for flatgeobuf?) and a way to read the data back out client-side. Should support reuse of the cache across lambda invocations via shared global memory.
  • Union - the previous implementation could do this very quickly. Consider options; Turf could certainly do it.
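As a rough sketch of the client-side cache idea, the following keeps results in a module-scope `Map`, which a warm Lambda container retains between invocations and loses on a cold start. Everything here is an assumption for illustration: the key is built from URL plus bounding box rather than a byte range, the actual fetch is injected as a function, and `cachedFetch`, `cacheKey`, `Fetcher`, and `BBox` are hypothetical names, not part of any existing client.

```typescript
// [minX, minY, maxX, maxY]
type BBox = [number, number, number, number];

// Hypothetical fetch function; a real one would wrap the Flatgeobuf or
// Geoblaze client call for the given URL and bounding box.
type Fetcher<T> = (url: string, bbox: BBox) => Promise<T>;

// Module scope: shared by all invocations served by the same warm container.
const cache = new Map<string, Promise<unknown>>();

// Assumption: URL + bbox identifies the fetched range well enough.
// A byte-range key would be an alternative for flatgeobuf.
function cacheKey(url: string, bbox: BBox): string {
  return `${url}|${bbox.join(",")}`;
}

function cachedFetch<T>(
  url: string,
  bbox: BBox,
  fetcher: Fetcher<T>
): Promise<T> {
  const key = cacheKey(url, bbox);
  const hit = cache.get(key);
  if (hit !== undefined) {
    return hit as Promise<T>;
  }
  // Store the promise itself so concurrent callers share one request.
  const pending = fetcher(url, bbox);
  cache.set(key, pending);
  // Do not keep failed requests in the cache; callers still see the rejection.
  pending.catch(() => {
    cache.delete(key);
  });
  return pending;
}
```

Caching the promise rather than the resolved value means two overlapping requests for the same range trigger only one fetch. The sketch has no eviction or size limit, which a real implementation would likely need given Lambda memory constraints.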

References:

@twelch twelch added this to the 8.0 milestone Sep 8, 2024
@twelch twelch changed the title from "Reimplement vector data workflow with subdivision/indexing/cache" to "Reimplement data workflow with subdivision/indexing/cache" Sep 8, 2024