Question: caching intermediate operations. #124
Comments
Hey @brendancol, I'd say caching could potentially help with the one polygon, many rasters scenario. We could cache the rasterized geometry to avoid re-rasterizing. Since rasterizing is a significant chunk of the work (roughly 20%?), that would likely be worth the memory footprint of storing them across raster bands. My work on multiband support has really stalled out (#73): there are design barriers internally and numpy behavior that make it difficult to implement cleanly. But caching rasterized geometries would make a good addition should it ever come to fruition.
We run into the same issue: we have 28 bands, which means the same rasterization is repeated 28 times. I'm willing to create a PR, but I'd like to get feedback on the idea before diving into the details.
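To make the many-bands case concrete, here is a minimal sketch of the caching idea, not rasterstats code. Everything in it is hypothetical: `rasterize_box` is a toy stand-in for the real rasterizer, the "geometry" is just an axis-aligned box in pixel coordinates, and bands are plain nested lists. The cache key includes the grid shape because a rasterized mask is only reusable for rasters on the same grid (in a real implementation the affine transform would need to be part of the key too).

```python
from statistics import mean


def rasterize_box(bbox, shape):
    """Toy stand-in for rasterization (hypothetical, not the rasterstats API).

    Burns an axis-aligned box (row0, col0, row1, col1) into a boolean grid.
    """
    rasterize_box.calls += 1  # count how often the expensive step runs
    row0, col0, row1, col1 = bbox
    rows, cols = shape
    return [
        [row0 <= r < row1 and col0 <= c < col1 for c in range(cols)]
        for r in range(rows)
    ]


rasterize_box.calls = 0

# Cache keyed on geometry AND grid; a mask is only valid for one grid.
_mask_cache = {}


def cached_mask(bbox, shape):
    key = (bbox, shape)
    if key not in _mask_cache:
        _mask_cache[key] = rasterize_box(bbox, shape)
    return _mask_cache[key]


def band_mean(band, bbox):
    """Mean of the band values that fall inside the rasterized geometry."""
    shape = (len(band), len(band[0]))
    mask = cached_mask(bbox, shape)
    return mean(
        value
        for band_row, mask_row in zip(band, mask)
        for value, inside in zip(band_row, mask_row)
        if inside
    )


# 28 synthetic 4x4 bands sharing one grid, and one zone.
bands = [[[b + r + c for c in range(4)] for r in range(4)] for b in range(28)]
zone = (1, 1, 3, 3)
means = [band_mean(band, zone) for band in bands]
```

With this arrangement the stand-in rasterizer runs once for all 28 bands; the cost is keeping one mask per (geometry, grid) pair in memory.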
@johanvdw @brendancol So if the optional mini_raster was supplied, we would skip the rasterization step? At a high level, that seems like a reasonable approach. You'd still have to call the …
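For discussion, a rough sketch of what "skip rasterization when a precomputed raster is supplied" could look like. This is an assumption about the shape of the interface, not the actual rasterstats signature: `zonal_sum`, its `mask` argument, and the toy `burn_box` rasterizer are all made up for illustration.

```python
def burn_box(bbox, shape):
    """Toy rasterizer (hypothetical): boolean grid for a box in pixel coords."""
    burn_box.calls += 1  # track how often rasterization actually happens
    row0, col0, row1, col1 = bbox
    rows, cols = shape
    return [
        [row0 <= r < row1 and col0 <= c < col1 for c in range(cols)]
        for r in range(rows)
    ]


burn_box.calls = 0


def zonal_sum(band, bbox, mask=None):
    """Sum band values inside the zone.

    If the caller supplies a precomputed mask, the rasterization step is
    skipped. The mask is returned so it can be passed to the next call.
    """
    if mask is None:
        mask = burn_box(bbox, (len(band), len(band[0])))
    total = sum(
        value
        for band_row, mask_row in zip(band, mask)
        for value, inside in zip(band_row, mask_row)
        if inside
    )
    return total, mask


band_a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
band_b = [[10 * v for v in row] for row in band_a]
zone = (0, 0, 2, 2)

sum_a, zone_mask = zonal_sum(band_a, zone)             # rasterizes once
sum_b, _ = zonal_sum(band_b, zone, mask=zone_mask)     # reuses the mask
```

The caller owns the mask here, which avoids a hidden global cache but puts the burden on the caller to only reuse it with rasters on the same grid.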
Hey hey. Great stuff.
Question:
When using python-rasterstats with one polygon and many rasters (or vice versa), do you see a clear spot where intermediate steps can be cached? Examples: the rasterization of the polygon, or the reading of the value raster?