Statistics operation load the entire raster in memory #187
Comments
The original stats ops from JAI compute statistics tile by tile, in a loop, so they are not affected by this issue. A drawback of the fix is that the stats computation will switch from parallel (using all tile scheduler threads) to sequential. On a busy server handling multiple requests that's a good thing; a batch job with the whole machine to itself might slow down, though.
Should we add a parameter to control this behavior, such as lazy vs. eager computation (defaulting to lazy)?
It's not lazy vs. eager, it's parallel vs. sequential, but still done all in one shot in the getProperty call. Long story short, it seems the operation would need its own local, temporary thread pool to have some control over the computation. So the new parameter could indeed be the number of threads used to compute the stats, defaulting to one.
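To make the thread pool idea concrete, here is a minimal sketch, not the project's actual code. The class name `PooledTileStats`, the `threads` parameter and the `{min, max, sum, count}` result layout are all hypothetical; only the `java.awt.image` and `java.util.concurrent` calls are real APIs. Each task reduces one tile to a small partial result, so no `Raster[]` is ever built, and `threads = 1` gives the sequential behavior.

```java
import java.awt.Rectangle;
import java.awt.image.Raster;
import java.awt.image.RenderedImage;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper, for illustration only.
final class PooledTileStats {

    /** Returns {min, max, sum, count} for one band, using a temporary pool of the given size. */
    static double[] compute(RenderedImage image, int band, int threads) throws Exception {
        Rectangle bounds = new Rectangle(
                image.getMinX(), image.getMinY(), image.getWidth(), image.getHeight());
        // Local pool: created for this computation only, never shared.
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, threads));
        try {
            List<Future<double[]>> partials = new ArrayList<>();
            int minTx = image.getMinTileX();
            int minTy = image.getMinTileY();
            for (int ty = minTy; ty < minTy + image.getNumYTiles(); ty++) {
                for (int tx = minTx; tx < minTx + image.getNumXTiles(); tx++) {
                    final int fx = tx;
                    final int fy = ty;
                    // The tile is fetched inside the task and dropped when the task ends.
                    Callable<double[]> task = () -> tileStats(image.getTile(fx, fy), bounds, band);
                    partials.add(pool.submit(task));
                }
            }
            // Merge the small per-tile results.
            double[] total = {Double.POSITIVE_INFINITY, Double.NEGATIVE_INFINITY, 0, 0};
            for (Future<double[]> f : partials) {
                double[] p = f.get();
                total[0] = Math.min(total[0], p[0]);
                total[1] = Math.max(total[1], p[1]);
                total[2] += p[2];
                total[3] += p[3];
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    private static double[] tileStats(Raster tile, Rectangle bounds, int band) {
        double[] s = {Double.POSITIVE_INFINITY, Double.NEGATIVE_INFINITY, 0, 0};
        // Edge tiles may extend past the image, so clip to the image bounds.
        Rectangle r = tile.getBounds().intersection(bounds);
        if (r.isEmpty()) {
            return s;
        }
        double[] samples = tile.getSamples(r.x, r.y, r.width, r.height, band, (double[]) null);
        for (double v : samples) {
            s[0] = Math.min(s[0], v);
            s[1] = Math.max(s[1], v);
            s[2] += v;
        }
        s[3] = samples.length;
        return s;
    }
}
```

With a fixed pool, at most `threads` tiles are being computed at any moment, which gives the caller control that the shared tile scheduler does not.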
Let's leave this open and get back to it in time. At least we've patched it for the time being.
Both simple and zonal stats compute the statistics by aggregating per-tile results, and the property calculation calls getTiles() to make that happen.
It's a concise way to compute all tiles; however, it also means all rasters are computed and retained, since getTiles() returns a Raster[]. In other words, the entire raster gets loaded into memory.
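For comparison, a tile-by-tile loop could look roughly like the sketch below. It is an illustration, not the patch that was applied: the class name `TileLoopStats` and the `{min, max, sum, count}` result layout are made up here, and it only uses the standard `java.awt.image.RenderedImage` tile accessors. The point is that only one `Raster` is referenced at a time, instead of the whole `Raster[]`.

```java
import java.awt.Rectangle;
import java.awt.image.Raster;
import java.awt.image.RenderedImage;

// Hypothetical helper, for illustration only.
final class TileLoopStats {

    /** Returns {min, max, sum, count} for one band, visiting one tile at a time. */
    static double[] compute(RenderedImage image, int band) {
        Rectangle bounds = new Rectangle(
                image.getMinX(), image.getMinY(), image.getWidth(), image.getHeight());
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        double sum = 0;
        long count = 0;

        int minTx = image.getMinTileX();
        int minTy = image.getMinTileY();
        for (int ty = minTy; ty < minTy + image.getNumYTiles(); ty++) {
            for (int tx = minTx; tx < minTx + image.getNumXTiles(); tx++) {
                // Only this tile is referenced here; once the iteration ends
                // it can be garbage collected or evicted from a tile cache.
                Raster tile = image.getTile(tx, ty);
                // Edge tiles may extend past the image, so clip to the image bounds.
                Rectangle r = tile.getBounds().intersection(bounds);
                if (r.isEmpty()) {
                    continue;
                }
                double[] samples =
                        tile.getSamples(r.x, r.y, r.width, r.height, band, (double[]) null);
                for (double v : samples) {
                    min = Math.min(min, v);
                    max = Math.max(max, v);
                    sum += v;
                }
                count += samples.length;
            }
        }
        return new double[] {min, max, sum, count};
    }
}
```

Peak memory is then bounded by roughly one tile plus its sample array, at the cost of computing the tiles one after another on the calling thread.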