You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Describe the bug
Issue occurs when calling the zonal_stats function and passing a numpy array and affine, rather than a geo-referenced raster. If you have previously used numpy.ma to mask a portion of the data, that portion of the data is still included in stats calculation. I was expecting that pixels masked in this manner would be ignored during stats calculation, in the same way that nodata pixels are.
My current workaround is to set these pixels explicitly to the nodata value, which then results in them being discounted accordingly, but this is a little in-elegant, and more to the point I think the expectation is that, if zonal_stats supports numpy data, it should respect any masks already applied. This was certainly my expectation, and I've seen one or two other reports online of people being caught out by this.
To Reproduce
The below example shows roughly how to reproduce. Naturally it's trivial, and you would never do this in anger, you would just pass the raster directly to the zonal_stats function. Assume in any actual example, we are doing something else to the raster data in numpy-land other than simply reading it and masking it. Note that it does require you provide your own test data (a raster and intersecting geometry).
importrasteriofromrasterstatsimportzonal_statsfromnumpyimportmawithrasterio.open("somerasterdataset.tif") asmy_raster:
my_data=my_raster.read(1) # 8-bit, unsigned int, values 0-255nodata=my_raster.nodataaffine=my_raster.transformmasked_data=ma.masked_greater(my_data, 100)
stats=zonal_stats(
geometry,
masked_data,
affine=affine,
nodata=nodata,
stats=['mean', 'count']
all_touched=True,
) # Still includes pixels > 100, so mean is too highmy_data[my_data>100] =nodatastats=zonal_stats(
geometry,
my_data,
affine=affine,
nodata=nodata,
stats=['mean', 'count']
all_touched=True,
) # Works as I expect
The text was updated successfully, but these errors were encountered:
Describe the bug
Issue occurs when calling the
zonal_stats
function and passing a numpy array and affine, rather than a geo-referenced raster. If you have previously used numpy.ma to mask a portion of the data, that portion of the data is still included in stats calculation. I was expecting that pixels masked in this manner would be ignored during stats calculation, in the same way that nodata pixels are.My current workaround is to set these pixels explicitly to the nodata value, which then results in them being discounted accordingly, but this is a little in-elegant, and more to the point I think the expectation is that, if zonal_stats supports numpy data, it should respect any masks already applied. This was certainly my expectation, and I've seen one or two other reports online of people being caught out by this.
To Reproduce
The below example shows roughly how to reproduce. Naturally it's trivial, and you would never do this in anger, you would just pass the raster directly to the zonal_stats function. Assume in any actual example, we are doing something else to the raster data in numpy-land other than simply reading it and masking it. Note that it does require you provide your own test data (a raster and intersecting geometry).
The text was updated successfully, but these errors were encountered: