Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

handling remote files (or setting rasterio/gdal context) #208

Open
knaaptime opened this issue Jan 29, 2020 · 2 comments
Open

handling remote files (or setting rasterio/gdal context) #208

knaaptime opened this issue Jan 29, 2020 · 2 comments

Comments

@knaaptime
Copy link

Hi,
thanks for this great package. I'm wondering if it's possible to pass an open raster (as opposed to simply the path) to zonalstats (without converting to ndarray first).

My use case is that I'm reading a raster file from a public s3 bucket, which requires the following rasterio context:

with rasterio.env.Env(aws_unsigned=True):
	r = rasterio.open("s3://path-to-raster.tif)

which works great by itself. But now I need to calculate zonal stats and I cannot make rs.zonal_stats use the rasterio environment. Would it be possible to modify the rs.zonal_stats so that it accepts an open rasterio.DatasetReader? (or allow passing the context)

e.g.

with rasterio.env.Env(aws_unsigned=True):
	open_raster = rasterio.open("s3://path-to-raster)
	zonal_gjson = rs.zonal_stats(
	                geodataframe, open_raster, prefix="Type_", geojson_out=True, categorical=True
	            )
@sdtaylor
Copy link
Contributor

sdtaylor commented Jan 30, 2021

I've been doing s3 stuff with rasterio and it handles an s3 path without the need for a context mananger, as long as it's a free bucket. rasterstats doesn't change the path at all so it seems to work fine without any adjustment

import rasterstats
from shapely.geometry import Polygon

s3_path = 's3://landsat-pds/L8/139/045/LC81390452014295LGN00/LC81390452014295LGN00_B1.TIF'
p=Polygon([(465764.5, 2428342.0), (465104.1, 2404237.9), (492840.3, 2402586.9)])
rasterstats.zonal_stats(p, s3_path)
Out[4]: [{'min': 10150.0, 'max': 13163.0, 'mean': 10490.125291650986, 'count': 372020}]

It can be wrapped in a special rasterio session for a requester pays bucket too. Like with this naip imagery.

from rasterio.session import AWSSession
import rasterio
import rasterstats
from shapely.geometry import Polygon

p=Polygon([(525794.7, 4876393.3), (525711.3, 4875371.7),(526910.1, 4875413.4)])
s3_path = 's3://naip-visualization/wy/2019/60cm/rgb/44104/m_4410459_se_13_060_20190828.tif'

session = AWSSession(requester_pays=True)
with rasterio.Env(session):
        stats = rasterstats.zonal_stats(p, s3_path)

stats
[{'min': 59.0, 'max': 220.0, 'mean': 118.68374314241665, 'count': 1696115}]

rasterstats only reads with the rasterio window method, so as long as the s3 files are cloud optimized geotiffs then it will not download the full raster.

@dennisgsmith
Copy link

dennisgsmith commented Mar 17, 2022

For anyone looking to do this in a mock S3 tool like minio, I'll save you a google search:

rasterio/rasterio#1293 (comment)

Just add the following variables to your rasterio.Env:

import rasterstats
import rasterio
from shapely.geometry import Polygon

s3_path = 's3://landsat-pds/L8/139/045/LC81390452014295LGN00/LC81390452014295LGN00_B1.TIF'
p = Polygon([(465764.5, 2428342.0), (465104.1, 2404237.9), (492840.3, 2402586.9)])

with rasterio.Env(AWS_HTTPS='NO', GDAL_DISABLE_READDIR_ON_OPEN='YES', AWS_VIRTUAL_HOSTING=False, AWS_S3_ENDPOINT='localhost:9000'):
    stats = rasterstats.zonal_stats(p, s3_path)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants