Discover how state-of-the-art geospatial vision models can help your organization unlock the power of earth observation data for ecosystem monitoring.
Geospatial Foundation Models (GeoFMs) are Vision Transformers (ViTs) tailored for geospatial data. These large, generative ML models are pre-trained on massive geospatial imagery datasets, which makes them general-purpose models that can be applied to a wide range of geospatial applications, including:
- Semantic Search: Quickly map any surface type with a RAG-inspired geospatial embedding search
- Change Detection: Analyze time series of geospatial embeddings to identify surface disruptions over time
- Fine-tuning: Fine-tune a regression, classification or segmentation model for specialized ML tasks (requires labeled data)
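The semantic search idea can be illustrated with a minimal sketch: every image chip is represented by an embedding vector, and "similar surface" means "nearby vector". The example below uses cosine similarity over a small in-memory dictionary. The chip IDs, the three-dimensional toy vectors, and the function names are assumptions made for illustration; the demo itself delegates this lookup to a vector database.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)


def search_similar(query_id, embeddings, k=3):
    """Return the k chip IDs most similar to the query chip, best first."""
    query = embeddings[query_id]
    scored = [
        (cosine_similarity(query, vec), chip_id)
        for chip_id, vec in embeddings.items()
        if chip_id != query_id  # do not return the query chip itself
    ]
    scored.sort(reverse=True)
    return [chip_id for _, chip_id in scored[:k]]


# Hypothetical toy embeddings; real GeoFM embeddings are far higher-dimensional.
toy_embeddings = {
    "forest_a": [0.9, 0.1, 0.0],
    "forest_b": [0.8, 0.2, 0.1],
    "river": [0.1, 0.9, 0.2],
    "cleared": [0.0, 0.2, 0.9],
}

print(search_similar("forest_a", toy_embeddings, k=2))
```

A brute-force scan like this is only practical for small collections; at scale, an approximate nearest-neighbor index in a vector store plays the same role.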
This demonstrator showcases the power of GeoFMs for ecosystem monitoring. It is powered by Clay-v01, a state-of-the-art GeoFM available on HuggingFace. The satellite imagery used in this demo comes from ESA's Sentinel-2 mission and is hosted on the AWS Registry of Open Data.
For illustrative purposes, this solution uses the example of monitoring deforestation in the Amazon rainforest. However, it can be readily adapted to other use cases. The Amazon rainforest is one of the most biodiverse ecosystems in the world and is considered critical in tackling climate change. Yet there is evidence that the Amazon forest system could soon reach a tipping point, leading to large-scale collapse. GeoFMs offer a new and powerful technology for mapping the earth's surface at continental scale, giving stakeholders the tooling to detect and monitor ecosystem change such as forest degradation.
The application follows a three-tiered architecture comprising a Frontend, an AI/ML and Analytics Tier, and a Data Tier. Additionally, it incorporates two asynchronous SageMaker pipelines for embedding generation and GeoFM fine-tuning. The main components are:
- Solara Frontend - A React web application built with Solara, hosted on AWS Fargate, and served through the CloudFront CDN. Authentication is handled by Amazon Cognito.
- Serve Geospatial Imagery Lambda - Provides a tile server using TiTiler with Amazon ElastiCache for low-latency geospatial imagery serving.
- Search Similar Regions function - Performs similarity searches based on a chip ID and vector search parameters, utilizing Amazon OpenSearch Service.
- Detect Change function - Conducts change detection by analyzing time series of embedding vectors retrieved from Amazon OpenSearch Service.
- Geo Embedding Generation Pipeline - A SageMaker Pipeline for retrieving and preprocessing Sentinel-2 data and generating geospatial embeddings using a pre-trained GeoFM from HuggingFace Hub. These embeddings are then loaded into a LanceDB vector database to enable semantic search.
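To make the change detection component more concrete, here is a minimal sketch of one possible approach: treat the earliest embedding of a chip as a baseline and flag later observations whose cosine distance from that baseline exceeds a threshold. The baseline strategy, the threshold value, the dates, and the function names are all assumptions for illustration; the demo's actual Detect Change logic may differ.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; 0 means same direction, larger means more change."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0


def detect_change(series, threshold=0.2):
    """Flag observations that drift away from the first (baseline) embedding.

    series: list of (date, embedding) tuples, sorted by date.
    Returns a list of (date, distance) for observations above the threshold.
    """
    baseline = series[0][1]
    flagged = []
    for date, vec in series[1:]:
        distance = cosine_distance(baseline, vec)
        if distance > threshold:
            flagged.append((date, round(distance, 3)))
    return flagged


# Hypothetical time series for a single chip (toy 3-dimensional embeddings).
chip_series = [
    ("2020-01", [0.90, 0.10, 0.00]),
    ("2020-07", [0.88, 0.12, 0.02]),  # nearly unchanged
    ("2021-01", [0.30, 0.20, 0.80]),  # surface looks different
    ("2021-07", [0.10, 0.20, 0.90]),
]

for date, distance in detect_change(chip_series):
    print(date, distance)
```

In practice the threshold would need tuning per region and season, since cloud cover and seasonal vegetation cycles also shift embeddings.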
To get started, perform the following steps:
- Run Geospatial ML Pipeline for your area of interest:
  - Navigate to the `sagemaker-pipelines` folder.
  - Follow the steps laid out in the respective notebooks in the `prerequisites` folder:
    - Build and push the `clay_gpu_docker_image` to ECR
    - Build and push the `geospatial_processing_image` to ECR
  - Go to the `embedding_generation` folder and run `embedding_generation_pipeline.ipynb` to instantiate and execute the SageMaker Pipeline for your Area of Interest (AOI)
  - Wait until the pipeline has completed successfully
  - Retrieve the `config.json` file from the output bucket. You can find it at the following path of the main data bucket named `aws-geofm-data-bucket-{AWS_ACCOUNT_NUMER}-{AWS_REGION}-{env_name}`:

        output/
        └── <aoi_name>/
            └── consolidated-output/
                └── <MGRS_grid>/
                    └── config_<aoi_name>.json

- Deploy the UI stack:
  - Navigate to the UI folder
  - Update the config file by pasting the contents of the previously retrieved `config.json`
  - Follow the instructions from the `ui/geofm-demo-stack/README.md` file to deploy the UI CDK Stack
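Retrieving the pipeline's config file can also be scripted. The sketch below builds the bucket name and object key from the naming pattern described in the steps above and downloads the file with boto3. The helper names and the example account ID, region, environment name, AOI name, and MGRS grid value are hypothetical placeholders; substitute the values from your own deployment.

```python
def data_bucket_name(account_id, region, env_name):
    """Bucket name following the pattern given in the steps above."""
    return f"aws-geofm-data-bucket-{account_id}-{region}-{env_name}"


def config_key(aoi_name, mgrs_grid):
    """Object key of the consolidated config file for one AOI and MGRS grid."""
    return (
        f"output/{aoi_name}/consolidated-output/"
        f"{mgrs_grid}/config_{aoi_name}.json"
    )


def download_config(account_id, region, env_name, aoi_name, mgrs_grid,
                    dest="config.json"):
    """Download the config file to a local path (requires AWS credentials)."""
    import boto3  # imported lazily so the path helpers work without boto3

    s3 = boto3.client("s3")
    s3.download_file(
        data_bucket_name(account_id, region, env_name),
        config_key(aoi_name, mgrs_grid),
        dest,
    )
    return dest


# Hypothetical example values, for illustration only.
print(data_bucket_name("123456789012", "us-east-1", "dev"))
print(config_key("my-aoi", "20LMR"))
```

Calling `download_config(...)` with your own values writes the file locally, after which its contents can be pasted into the UI config file as described in the deployment steps.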
The solution uses Authorization@Edge for serverless authorization of viewers using Amazon CloudFront, Lambda@Edge and Amazon Cognito. Since Lambda@Edge replicas are usually deleted within a few hours, you’ll need to wait that long after deleting the solution before you can manually delete the associated Lambda function from AWS Lambda.
- Karsten Schroer
- Bishesh Adhikari
- Iza Moise