
A collection of scripts for processing imagery



linz/topo-imagery


Topo Imagery


Description

This is a collection of Python scripts used for processing topographic data in and for the cloud (AWS).

The associated Docker container is provided to run the Python scripts, which depend on the GDAL library. It is based on the osgeo/gdal:ubuntu-small-* Docker image.

The Docker container is available in GitHub Packages.

Usage

The scripts are designed to be run inside the Docker container only, mainly because of the GDAL dependency.

Local

  • Build the Docker image:
docker build --tag=topo-imagery .
  • Run the standardise_validate.py script

This script standardises TIFF files to COGs and creates one STAC Item file per TIFF containing its metadata. The input TIFF file paths have to be passed through a JSON file in the following format:

[
  {
    "output": "tile_name",
    "input": ["./path/to/file.tiff"]
  }
]

where output is the desired output tile name and input is the list of paths to one or more TIFFs. If more than one TIFF is given, the system will try to retile them into a single output file.
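When there are many tiles, the input file can be generated rather than written by hand. The following is a minimal sketch, assuming only the format shown above; the helper name `write_input_file` and its arguments are hypothetical and are not part of this repository.

```python
import json
from pathlib import Path


def write_input_file(tiles: dict[str, list[str]], destination: Path) -> None:
    """Write a tile-name -> TIFF-paths mapping in the list-of-objects layout.

    Hypothetical helper for illustration; not part of topo-imagery.
    """
    entries = []
    for tile_name, tiff_paths in tiles.items():
        # Each output tile needs at least one input TIFF.
        if not tiff_paths:
            raise ValueError(f"tile {tile_name!r} has no input TIFF")
        entries.append({"output": tile_name, "input": list(tiff_paths)})
    destination.write_text(json.dumps(entries, indent=2), encoding="utf-8")
```

For example, `write_input_file({"tile_name": ["./path/to/file.tiff"]}, Path("input.json"))` produces a file equivalent to the sample above, which can then be passed with --from-file.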

Some test data are available in /scripts/tests/data/ along with the expected output.

Run docker run topo-imagery python standardise_validate.py --help to get the list of the expected arguments.

  • Example of local execution. This example uses the test data available in this repo and creates the output in ~/tmp/ on the local machine (volume shared with Docker):
docker run -v ${HOME}/tmp/:/tmp/:rw topo-imagery python standardise_validate.py --preset webp --from-file ./tests/data/aerial.json --collection-id 123 --start-datetime 2023-01-01 --end-datetime 2023-01-01 --target /tmp/ --source-epsg 2193 --target-epsg 2193 --gsd 10 --create-footprints=true

To use an AWS test dataset (input located in an AWS S3 bucket), log into the AWS account and add the following arguments to the docker run command:

-v ${HOME}/.aws/credentials:/root/.aws/credentials:ro -e AWS_PROFILE=your-profile
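The position of the extra arguments matters: docker run options must come before the image name. As a rough sketch, the command can be assembled in Python so that the AWS options are only added when needed. The function `docker_run_command` is hypothetical; every flag and value in it is copied from the example above, and the sketch only builds the argument list without running anything.

```python
from pathlib import Path
from typing import Optional


def docker_run_command(aws_profile: Optional[str] = None) -> list[str]:
    """Build the example `docker run` argument list (hypothetical helper)."""
    home = Path.home()
    # Share ~/tmp/ with the container so the output lands on the local machine.
    command = ["docker", "run", "-v", f"{home}/tmp/:/tmp/:rw"]
    if aws_profile is not None:
        # Docker options go before the image name.
        command += [
            "-v", f"{home}/.aws/credentials:/root/.aws/credentials:ro",
            "-e", f"AWS_PROFILE={aws_profile}",
        ]
    command += [
        "topo-imagery", "python", "standardise_validate.py",
        "--preset", "webp",
        "--from-file", "./tests/data/aerial.json",
        "--collection-id", "123",
        "--start-datetime", "2023-01-01",
        "--end-datetime", "2023-01-01",
        "--target", "/tmp/",
        "--source-epsg", "2193",
        "--target-epsg", "2193",
        "--gsd", "10",
        "--create-footprints=true",
    ]
    return command
```

The resulting list can be passed to `subprocess.run(docker_run_command("your-profile"), check=True)`. For an AWS dataset, the --from-file value would point to a JSON file listing the S3 inputs instead of the local test data.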

In the cloud

This package is designed to be run in a Kubernetes cluster using a workflow system. More information can be found in the linz/topo-workflows repository.

Versioning

GitHub Actions automatically handles publishing a container to the GitHub Package Registry (ghcr) and to a private AWS Elastic Container Registry (ECR).

A new container is published every time a change is merged to the master branch. This container will be tagged with the following:

  • latest
  • github version (example: v1.1.0-2-ga1154e8)

A new container is also published when a release is merged to master (see the Releases section below). This container will be tagged with the following:

  • latest
  • vX (example: v1)
  • vX.Y (example: v1.2)
  • vX.Y.Z (example: v1.2.4)
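
The release tag scheme listed above can be illustrated with a short sketch. This is not the logic used by the GitHub Actions workflow, which is not shown here; the function `release_tags` is a hypothetical helper that assumes the release version has the form vX.Y.Z.

```python
import re


def release_tags(version: str) -> list[str]:
    """Return the container tags expected for a release version such as v1.2.4.

    Hypothetical helper for illustration; assumes a plain vX.Y.Z version.
    """
    match = re.fullmatch(r"v(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"expected a version like v1.2.4, got {version!r}")
    major, minor, patch = match.groups()
    return [
        "latest",
        f"v{major}",
        f"v{major}.{minor}",
        f"v{major}.{minor}.{patch}",
    ]
```

With this sketch, `release_tags("v1.2.4")` returns `["latest", "v1", "v1.2", "v1.2.4"]`, so pinning to v1 follows every release of that major version while v1.2.4 stays fixed.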

You can see the tags in the GitHub Packages page.

Releases

Managing

googleapis/release-please is used to support the release process. Based on what has been merged to master (fix, feat, feat!, fix! or refactor!), it generates a changelog from the commit messages and creates a Pull Request. This is triggered by this GitHub Action.
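
To check locally whether a commit header uses one of the types listed above, a simple pattern match is enough. The sketch below is an assumption-laden illustration and not release-please's own parsing: the function `matches_listed_types` is hypothetical, and it only recognises the Conventional Commits header shape `type(scope)!: description`.

```python
import re

# Conventional Commits header: type, optional (scope), optional "!", then ": description".
_HEADER = re.compile(r"^(?P<type>[a-z]+)(?:\([^)]*\))?(?P<breaking>!)?: .+")


def matches_listed_types(commit_header: str) -> bool:
    """Return True for fix, feat, fix!, feat! or refactor! headers.

    Hypothetical helper; it mirrors the list in this README, not release-please itself.
    """
    match = _HEADER.match(commit_header)
    if match is None:
        return False
    commit_type = match.group("type")
    breaking = match.group("breaking") is not None
    if commit_type in {"fix", "feat"}:
        return True
    # refactor is only listed in its breaking form (refactor!).
    return commit_type == "refactor" and breaking
```

For instance, `matches_listed_types("feat: add footprints")` and `matches_listed_types("refactor!: rename arguments")` are true, while a plain `refactor:` or `docs:` header is not.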

Publishing

To publish a release, the Pull Request opened by the release-please bot needs to be merged:

  1. Open the PR and verify that the CHANGELOG contains what you expect in the release. If the latest change you expect is not there, check that a GitHub Actions workflow is not still running and has not failed.
  2. Approve and merge the PR.
  3. Once the Pull Request is merged to master, a GitHub Action creates the release and publishes a new container tagged for this release.