Atlas of Living Australia #4918
Labels
💻 aspect: code
Concerns the software code in the repository
🌟 goal: addition
Addition of new feature
🟩 priority: low
Low priority and doesn't need to be rushed
☁️ provider: audio
Audio provider
☁️ provider: images
Image provider
🧱 stack: catalog
Related to the catalog and Airflow DAGs
Source API Endpoint / Documentation
https://support.ala.org.au/support/solutions/articles/6000196714-how-to-download-occurrence-records
Provider description
Atlas of Living Australia (ALA) aggregates open datasets from several sources around Australia. If you exclude iNaturalist Australia, they have over 2,000,000 images and nearly 40,000 sounds.
I don't know how many of those are openly licensed, but at a quick glance, every individual record I clicked on had some variation of a CC license. According to their image-specific search tool, only 4003 images are all rights reserved. A large number have "unrecognised" licenses, but here is an example of one that has a CC license URI in the rights field: https://images.ala.org.au/image/8486cc13-4da9-4dd3-a0f4-5a3d1feea1dc
Some of the "unrecognised" records simply do not have a license listed. I suspect the vast majority have CC license URIs though.
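Since many "unrecognised" records seem to carry a CC license URI in the rights field, a small parser could recover the license from that text. The following is only a sketch: the function name `parse_cc_license` is hypothetical, and it assumes the rights value is a free-text string that may or may not contain a creativecommons.org URL.

```python
import re

# Matches URLs such as https://creativecommons.org/licenses/by-nc/4.0/
# and http://creativecommons.org/publicdomain/zero/1.0/
CC_URI = re.compile(
    r"creativecommons\.org/(?:licenses|publicdomain)/"
    r"(?P<slug>[a-z\-]+)(?:/(?P<version>\d+\.\d+))?",
    re.IGNORECASE,
)


def parse_cc_license(rights):
    """Return (license_slug, version) from a rights string, or None.

    `rights` is assumed to be whatever text ALA exposes in the rights
    field; anything without a CC URL is treated as unrecognised.
    """
    if not rights:
        return None
    match = CC_URI.search(rights)
    if match is None:
        return None
    return match.group("slug").lower(), match.group("version")
```

Records for which this returns None (no license listed, or non-CC text) would still need to be skipped or reviewed by hand.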
Licenses Provided
CC licences
Provider API Technical info
The organisation of data from ALA is similar to Europeana in that it's a collection of other sources, but is also a source itself.
There is an API, here's an example (page size set to 1): https://biocache-ws.ala.org.au/ws/occurrences/search?q=*%3A*&disableAllQualityFilters=true&qualityProfile=ALA&fq=multimedia%3A%22Image%22&fq=-data_resource_uid%3A%22dr1411%22&qc=-_nest_parent_%3A*&pageSize=1
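For reference, the example URL above can be rebuilt from its individual query parameters. This is a minimal sketch; `build_search_url` is a hypothetical helper, and the parameter names and values are copied from the example rather than taken from the API documentation.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://biocache-ws.ala.org.au/ws/occurrences/search"


def build_search_url(page_size=1):
    """Rebuild the example occurrence search URL with a given page size."""
    params = [
        ("q", "*:*"),
        ("disableAllQualityFilters", "true"),
        ("qualityProfile", "ALA"),
        # Only records with image multimedia attached.
        ("fq", 'multimedia:"Image"'),
        # Exclude data resource dr1411, as in the example URL.
        ("fq", '-data_resource_uid:"dr1411"'),
        ("qc", "-_nest_parent_:*"),
        ("pageSize", page_size),
    ]
    # A list of pairs (not a dict) keeps the repeated "fq" key.
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"
```

Note that `urlencode` percent-encodes `*` as `%2A`, so the string differs slightly from the hand-written URL while decoding to the same query.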
However, I think more powerful is the fact that they offer bulk downloads of individual queries. If you visit the "advanced search" page for the above query (https://biocache.ala.org.au/occurrence/search?q=*%3A*&disableAllQualityFilters=true&qualityProfile=ALA&fq=multimedia%3A%22Image%22&qc=-_nest_parent_%3A*&fq=-data_resource_uid%3A%22dr1411%22), there is a download button, which lets you export a CSV. The "download" is asynchronous, in that you trigger an export on their end, they generate a zip, and then you get back a link later.
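The asynchronous export described above could be handled with a polling step followed by reading the CSV out of the zip. The sketch below is deliberately network-free: `get_status` is a caller-supplied callable, and the `"status"`, `"finished"`, and `"downloadUrl"` names are assumptions for illustration, not confirmed fields of the ALA API.

```python
import csv
import io
import time
import zipfile


def wait_for_download(get_status, poll_seconds=30.0, max_polls=120,
                      sleep=time.sleep):
    """Poll until the export is ready and return its download link.

    `get_status` returns a dict describing the export. The keys
    "status" and "downloadUrl" and the value "finished" are assumed
    names; check them against the real API response.
    """
    for _ in range(max_polls):
        status = get_status()
        if status.get("status") == "finished":
            return status["downloadUrl"]
        sleep(poll_seconds)
    raise TimeoutError("export was not ready after polling")


def iter_csv_rows(zip_bytes):
    """Yield each row of every CSV file inside the zip as a dict."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue  # skip any non-CSV files bundled in the zip
            with archive.open(name) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                yield from csv.DictReader(text)
```

In an Airflow DAG the polling would more likely live in a sensor than in a blocking loop; the loop here is only meant to make the control flow explicit.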
The API for that is documented here: https://docs.ala.org.au/openapi/index.html?urls.primaryName=occurrences#/Download
We'd need a DAG that completes this flow:
ALA has their own image proxying with various sizes of thumbnails.
Note that each "occurrence" may have more than one image! The "occurrenceID" only points to the "main" image, I think? The other UUIDs in "images" all have proxied image URLs provided by ALA and are distinct on the ones that I saw this happening for.
Checklist to complete before beginning development
Implementation