Data Catalogue

Introduction

To solve the 2025 hackathon data challenge, announced on the evening of 6 March 2025, you should use open Earth Observation data available in the Copernicus Data Space Ecosystem (‘hackathon primary datasets’). You can complement them with:

  • European Statistics datasets made available on the hackathon data platform for all hackathon teams (‘hackathon auxiliary datasets’),
  • Your own dataset, which should be representative of data available in other EU countries and fit within the size limits defined for the hackathon datasets (‘own hackathon datasets’).

The purpose of this data catalogue is to support hackathon participants in their preparation for the event.

The data catalogue has been prepared by Eurostat, in collaboration with and based on the material developed by DG DEFIS and the European Space Agency. Please note that this is the final version of the data catalogue.

Structure of this Data Catalogue

  1. Quick recap on the use of Earth Observation Data for official statistics
  2. Earth Observation Data available at the Copernicus Data Space Ecosystem (CDSE)
  3. Auxiliary Hackathon Data
  4. Teams’ Own Hackathon Datasets
  5. How to set up an individual CDSE account?
  6. Link to the hackathon platform documentation with sample codes for data access and processing

1: Quick Recap on the Use of Earth Observation Data for Official Statistics

Earth Observation data show substantial potential for producing new, more timely and more granular statistical outputs and for reducing response burden. However, using these data to produce statistics requires new methods and tools, which in turn call for new skills and competencies from statisticians and close collaboration with data scientists, geospatial agency experts and researchers.

To be used for official statistics purposes, Earth Observation data require preprocessing along the value-added chain, as presented in the diagram and table below. All hackathon teams need to reflect carefully on which Earth Observation data inputs available on the CDSE they will work with and which statistics production stage(s) their application needs to address, taking into account the data challenge, the event’s time constraints and the resource capacity limits.

[Figures: Slide1 and Slide2, showing the value-added chain as a diagram and a table]

2: Earth Observation Data Available at the Copernicus Data Space Ecosystem (CDSE) – Hackathon primary datasets

The data collections available on the CDSE and CREO-DIAS platforms are structured in the following categories:

This document is intended to give an impression of the data available. For full information, please refer to the CDSE and CREO-DIAS documentation.

Sentinel Data:

For the operational needs of the Copernicus programme, several Sentinel satellites have been developed and launched. They specialize in different Earth Observation services and produce diverse sets of data (see footnote 1) with the following focus:

| Sentinel Mission | Focus | Products and Interesting Applications | More Information |
| --- | --- | --- | --- |
| Sentinel-1 | High-resolution images of all landmasses, coastal zones, and shipping routes worldwide; vignettes of the global ocean | Radar data; monthly mosaics (identification of built-up area, water presence); a consistent long-term data archive for applications based on long time series | S1 Mission; Interactively explore Sentinel-1, Sentinel-1 monthly mosaics |
| Sentinel-2 | Land services, including the monitoring of vegetation, soil and water cover, as well as the observation of inland waterways and coastal areas | Optical data from 2015 onwards; quarterly mosaics (identification of vegetation, green areas, arable land, agriculture activities, forests, water and wetness, green roofs); land monitoring, agriculture, emergency management, risk mapping, security, forestry, climate change, disaster control, marine and humanitarian relief operations | S2 Mission; Interactively explore Sentinel-2, vegetation indices, Sentinel-2 quarterly mosaics |
| Sentinel-3 | Sea surface topography, sea and land surface temperature, and ocean and land surface color with high accuracy and reliability | Support to ocean forecasting systems, environmental monitoring, and climate monitoring | S3 Mission; Interactively explore Sentinel-3 OLCI, OLCI Land and SLSTR |
| Sentinel-5P | Daily global observations of key atmospheric constituents | Monitoring and forecasting air quality, the ozone layer, and climate change | S5P Mission; Interactively explore Sentinel-5p weekly NO2 mosaic |

:::{Note} You will find several Jupyter Notebook examples on how to extract information from these data collections here: https://documentation.dataspace.copernicus.eu/Usecase.html :::
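As a complement to the notebooks linked above, the following minimal sketch shows how a search request against the CDSE OData catalogue could be assembled. It only builds the request URL and does not contact the service. The function name, the example area of interest and the dates are illustrative assumptions; please check the endpoint and filter syntax against the CDSE documentation before relying on them.

```python
from datetime import date
from urllib.parse import quote

# CDSE OData catalogue endpoint (verify against the CDSE documentation).
CATALOGUE_URL = "https://catalogue.dataspace.copernicus.eu/odata/v1/Products"


def build_catalogue_query(collection: str, start: date, end: date,
                          wkt_area: str, top: int = 10) -> str:
    """Build an OData URL listing products of one collection that intersect
    an area of interest (WKT, WGS84) within a sensing-date window."""
    filters = [
        f"Collection/Name eq '{collection}'",
        f"OData.CSC.Intersects(area=geography'SRID=4326;{wkt_area}')",
        f"ContentDate/Start ge {start.isoformat()}T00:00:00.000Z",
        f"ContentDate/Start lt {end.isoformat()}T00:00:00.000Z",
    ]
    # Percent-encode spaces while keeping the OData punctuation readable.
    filter_expr = quote(" and ".join(filters), safe="/()=;,:'")
    return f"{CATALOGUE_URL}?$filter={filter_expr}&$top={top}"


if __name__ == "__main__":
    # Hypothetical area of interest: a small box around Brussels.
    area = "POLYGON((4.2 50.7,4.6 50.7,4.6 51.0,4.2 51.0,4.2 50.7))"
    print(build_catalogue_query("SENTINEL-2", date(2025, 3, 1),
                                date(2025, 3, 8), area))
```

The resulting URL can be opened in a browser or passed to any HTTP client. Searching the catalogue is expected to work without credentials, whereas downloading products requires an access token.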

Copernicus Contributing Missions (CCM):

The Copernicus Contributing Missions complement the Copernicus Sentinel missions and provide the following types of datasets:

  • Optical High Resolution (HR) and Very High Resolution (VHR) images
  • Synthetic-Aperture Radar (SAR) imagery with different regional, European and worldwide coverage.

Of particular interest in this context could be Very High Resolution mosaics and composites, e.g. VHR-IMAGE-2021 (as well as VHR imagery for the past reference years 2012, 2015 and 2018), Digital Elevation Models, e.g. Copernicus DEM, and vegetation phenology and productivity products.

Complementary Data:

Complementary data cover high-resolution satellite imagery from various providers and data offerings from different Copernicus services.

:::{Note} For a more complete picture, and detailed information on every available product, please refer to the Data Documentation. :::

3: Relevant official statistics data and services - Hackathon auxiliary datasets

The auxiliary statistical datasets will be made available for all participants via API and object storage. They will cover:

:::{Note} Sample code showing how to access and use the NUTS data on the platform is available in the NUTS data notebook. To run this notebook, you need an .s3cfg file, like this sample file, containing credentials for the source bucket.
:::
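The `.s3cfg` file mentioned in the notes of this section follows the configuration format of the `s3cmd` tool. As a minimal sketch, it could be read in Python as shown below and turned into connection settings for an S3 client. The sample values and the function name are hypothetical; the real endpoint and credentials come from the sample file provided for the hackathon.

```python
import configparser

# Hypothetical file content; real values are provided by the organisers.
SAMPLE_S3CFG = """
[default]
access_key = MY_ACCESS_KEY
secret_key = MY_SECRET_KEY
host_base = s3.example.org
host_bucket = s3.example.org
use_https = True
"""


def s3_settings_from_s3cfg(text: str, section: str = "default") -> dict:
    """Extract S3 connection settings from the text of an .s3cfg file."""
    # Interpolation is disabled because secret keys may contain '%'.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    cfg = parser[section]
    scheme = "https" if cfg.getboolean("use_https", fallback=True) else "http"
    return {
        "endpoint_url": f"{scheme}://{cfg['host_base']}",
        "aws_access_key_id": cfg["access_key"],
        "aws_secret_access_key": cfg["secret_key"],
    }


if __name__ == "__main__":
    settings = s3_settings_from_s3cfg(SAMPLE_S3CFG)
    print(settings["endpoint_url"])
```

The returned keys are chosen to match the keyword arguments of `boto3.client("s3", ...)`, so, assuming `boto3` is installed, the settings could be passed on as `boto3.client("s3", **settings)`. The platform notebooks may use a different client library.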

  • Population grid (geopackage and raster, 1km resolution)
    The population grid is a geographical dataset containing 13 population-related variables (e.g. total population and its breakdown by sex), produced by the EU Member States on a 1 km² grid for the 2021 Census. The grid uses the ETRS89-LAEA projection. For more information: https://ec.europa.eu/eurostat/web/gisco/geodata/population-distribution/geostat

:::{Note} Sample code showing how to access and use the population grid data on the platform is available in the Census data notebook. To run this notebook, you need an .s3cfg file, like this sample file, containing credentials for the source bucket.
:::
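To link your own point data to the 1 km population grid, it can help to compute the identifier of the grid cell that contains a given point. The sketch below assumes that the coordinates are already expressed in ETRS89-LAEA (EPSG:3035) metres and that cell identifiers follow the INSPIRE-style pattern `CRS3035RES1000mN<northing>E<easting>`, built from the lower-left corner of the cell. Both are assumptions, so please verify them against the identifier column of the dataset itself.

```python
import math

CELL_SIZE_M = 1000  # the population grid has a 1 km resolution


def grid_cell_id(x_m: float, y_m: float, cell_size: int = CELL_SIZE_M) -> str:
    """Return the assumed ID of the grid cell containing a point given in
    ETRS89-LAEA (EPSG:3035) coordinates, in metres."""
    # Snap the point to the lower-left corner of its cell.
    easting = math.floor(x_m / cell_size) * cell_size
    northing = math.floor(y_m / cell_size) * cell_size
    return f"CRS3035RES{cell_size}mN{northing}E{easting}"


if __name__ == "__main__":
    # Hypothetical point in EPSG:3035 coordinates.
    print(grid_cell_id(3924512.7, 3099876.2))
```

With identifiers computed this way, point observations can be aggregated per cell and joined to the census variables using an ordinary table join.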

:::{Note} Sample code showing how to access and use the reference grid data on the platform is available in the Reference grid notebook. To run this notebook, you need an .s3cfg file, like this sample file, containing credentials for the source bucket.
:::

:::{Note} Sample code showing how to access and use the GHS data on the platform is available in the GHS notebook. To run this notebook, you need an .s3cfg file, like this sample file, containing credentials for the source bucket.
:::

:::{Note} Sample code showing how to access and use the land cover data on the platform is available in the Land cover notebook. To run this notebook, you need an .s3cfg file, like this sample file, containing credentials for the source bucket.
:::

:::{Note} Sample code showing how to access and use the field polygons on the platform is available in the Field polygons notebook. To run this notebook, you need an .s3cfg file, like this sample file, containing credentials for the source bucket.
:::

:::{Note} Sample code showing how to access and use the Natura 2000 data on the platform is available in the Natura 2000 notebook. To run this notebook, you need an .s3cfg file, like this sample file, containing credentials for the source bucket.
:::

  • Climate Data Store
    The Climate Data Store provides authoritative information about the past, present and future climate in Europe and the rest of the world. For more information, see: https://cds.climate.copernicus.eu/#!/home

:::{Note} Sample code showing how to access and use data from the Climate Data Store on the platform is available in the Climate data store notebook. To run this notebook, you need a .cdsapirc file, like this sample file, containing credentials for the API.
:::
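A `.cdsapirc` file is a short text file holding the API address and a personal key as `name: value` lines. The sketch below shows one way to read and validate such a file before handing the values to a client. The sample content and the function name are hypothetical, and the actual URL and key should be taken from the sample file and your own Climate Data Store account.

```python
# Hypothetical file content; use the values from your own account.
SAMPLE_CDSAPIRC = """
url: https://cds.example.org/api
key: 00000000-0000-0000-0000-000000000000
"""


def parse_cdsapirc(text: str) -> dict:
    """Parse the 'name: value' lines of a .cdsapirc file into a dict."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Split on the first colon only, because the URL contains colons too.
        name, _, value = line.partition(":")
        settings[name.strip()] = value.strip()
    missing = {"url", "key"} - settings.keys()
    if missing:
        raise ValueError(f"missing entries in .cdsapirc: {sorted(missing)}")
    return settings


if __name__ == "__main__":
    print(parse_cdsapirc(SAMPLE_CDSAPIRC)["url"])
```

Assuming the `cdsapi` package is installed, the parsed values could be passed on as `cdsapi.Client(url=settings["url"], key=settings["key"])`. When the file is stored as `~/.cdsapirc`, the client is expected to pick it up without any explicit parsing.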

4: Own Hackathon Datasets

You can either:

  • upload the datasets to your individual CDSE account at a time of your convenience, respecting all relevant rules on size and format, or
  • upload the data for your hackathon team’s project to the customised hackathon data platform through the S3 cloud bucket during the event, again respecting the relevant file size and format requirements. Raster data can be ingested into Sentinel Hub so that you can use them programmatically, in the same way as Sentinel, Global Land Cover, etc.

:::{Important} Own datasets used in your hacking work should have (close) equivalents in other EU countries so that the project can be replicated at EU level. :::

5: How to set up a CDSE account?

Each participant has to set up their own individual CDSE account and is encouraged to explore the various datasets and standard functionalities: https://identity.dataspace.copernicus.eu/auth/realms/CDSE/protocol/openid-connect/auth?client_id=cdse-public&response_type=code&scope=openid&redirect_uri=https%3A//dataspace.copernicus.eu/account/confirmed/1

:::{Note} During registration, if you indicate that you want to access Copernicus Contributing Missions data, please choose ‘Not applicable’ in the ‘Copernicus Service Project Name’ field. :::

For more information, see also: https://dataspace.copernicus.eu/news/2024-12-9-how-and-why-use-copernicus-data-space-ecosystem-hackathons
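Once the account exists, programmatic access to CDSE services typically starts with requesting an access token from the same identity service. The following is a minimal sketch using only the Python standard library. It assumes the password grant with the public client `cdse-public` as described in the CDSE documentation, and the function names are illustrative. Accounts with two-factor authentication may need additional parameters.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Token endpoint of the CDSE identity service (verify in the CDSE docs).
TOKEN_URL = ("https://identity.dataspace.copernicus.eu/auth/realms/CDSE/"
             "protocol/openid-connect/token")


def build_token_request(username: str, password: str) -> Request:
    """Prepare the form-encoded POST request for an access token."""
    payload = urlencode({
        "client_id": "cdse-public",
        "grant_type": "password",
        "username": username,
        "password": password,
    }).encode("ascii")
    return Request(TOKEN_URL, data=payload, method="POST")


def fetch_access_token(username: str, password: str) -> str:
    """Send the request and return the access token from the JSON reply."""
    with urlopen(build_token_request(username, password), timeout=30) as reply:
        return json.load(reply)["access_token"]
```

A call such as `fetch_access_token("you@example.org", "your-password")` would return a token string to send in an `Authorization: Bearer <token>` header. Credentials are best read from environment variables rather than written into notebooks.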

6: Link to the hackathon platform documentation with sample codes for data access and processing

The computational resources available during the hackathon are described in the platform documentation at the following link:

https://eurostat.github.io/eubd2025_docs/

Footnotes

  1. For more information on the Copernicus programme, see: https://sentiwiki.copernicus.eu/web/copernicus-programme