Scope out SDG indicator database ingestion workload #5

Open
7 tasks
kzollove opened this issue Jun 20, 2023 · 3 comments

Comments

@kzollove
Collaborator

kzollove commented Jun 20, 2023

The SDG indicator database will be a very valuable resource, specifically for the IDSR / CODATA use case.

SDG has a well-documented API for accessing the SDG indicator database.
https://unstats.un.org/SDGAPI/swagger/

Scope out a plan for ingesting the SDG indicator database into Gaia data source and variable source records.

First, figure out the logical approach from the Gaia perspective:

  • What subset of countries to bring in?
  • What subset of indicators to bring in?
  • How to break down data sources (years, countries, regions, other?)

Then handle the technical side (API support in Gaia):

  • GaiaDB data source records support API ingest

Then handle the scripting for the API calls:

  • How to structure an API call to use in the data_source record?
  • How to structure an API call to get variable source information? This would be used to help generate the information for the variable_source record, but the call itself would (likely) not be stored.
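As a starting point for the two questions above, here is a minimal sketch of how a reproducible request URL might be assembled so it can be stored on a data_source record. The endpoint path (`/v1/sdg/Indicator/Data`) and the query parameter names (`indicator`, `areaCode`, `timePeriod`, `pageSize`) are assumptions that should be checked against the Swagger page linked above, and `build_url` / `data_source_url` are hypothetical helper names, not part of Gaia. The sketch only builds the URL; it makes no network call.

```python
from urllib.parse import urlencode

# Assumed base URL and endpoint path; verify both against the Swagger page.
SDG_API_BASE = "https://unstats.un.org/SDGAPI"
INDICATOR_DATA_PATH = "/v1/sdg/Indicator/Data"  # assumption, not confirmed here


def build_url(path, params):
    """Join the base URL, a path, and an ordered list of (key, value) pairs."""
    query = urlencode(params)
    return f"{SDG_API_BASE}{path}?{query}" if query else f"{SDG_API_BASE}{path}"


def data_source_url(indicator, area_codes, years, page_size=1000):
    """Build a deterministic URL for one indicator / areas / years combination.

    Area codes and years are sorted so the same scope always yields the same
    string, which matters if the URL is stored on a data_source record.
    Parameter names are assumed, as noted above.
    """
    params = [("indicator", indicator)]
    params += [("areaCode", code) for code in sorted(area_codes)]
    params += [("timePeriod", year) for year in sorted(years)]
    params.append(("pageSize", page_size))
    return build_url(INDICATOR_DATA_PATH, params)
```

A call for variable source information could reuse `build_url` with whichever metadata endpoint the API documents. That URL would only feed the drafting of the variable_source record and would not need to be stored.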

Depends on:

@rtmill
Collaborator

rtmill commented Jun 23, 2023

At a glance, this seems to be country-level at the most granular? If that is the case, do we have use cases for content that broad?

@jaygee-on-github

@rtmill, actually there are small area estimates for many of the SDGs. These estimates can go down to the regional and "traditional authority" (aka community) levels. If you are OK with this kind of granularity, maybe I can proceed with the "logical approach from gaia perspective" as outlined by @kzollove. What do you think?

@jaygee-on-github

@kzollove, @tibbben, @rtmill...

Here is a framework I propose for hosting the SDG indicators:

image

Each indicator would have its own catalog entry. The catalog entry would refer to a dataset in which the rows are regions and the columns successively disaggregate the indicator. Depending on the indicator, a disaggregation might omit the male or female columns where a breakdown by sex does not apply.
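To make the table shape concrete, here is a rough sketch of pivoting long-format observations into one row per region, with one column per successive disaggregation. The dimension names (`sex`, `age`), the column naming scheme, and the record keys (`region`, `dims`, `value`) are all hypothetical and only illustrate the idea.

```python
from collections import defaultdict

# Hypothetical order in which an indicator is successively disaggregated.
DIM_ORDER = ("sex", "age")


def column_name(dims):
    """Name a column from its disaggregation dimensions; no dimensions means the total."""
    parts = [f"{dim}={dims[dim]}" for dim in DIM_ORDER if dim in dims]
    return "|".join(parts) or "total"


def to_wide(observations):
    """Pivot long observations into {region: {column_name: value}}.

    Columns that do not apply to an indicator (for example a sex that is
    not reported) are simply absent from the row.
    """
    rows = defaultdict(dict)
    for obs in observations:
        rows[obs["region"]][column_name(obs.get("dims", {}))] = obs["value"]
    return dict(rows)
```

With placeholder values, three observations for the same region (total, `sex=F`, and `sex=F` with `age=15-49`) would end up as three columns on a single row.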

Note that in this framework a region flattens a hierarchy. Depending on granularity, a country may contain areas, which may in turn contain areas. In that case there would be a row for each country / area / area.
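The flattening could look something like the sketch below, which walks a nested region tree and emits one "country / area / area" label per row. The tree structure (`name`, `children`) and the `leaves_only` switch are assumptions for illustration. Whether rows should exist for every level or only for the most granular areas is still an open choice.

```python
def flatten_regions(node, path=(), leaves_only=False):
    """Yield one ' / '-joined label per region in a nested hierarchy.

    With leaves_only=False every level gets a row (country, area, sub-area).
    With leaves_only=True only the most granular areas get a row.
    """
    current = path + (node["name"],)
    children = node.get("children", [])
    if not (leaves_only and children):
        yield " / ".join(current)
    for child in children:
        yield from flatten_regions(child, current, leaves_only)
```

For a country with two areas, one of which has a sub-area, the default call yields four rows and `leaves_only=True` yields two.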

In the first sprint we would NOT tackle the SDMX SDG API. Instead we would prepare one or more datasets -- again, one for each indicator.

So far I haven't tried to specify the table-, column-, and cell-level metadata we might support.

@kzollove kzollove transferred this issue from OHDSI/GIS Dec 10, 2024