Add a Dataset Profile example (CO2 dataset) #84

Merged on Aug 14, 2024 (25 commits)

19 changes: 19 additions & 0 deletions dataset/README.md
@@ -0,0 +1,19 @@
# SPDX Dataset Profile Examples

This directory contains example [SPDX documents](https://spdx.dev) that
demonstrate the Dataset Profile.

## Format of examples

Directories of the form `example##` are structured as follows:

- `content/`: contains the example's content (data files, related source code,
  etc.)
- `spdx3.0/`: contains one or more SPDX documents for the example
- `README.md`: more details about the particular example
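
The layout above can be checked with a short script. This is a minimal sketch, not part of the repository: the function name `missing_entries` and the constant `EXPECTED_ENTRIES` are hypothetical, and the check only tests that each entry exists, not what it contains.

```python
from pathlib import Path

# Entries each example directory is expected to contain, per the list above.
EXPECTED_ENTRIES = ("content", "spdx3.0", "README.md")


def missing_entries(example_dir):
    """Return the expected entries that are absent from example_dir."""
    root = Path(example_dir)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]


if __name__ == "__main__":
    # Hypothetical usage: run from the directory that holds example01/.
    print(missing_entries("example01"))
```

An empty list means the directory has all three entries.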

## Examples

| ## | Data | Sources | SPDX | Comments |
|----|------|---------|------|----------|
| [01](./example01/) | 2 CSV files | - | 1 document | An example of a simple dataset in tabular format. |
33 changes: 33 additions & 0 deletions dataset/example01/README.md
@@ -0,0 +1,33 @@
# Example 01

## Description

An example of a simple dataset in tabular format.

```text
content
├── codebook.csv
└── data.csv
```

Both `codebook.csv` and `data.csv` are plain text files in CSV (comma-separated
values) format.

The file `data.csv` contains gas emission records, one for each year in a
country. Its first line is a header that defines the column names.
Each record consists mostly of numerical data, with some categorical data.

The file `codebook.csv` lists the column names from the header of
`data.csv`, together with the description, unit, and source of each column.
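
The relationship between the two files can be illustrated with a small consistency check. This is a sketch under stated assumptions: the codebook's name column is assumed to be headed `column` (adjust `name_field` if the real header differs), the function `undocumented_columns` is hypothetical, and the sample rows are made up for illustration rather than taken from the dataset.

```python
import csv
import io


def undocumented_columns(data_text, codebook_text, name_field="column"):
    """Return data.csv header names that have no entry in the codebook.

    name_field is the codebook column holding the column names; the
    default "column" is an assumption, not confirmed by this example.
    """
    header = next(csv.reader(io.StringIO(data_text)), [])
    codebook_rows = csv.DictReader(io.StringIO(codebook_text))
    documented = {row[name_field] for row in codebook_rows}
    return [name for name in header if name not in documented]


# Made-up sample content, for illustration only.
sample_data = "country,year,co2,population\nExampleland,2000,1.5,1000\n"
sample_codebook = (
    "column,description,unit,source\n"
    "country,Name of the country,,Example source\n"
    "year,Year of observation,,Example source\n"
    "co2,Annual emissions,million tonnes,Example source\n"
)

print(undocumented_columns(sample_data, sample_codebook))  # ['population']
```

To run the check against the real files, read `content/data.csv` and `content/codebook.csv` into strings and pass them in place of the samples.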

The content of this example is an excerpt of the Our World in Data CO2 and
Greenhouse Gas Emissions dataset. The full dataset is available under the
Creative Commons Attribution 4.0 International License at
<https://github.com/owid/co2-data/>.

This simplified
[Unified Modeling Language (UML)](https://en.wikipedia.org/wiki/Unified_Modeling_Language)
class diagram illustrates Example 01. Long string values are truncated and the
spdxIds are shortened (by removing the UUID suffix), for brevity.

[![A diagram of a bill of materials of Dataset Example 01](./spdx3.0/example01.png "A diagram of a bill of materials of Dataset Example 01")](./spdx3.0/example01.png)