Merged
32 commits
ade27b8
chore: update rollup, extract changelog
domoritz May 31, 2020
f2ec610
chore: move sources
domoritz May 31, 2020
f78bcfd
chore: move sources (2)
domoritz May 31, 2020
024cf5a
feat: update co2 data
domoritz May 31, 2020
660f0bc
convert birdstrikes data to CSV
domoritz Jun 1, 2020
6893dd7
rename weball26 to political-contributions
domoritz Jun 1, 2020
279591b
delete climate.json
domoritz Jun 1, 2020
97b26e5
delete sf temps and seattle tempas and replace it with seattle-weathe…
domoritz Jun 1, 2020
8893ca9
fix date formatting in seattle-weather.csv
domoritz Jun 1, 2020
3b18e6b
convert movies.json to movies.csv
domoritz Jun 1, 2020
9781fd1
update changelog and sources
domoritz Jun 1, 2020
b9145be
correct changelog
domoritz Jun 1, 2020
461209d
chore: update urls.ts
domoritz Jun 1, 2020
4a80ac9
chore: bump package version
domoritz Jun 1, 2020
fd9338e
chore: rebuild the URLs file
domoritz Jun 2, 2020
6c01b08
chore: move hourly normals data
domoritz Jun 2, 2020
067bc2a
docs: document versioning policy
domoritz Jun 2, 2020
e9c1158
docs: update weather sources
domoritz Jun 2, 2020
113a2e6
docs: encourage the use of CDN
domoritz Jun 2, 2020
d9757c8
feat: remove iris dataset
domoritz Jun 12, 2020
e8995b9
Bump to beta 2
domoritz Jun 12, 2020
e49a522
fix: move penguins data
domoritz Jun 15, 2020
dbd2c19
bump
domoritz Jun 15, 2020
84cc707
chore: rebuild
domoritz Jun 15, 2020
4bfda0a
fix: remove NA from penguins
domoritz Jun 15, 2020
6fc7a3a
feat: change palmer penguins to json
domoritz Jun 16, 2020
9991af8
chore: bump
domoritz Jun 16, 2020
c123cae
Merge branch 'master' into v2
domoritz Jun 16, 2020
af04748
feat: switch back to movies.json
domoritz Jun 16, 2020
6a3178f
chore: bump
domoritz Jun 16, 2020
77a4600
chore: rename penguins file
domoritz Jun 16, 2020
efc9d8d
chore: bump
domoritz Jun 16, 2020
138 changes: 138 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,138 @@
### Version 2.0

- Add `football.json`. Thanks to @eitanlees!
- Add `penguins.json`.
- Add `seattle-weather-hourly-normals.csv`.
- Update `weather.csv` and `seattle-weather.csv` with better-encoded weather conditions, indicating more rain. Thanks to @visnup!
- Update co2-concentration data and add seasonally adjusted CO2 field.
- Switch to ISO 8601 dates in `seattle-weather.csv`.
- Rename `weball26.json` to `political-contributions.json`.
- Convert `birdstrikes.json` to `birdstrikes.csv` and use ISO 8601 dates.
- Convert `movies.json` to use column names with spaces and ISO 8601 dates.
- Remove `climate.json`.
- Replace `seattle-temps.csv` with more general `seattle-weather-hourly-normals.csv`.
- Remove `sf-temps.csv`.
- Remove `graticule.json`. Use graticule generator instead.
- Remove `points.json`.
- Remove `iris.json`. Use `penguins.json` instead.

### Version 1.31

- Strip BOM from `windvectors.csv`.

### Version 1.30

- Update `seattle-temps` with better sourced data.
- Update `sf-temps` with better sourced data.

### Version 1.29

- Add `ohlc.json`. Thanks to @eitanlees!

### Version 1.28

- Add `annual-precip.json`. Thanks to @mattijn!

### Version 1.27

- Add `volcano.json`.

### Version 1.26

- Add `uniform-2d.json`.

### Version 1.22

- Add `windvectors.csv`. Thanks to @jwoLondon!

### Version 1.20

- Add `us-unemployment.csv`. Thanks to @palewire!

### Version 1.19

- Remove time in `weather.csv`.

### Version 1.18

- Fix typo in city name in `us-state-capitals.json`

### Version 1.17

- Made the data consistent with respect to origin by ensuring that all files originate from a Unix platform.

### Version 1.16

- Add `co2-concentration.csv`.

### Version 1.15

- Add `earthquakes.json`.

### Version 1.14

- Add `graticule.json`, London borough boundaries, borough centroids and tube (metro) rail lines.

### Version 1.13

- Add `disasters.csv` with disaster type, year and deaths.

### Version 1.12

- Add 0 padding in zipcode dataset.

### Version 1.11

- Add U district cuisine data

### Version 1.10

- Add weather data for Seattle and New York.

### Version 1.9

- Add income, zipcodes, lookup data, and a dataset with three independent geo variables.

### Version 1.8

- Remove all tabs in `github.csv` to prevent incorrect field name parsing.

### Version 1.7

* Dates in `movies.json` are all recognized as date types by datalib.
* Dates in `crimea.json` are now in ISO format (YYYY-MM-DD).

### Version 1.6

* Fix `cars.json` date format.

### Version 1.5

* Add [Gapminder Health v.s. Income](data/gapminder-health-income.csv) dataset.
* Add generated Github contributions data for punch card visualization.

### Version 1.4

* Add Anscombe's Quartet dataset.

### Version 1.3

* Change date format in weather data so that it can be parsed in all browsers. Apparently YYYY/MM/DD is fine. Can also omit hours now.

### Version 1.2

* Decode origins in cars dataset.
* Add Unemployment Across Industries in US.

### Version 1.1.1

* Fixed the date parsing on the CrossFilter datasets -- an older version of the data was copied over on initial import. A script is now available via `npm run flights N` to re-sample `N` records from the original `flights-3m.csv` dataset.

### Version 1.1

* Add `seattle-weather` dataset. Transformed with https://gist.github.com/domoritz/acb8c13d5dadeb19636c.

### Version 1.0, October 8, 2015

* Initial import from Vega and Vega-Lite.
* Change field names in `cars.json` to be more descriptive (`hp` to `Horsepower`).
141 changes: 11 additions & 130 deletions README.md
@@ -2,15 +2,22 @@

[![npm version](https://img.shields.io/npm/v/vega-datasets.svg)](https://www.npmjs.com/package/vega-datasets)
[![Build Status](https://github.com/vega/vega-datasets/workflows/Test/badge.svg)](https://github.com/vega/vega-datasets/actions)
[![](https://data.jsdelivr.com/v1/package/npm/vega-datasets/badge?style=rounded)](https://www.jsdelivr.com/package/npm/vega-datasets)

Collection of datasets used in Vega and Vega-Lite examples. This data lives at https://github.com/vega/vega-datasets.
Collection of datasets used in Vega and Vega-Lite examples. This data lives at https://github.com/vega/vega-datasets and https://cdn.jsdelivr.net/npm/vega-datasets.

Common repository for example datasets used by Vega related projects. Keep changes to this repository minimal as other projects (Vega, Vega Editor, Vega-Lite, Polestar, Voyager) use this data in their tests and for examples.

The list of sources is in [sources.md](https://github.com/vega/vega-datasets/blob/master/sources.md).
The list of sources is in [SOURCES.md](https://github.com/vega/vega-datasets/blob/master/SOURCES.md).

To access the data in Observable, you can import `vega-datasets`. Try our [example notebook](https://observablehq.com/@vega/vega-datasets). To access these datasets from Python, you can use the [Vega datasets Python package](https://github.com/jakevdp/vega_datasets). To access them from Julia, you can use the [VegaDatasets.jl Julia package](https://github.com/davidanthoff/VegaDatasets.jl).

## Versioning

We use semantic versioning. However, since this package serves datasets, we have additional rules about how we version data.

We do not change data in patch releases except to resolve formatting issues. Minor releases may update datasets, but only in ways that do not change field names or file names. Minor releases may also add datasets. Major releases may change file names and file contents, and may remove or update files.
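
As a rough illustration of the rules above, the following sketch maps kinds of dataset changes to the smallest release type that would permit them. The change labels (`fix-formatting`, `rename-field`, and so on) and the `requiredBump` helper are hypothetical names made up for this example; they are not part of this package or its release tooling.

```javascript
// Hypothetical mapping from a kind of change to the smallest release type
// that the versioning policy allows for it. The labels are illustrative only.
const BUMP_FOR_CHANGE = new Map([
  ["fix-formatting", "patch"], // patch releases only resolve formatting issues
  ["update-data", "minor"],    // data may change, field and file names may not
  ["add-dataset", "minor"],    // minor releases may add datasets
  ["rename-field", "major"],   // field name changes need a major release
  ["rename-file", "major"],    // file name changes need a major release
  ["remove-file", "major"],    // removals need a major release
]);

const RELEASE_ORDER = ["patch", "minor", "major"];

// Returns the release type needed to ship all of the given changes together.
// An empty list yields "patch", the smallest release type.
function requiredBump(changes) {
  let highest = 0;
  for (const change of changes) {
    const bump = BUMP_FOR_CHANGE.get(change);
    if (bump === undefined) {
      throw new Error(`Unknown change kind: ${change}`);
    }
    highest = Math.max(highest, RELEASE_ORDER.indexOf(bump));
  }
  return RELEASE_ORDER[highest];
}
```

For instance, a release that both adds a dataset and removes a file would need a major version under this reading of the policy, because the most disruptive change decides the release type.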

## How to use it

### NPM
@@ -45,136 +52,10 @@ console.log(cars);

### HTTP

You can also get the data directly via HTTP served by GitHub like:
You can also get the data directly via HTTP served by GitHub or jsDelivr (a fast CDN) like:

https://vega.github.io/vega-datasets/data/cars.json
https://vega.github.io/vega-datasets/data/cars.json, or with a fixed version (recommended), such as https://cdn.jsdelivr.net/npm/vega-datasets@1.31/data/cars.json.
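
A minimal sketch of loading a dataset from the CDN with a pinned version might look like the following. The helper names `datasetUrl` and `loadJson` are hypothetical and not part of this package, and the sketch assumes an environment with a global `fetch` (a modern browser or a recent Node.js release).

```javascript
// Base URL of the package on jsDelivr, taken from the example URL above.
const CDN_BASE = "https://cdn.jsdelivr.net/npm/vega-datasets";

// Hypothetical helper: builds a version-pinned URL for a file in data/.
function datasetUrl(file, version) {
  if (!version) {
    throw new Error('Pin an explicit version, for example "1.31"');
  }
  return `${CDN_BASE}@${version}/data/${file}`;
}

// Hypothetical helper: fetches and parses a JSON dataset.
// Assumes a global fetch is available.
async function loadJson(file, version) {
  const response = await fetch(datasetUrl(file, version));
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}
```

Calling `loadJson("cars.json", "1.31")` would request the same pinned URL shown above. Pinning the version means a later major release that renames or removes a file does not change what the URL returns.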

## Development process

Install dependencies with `yarn`. To make a release, create a new tagged version with `yarn version` and then push the tag. The CI will automatically make a release.

## Changelog

### Version 2.0

- Update `weather.csv` and `seattle-weather.csv` with better encoded weather condition, indicating more rain.
- Remove `graticule.json`.
- Add `football.json`.
- Add `penguins_size.csv`.

### Version 1.30

- Update `seattle-temps` with better sourced data.
- Update `sf-temps` with better sourced data.

### Version 1.29

- Add `ohlc.json`. Thanks to @eitanlees!

### Version 1.28

- Add `annual-precip.json`. Thanks to @mattijn!

### Version 1.27

- Add `volcano.json`.

### Version 1.26

- Add `uniform-2d.json`.

### Version 1.22

- Add `windvectors.csv`. Thanks to @jwoLondon!

### Version 1.20

- Add `us-unemployment.csv`. Thanks to @palewire!

### Version 1.19

- Remove time in `weather.csv`.

### Version 1.18

- Fix typo in city name in `us-state-capitals.json`

### Version 1.17

- Made the data consistent with respect to origin by ensuring that all files originate from a Unix platform.

### Version 1.16

- Add `co2-concentration.csv`.

### Version 1.15

- Add `earthquakes.json`.

### Version 1.14

- Add `graticule.json`, London borough boundaries, borough centroids and tube (metro) rail lines.

### Version 1.13

- Add `disasters.csv` with disaster type, year and deaths.

### Version 1.12

- Add 0 padding in zipcode dataset.

### Version 1.11

- Add U district cuisine data

### Version 1.10

- Add weather data for Seattle and New York.

### Version 1.9

- Add income, zipcodes, lookup data, and a dataset with three independent geo variables.

### Version 1.8

- Remove all tabs in `github.csv` to prevent incorrect field name parsing.

### Version 1.7

* Dates in `movies.json` are all recognized as date types by datalib.
* Dates in `crimea.json` are now in ISO format (YYYY-MM-DD).

### Version 1.6

* Fix `cars.json` date format.

### Version 1.5

* Add [Gapminder Health v.s. Income](data/gapminder-health-income.csv) dataset.
* Add generated Github contributions data for punch card visualization.

### Version 1.4

* Add Anscombe's Quartet dataset.

### Version 1.3

* Change date format in weather data so that it can be parsed in all browsers. Apparently YYYY/MM/DD is fine. Can also omit hours now.

### Version 1.2

* Decode origins in cars dataset.
* Add Unemployment Across Industries in US.

### Version 1.1.1

* Fixed the date parsing on the CrossFilter datasets -- an older version of the data was copied over on initial import. A script is now available via `npm run flights N` to re-sample `N` records from the original `flights-3m.csv` dataset.

### Version 1.1

* Add `seattle-weather` dataset. Transformed with https://gist.github.com/domoritz/acb8c13d5dadeb19636c.

### Version 1.0, October 8, 2015

* Initial import from Vega and Vega-Lite.
* Change field names in `cars.json` to be more descriptive (`hp` to `Horsepower`).