---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# covid19R
<!-- badges: start -->
[![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://www.tidyverse.org/lifecycle/#experimental)
[![CRAN status](https://www.r-pkg.org/badges/version/covid19R)](https://CRAN.R-project.org/package=covid19R)
[![Travis build status](https://travis-ci.org/Covid19R/covid19R.svg?branch=master)](https://travis-ci.org/Covid19R/covid19R)
<!-- badges: end -->
The goal of covid19R is to provide a single package that gives users access to all of the tidy covid-19 datasets collected by data packages that implement the covid19R tidy data standard.
To learn more about the Covid19R project, check [our extensive documentation](https://covid19r.github.io/documentation) about data standards, how to get your data added to this list, and more.
## Installation
<!--
You can install the released version of covid19R from [CRAN](https://CRAN.R-project.org) with:
``` r
install.packages("covid19R")
```
-->
You can install the development version from [GitHub](https://github.com/) with:
``` r
remotes::install_github("covid19r/covid19r")
```
## Getting the Data Information
To see what datasets are available, use `get_covid19_data_info()`:
```{r info}
library(covid19R)
library(dplyr) # provides the %>% pipe used below
data_info <- get_covid19_data_info()
head(data_info) %>% knitr::kable()
```
## Accessing data
Once you have figured out which dataset you want, you can access it with `get_covid19_dataset()`:
```{r}
library(dplyr)
nytimes_states <- get_covid19_dataset("covid19nytimes_states")
nytimes_states %>%
  filter(date == max(date)) %>%
  filter(data_type == "cases_total") %>%
  arrange(desc(value)) %>%
  head()
```
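Because each row holds one value per date, location, and data type, cumulative totals can be turned into daily increments with a grouped lag. The following is a minimal sketch on a small made-up data frame (the numbers and the `new_value` column name are illustrative assumptions, not package output), assuming `cases_total` is a cumulative count:

```r
library(dplyr)

# Toy data in the long format; the values are made up for illustration.
toy <- data.frame(
  date = as.Date(c("2020-03-01", "2020-03-02", "2020-03-03",
                   "2020-03-01", "2020-03-02", "2020-03-03")),
  location = rep(c("Washington", "New York"), each = 3),
  data_type = "cases_total",
  value = c(10, 15, 27, 1, 4, 12),
  stringsAsFactors = FALSE
)

daily <- toy %>%
  group_by(location, data_type) %>%
  arrange(date, .by_group = TRUE) %>%
  # The first date in each group has no previous row, so its
  # increment is taken to be the cumulative total itself.
  mutate(new_value = value - lag(value, default = 0)) %>%
  ungroup()

daily
```

The same pattern should apply to a real dataset returned by `get_covid19_dataset()`, provided the rows for each location and data type are cumulative.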
## The covid19R Data Standard
While many datasets have their own additional columns (e.g., Latitude, Longitude, population), all datasets have the following columns and are arranged in a long format:
* date - The date in YYYY-MM-DD form.
* location - The name of the location as provided by the data source. The counties dataset provides county and state. They are combined and separated by a `,`, and can be split with `tidyr::separate()` if you wish.
* location_type - The type of location using the covid19R controlled vocabulary. Nested locations are indicated by multiple location types being combined with a `_`.
* location_code - A standardized location code using a national or international standard. For the datasets shown here, these are FIPS state or county codes. See https://en.wikipedia.org/wiki/Federal_Information_Processing_Standard_state_code and https://en.wikipedia.org/wiki/FIPS_county_code for more information.
* location_code_type - The type of standardized location code being used according to the covid19R controlled vocabulary. Here we use `fips_code`.
* data_type - The type of data in that given row. Includes `cases_total` and `deaths_total`, cumulative measures of both.
* value - The count for the data type in that row.
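As a rough illustration of working with these columns, the sketch below splits a combined county and state `location` and spreads `data_type` into one column per measure. The data frame is made up for the example, and the exact spacing around the comma in real data is an assumption, so the separator pattern tolerates an optional space:

```r
library(dplyr)
library(tidyr)

# Toy county-level rows in the long format; all values are made up.
toy_counties <- data.frame(
  date = as.Date("2020-04-01"),
  location = rep(c("King,Washington", "Kings,New York"), each = 2),
  data_type = rep(c("cases_total", "deaths_total"), times = 2),
  value = c(2300, 150, 11000, 800),
  stringsAsFactors = FALSE
)

wide <- toy_counties %>%
  # Split "county,state" into two columns; "\\s*" allows an optional space.
  separate(location, into = c("county", "state"), sep = ",\\s*") %>%
  # One column per data type instead of one row per data type.
  pivot_wider(names_from = data_type, values_from = value)

wide
```

Keeping the data long until the last step makes filtering by `data_type` simple; widening is mainly useful for tables or for comparing measures side by side.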
## Vocabularies
The `location_type`, `location_code_type`, and `data_type` columns in the datasets and the `spatial_extent` column in the data info table all have their own controlled vocabularies. Others might be introduced as the collection of packages matures. To see the possible values of a standardized vocabulary, use `get_covid19_controlled_vocab()`:
```{r vocab}
get_covid19_controlled_vocab("location_type") %>%
  knitr::kable()
```
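A controlled vocabulary can also be used to check that a dataset only contains known terms. The sketch below hard-codes a hypothetical vocabulary and a made-up vector of observed terms, because the structure of the table returned by `get_covid19_controlled_vocab()` is not shown here; in practice the allowed terms would be taken from that table:

```r
# Hypothetical allowed terms, hard-coded for illustration only.
allowed_data_types <- c("cases_total", "deaths_total")

# Made-up terms as they might appear in a dataset's data_type column.
observed <- c("cases_total", "deaths_total", "total_cases")

# Terms present in the data but missing from the vocabulary.
unknown_terms <- setdiff(unique(observed), allowed_data_types)
unknown_terms
```

An empty result would indicate that every observed term is part of the vocabulary.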