<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>Datasets</title>
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<link href="css/bootstrap.min.css" rel="stylesheet">
<link href="css/custom.css" rel="stylesheet">
</head>
<body class="markdown github">
<header class="navbar-inverse navbar-fixed-top">
<div class="container">
<nav role="navigation">
<div class="navbar-header">
<button type="button" class="navbar-toggle" data-toggle="collapse" data-target="#bs-example-navbar-collapse-1">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<a href="index.html" class="navbar-brand">J298 Data Journalism</a>
</div> <!-- /.navbar-header -->
<!-- Collect the nav links, forms, and other content for toggling -->
<div class="collapse navbar-collapse" id="bs-example-navbar-collapse-1">
<ul class="nav navbar-nav">
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">Class notes<b class="caret"></b></a>
<ul class="dropdown-menu">
<li><a href="week1.html">What is data?</a></li>
<li><a href="week2.html">Types of stories</a></li>
<li><a href="week3.html">Working with spreadsheets</a></li>
<li><a href="week4.html">Acquiring, cleaning, and formatting data</a></li>
<li><a href="week5.html">R, RStudio, and the tidyverse</a></li>
<li><a href="week6.html">Data journalism in the tidyverse</a></li>
<li><a href="week7.html">Don't let the data lie to you</a></li>
<li><a href="week8.html">Databases and SQL</a></li>
<li><a href="week9.html">Finding stories using maps</a></li>
<li><a href="week10.html">Maps meet databases</a></li>
<li><a href="week11.html">More PostGIS</a></li>
<li><a href="week12.html">R practice</a></li>
<li><a href="week13.html">PostGIS practice</a></li>
<li><a href="week14.html">More fun with R</a></li>
</ul>
</li>
<li><a href="software.html">Software</a></li>
<li><a href="datasets.html">Data</a></li>
<li><a href="questions.html">If you get stuck</a></li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">Email instructors<b class="caret"></b></a>
<ul class="dropdown-menu">
<li><a href="mailto:[email protected]">Peter Aldhous</a></li>
<li><a href="mailto:[email protected]">Amanda Hickman</a></li>
</ul>
</li>
</ul>
</div><!-- /.navbar-collapse -->
</nav>
</div> <!-- /.container -->
</header>
<div class="container all">
<h1 id="datasets"><a name="datasets" href="#datasets"></a>Datasets</h1><p>Click on the title links to download the data. Email your instructors if you have any problems downloading.</p><h2 id="class-exercises"><a name="class-exercises" href="#class-exercises"></a>Class exercises</h2><h4 id="[download-week-1](./data/week1.zip)"><a name="[download-week-1](./data/week1.zip)" href="#[download-week-1](./data/week1.zip)"></a><a href="./data/week1.zip">Download Week 1</a></h4><ul>
<li><p><code>berkeley_collisions.csv</code> Data on injury and fatal traffic accidents in Berkeley from 2006 to 2014, from the <a href="http://tims.berkeley.edu/">Transportation Injury Mapping System</a>. The data comes from the California Highway Patrol’s <a href="http://iswitrs.chp.ca.gov/Reports/jsp/userLogin.jsp">Statewide Integrated Traffic Records System</a> and was then geocoded for mapping by UC Berkeley’s Safe Transportation Research & Education Center.</p>
</li><li><p><code>mlb_salaries_2015.csv</code> Salaries of players in Major League Baseball at the start of the 2015 season, from the <a href="http://www.seanlahman.com/baseball-archive/statistics/">Lahman Baseball Database</a>.</p>
</li></ul><h4 id="[download-week-3](./data/week3.zip)"><a name="[download-week-3](./data/week3.zip)" href="#[download-week-3](./data/week3.zip)"></a><a href="./data/week3.zip">Download Week 3</a></h4><ul>
<li><code>calpads_cohort16_alameda.csv</code> The State of California publishes <a href="https://www.cde.ca.gov/ds/sd/sd/filescohort.asp">high school graduation</a> data statewide; this file is filtered for Alameda County only.</li><li><code>USGS_2.5_month.csv</code> The USGS publishes <a href="https://earthquake.usgs.gov/earthquakes/feed/v1.0/csv.php">real-time earthquake data</a>.</li><li><code>311_Cases_Dec2017.csv</code> San Francisco’s 311 call records, from <a href="https://data.sfgov.org/City-Infrastructure/311-Cases/vw6y-z8j6">SF’s Open Data Portal</a>, filtered for cases opened between 12/01/2017 12:00:00 AM and 01/01/2018 12:00:00 AM.</li></ul><h4 id="[download-week-4](./data/week4.zip)"><a name="[download-week-4](./data/week4.zip)" href="#[download-week-4](./data/week4.zip)"></a><a href="./data/week4.zip">Download Week 4</a></h4><ul>
<li><p><code>techexports.xls</code> <a href="http://data.worldbank.org/indicator/TX.VAL.TECH.CD">High-technology exports</a> from 1990 to 2015, in current US dollars, from the UN Comtrade database, supplied via the World Bank. High-technology exports include products in aerospace, computers, pharmaceuticals, scientific instruments, and electrical machinery.</p>
</li><li><p><code>ucb_stanford_2014.csv</code> Data on federal government grants to UC Berkeley and Stanford University in 2014, downloaded from <a href="https://www.usaspending.gov/Pages/Default.aspx">USASpending.gov</a>.</p>
</li><li><p><code>alerts-actions_2017.xls</code> Records of <a href="http://www.mbc.ca.gov/Publications/Disciplinary_Actions/">disciplinary alerts issued and actions taken</a> by the Medical Board of California in 2017.</p>
</li></ul><h4 id="[download-week-5-&-6](./data/week5.zip)"><a name="[download-week-5-&-6](./data/week5.zip)" href="#[download-week-5-&-6](./data/week5.zip)"></a><a href="./data/week5.zip">Download Week 5 & 6</a></h4><ul>
<li><code>ca_discipline.csv</code> Disciplinary alerts and actions issued by the Medical Board of California from 2008 to 2017. Processed from downloads available <a href="http://www.mbc.ca.gov/Publications/Disciplinary_Actions/">here</a>.</li><li><code>ca_medicare_opioids.csv</code> Data on prescriptions of opioid drugs under the Medicare Part D Prescription Drug Program by doctors in California, from 2013 to 2015. Filtered from the national data downloads available <a href="https://www.cms.gov/Research-Statistics-Data-and-Systems/Statistics-Trends-and-Reports/Medicare-Provider-Charge-Data/Part-D-Prescriber.html">here</a>. This is the public release of the data that ProPublica obtained through FOIA requests for earlier years, for the story we discussed in Week 2.</li><li><code>npi_license.csv</code> Crosswalk file to join <a href="https://npiregistry.cms.hhs.gov/">National Provider Identifier</a> codes to state license numbers, processed from the download available <a href="http://www.nber.org/data/npi-state-license-crosswalk.html">here</a> to include license numbers potentially matching California doctors.</li></ul><h4 id="[download-week-12](./data/week12.zip)"><a name="[download-week-12](./data/week12.zip)" href="#[download-week-12](./data/week12.zip)"></a><a href="./data/week12.zip">Download Week 12</a></h4><ul>
<li><p><code>pfizer.csv</code> Payments made by Pfizer to doctors across the United States in the second half of 2009. Contains the following variables:</p>
<ul>
<li><code>org_indiv</code> Full name of the doctor, or their organization.</li><li><code>first_plus</code> Doctor’s first and middle names.</li><li><code>first_name</code> <code>last_name</code> First and last names.</li><li><code>city</code> <code>state</code> City and state.</li><li><code>category of payment</code> Type of payment. Categories include <code>Expert-led Forums</code>, in which doctors lecture their peers on using Pfizer’s drugs, and <code>Professional Advising</code>.</li><li><code>cash</code> Value of payments made in cash.</li><li><code>other</code> Value of payments made in-kind, for example purchase of meals.</li><li><code>total</code> Total value of payment, whether cash or in-kind.</li></ul>
</li><li><p><code>fda.csv</code> Data on warning letters sent to doctors by the US Food and Drug Administration because of problems in how they ran clinical trials testing experimental treatments. Contains the following variables:</p>
<ul>
<li><code>name_last</code> <code>name_first</code> <code>name_middle</code> Doctor’s last, first, and middle names.</li><li><code>issued</code> Date letter was sent.</li><li><code>office</code> Office within the FDA that sent the letter.</li></ul>
</li></ul>
</div> <!-- /.container all -->
<script src="https://code.jquery.com/jquery.min.js"></script>
<script src="js/bootstrap.min.js"></script>
</body>
</html>