File organization
There are three folders in the script/ folder:
- database_building/ includes scripts that build or impute indices based on initial input data: 1) LGU plant variety licenses 1980-2020 and 2) LGU Agricultural College research awards 2000-2020.
  - requested_data/ pulls down the input data that has been cleaned and organized by student researchers (green)
  - external_indices/ downloads or scrapes data from publicly available databases (federal_funding and ip) to create indices of federal funding for awards and for university-relevant PVP, PP, [ONE DAY ALL GRIN?] (yellow)
  - license_based_indices/ isolates data from the requested data to build two indices: one with company data, with attributes from the D&B database, and one with innovations data, where licensed innovations are matched to the IP external indices. Also inside innovations is the reading in and development of a model to classify attributes of the innovations. (blue and grey)
- merging_cleaning/ brings together licensing and company data, innovation and award data, and innovation and inventor data (orange and light green)
- analysis/ summarizes the merges so far
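
The layout above can be checked with a short script. This is a minimal sketch in Python, independent of the repository's own scripts. The stage ordering is an assumption inferred from the dependencies described above (license_based_indices/ draws on the requested data and the IP external indices, and merging and analysis follow). The names `PIPELINE` and `missing_stages` are hypothetical.

```python
from pathlib import Path

# Folder layout as described in this README. The ordering is inferred from
# the stated dependencies and is an assumption, not a documented run order.
PIPELINE = [
    "database_building/requested_data",
    "database_building/external_indices",
    "database_building/license_based_indices",
    "merging_cleaning",
    "analysis",
]


def missing_stages(script_root):
    """Return the pipeline folders that do not exist under script_root."""
    root = Path(script_root)
    return [stage for stage in PIPELINE if not (root / stage).is_dir()]


if __name__ == "__main__":
    # "script" is the top-level folder named in this README.
    for stage in missing_stages("script"):
        print(f"missing: {stage}")
```

Run from the repository root, this prints one line per folder that is absent and nothing if the layout matches.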
All data are stored privately on Box. Data sources include:
- Money: Federal money is downloaded from USDA NIFA's recent awards database, and LGU awards were requested from the 25 LGUs that provided licensing data in 2022
- IP: The PVP database is downloaded from USDA PVPO as an Excel spreadsheet; US patent data was downloaded and filtered using the patentR and patentsview packages, which wrap the PatentsView API; the OSSI database was scraped from the OSSI webpage, and the Wayback Machine was used to check archived versions and make sure nothing was missing
- License: Licensing data, read in from Google Drive, was requested from all 50 LGUs between 2021 and 2022
- Awards: Awards data, read in from Google Drive, was requested from all 50 LGUs between 2021 and 2022
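
The completeness check mentioned for the OSSI scrape (comparing the current page against archived versions) might look something like the sketch below. How the check was actually done is not described here, so this is only an illustration in Python: the function names and the example entries are hypothetical, and matching on normalized names is an assumption.

```python
def normalize(name):
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return " ".join(name.lower().split())


def missing_from_current(current, archived):
    """Return archived entries that are absent from the current scrape, sorted."""
    seen = {normalize(n) for n in current}
    return sorted({normalize(n) for n in archived} - seen)


# Hypothetical entry names, for illustration only.
current = ["Variety A", "Variety  B"]
archived = ["variety a", "Variety B", "Variety C"]
print(missing_from_current(current, archived))  # ['variety c']
```

An empty result would suggest that every archived entry also appears in the current scrape.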
Contact
Liza Wood
belwood[at]ucdavis.edu