- Benjamin Kosko, [email protected], bkosko
- Victor Lin, [email protected], victorHLin
- Chun-Fu Yeh, [email protected], YehCF
- Sarah Payne, [email protected], paynesa
- chart.js 3.6.1
- cors 2.8.5
- express 4.17.1
- mysql 2.18.1
- node-fetch 3.0.0
- nodemon 2.0.12
- supertest 6.1.6
- jest 27.1.0
- @ant-design/charts 1.2.14
- @fortawesome/fontawesome-svg-core 1.2.36
- @fortawesome/free-solid-svg-icons 5.15.4
- @fortawesome/react-fontawesome 0.1.16
- @testing-library/jest-dom 5.14.1
- @testing-library/react 11.2.7
- @testing-library/user-event 12.8.3
- antd 4.16.13
- antd-button-color 1.0.4
- bootstrap 5.1.3
- canvasjs 1.8.3
- chart.js 3.6.1
- colormap 2.3.2
- d3 7.1.1
- d3-format 3.0.1
- datamaps 0.5.9
- font-awesome 4.7.0
- query-string 7.0.1
- react 17.0.2
- react-bootstrap 2.0.3
- react-chartjs-2 4.0.0
- react-d3-library 1.1.8
- react-dom 17.0.2
- react-loading 2.0.3
- react-promise-tracker 2.1.0
- react-router-dom 5.3.0
- react-scripts 4.0.3
- react-usa-map 1.5.0
- react-vis 1.11.7
- reactstrap 9.0.1
- shards-react 1.0.3
- web-vitals 1.1.2
- R: tidyverse and dplyr
Open two terminal windows. In one, type the following commands:
cd server
npm install
npm start
In the other, type:
cd client
npm install
npm start
In a few moments, the server should be running and a browser window should pop up. If no window pops up, open your
browser and go to http://localhost:3000/.
Elections Data Wrangling
Place preprocess_voting.R and 1976-2020-senate.csv in the same directory (both are in the preprocessing/voting_preprocessing
directory by default). Then either execute the R script on the command line or open preprocess_voting.R in RStudio, set the session's working
directory to the source file location, and execute the entire script.
Stock Data Wrangling
Run stock_preprocess.ipynb to preprocess the original table downloaded from https://www.kaggle.com/shannanl/sp500-dataset?select=sp500+agg.csv. The notebook produces the preprocessed stock table.
COVID/Vaccine Data Wrangling
In DataGrip, we replaced all slashes with hyphens so that the dates in both files follow the MM-DD-YYYY format. The vaccination data also contained six clearly erroneous negative values; we found them by sorting by case numbers and replaced each with its absolute value.
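The same two fixes can also be scripted instead of applied by hand in DataGrip. The following is a minimal sketch, not the procedure we actually ran: the column names `date` and `daily_vaccinations` and the sample rows are hypothetical, and it takes the absolute value of every negative count, whereas we corrected only the six values we had inspected.

```python
import csv
import io


def normalize_date(value):
    """Replace slashes with hyphens, e.g. 03/14/2021 -> 03-14-2021."""
    return value.replace("/", "-")


def clean_rows(rows, date_field, count_field):
    """Return copies of the rows with hyphenated dates and non-negative counts."""
    cleaned = []
    for row in rows:
        fixed = dict(row)
        fixed[date_field] = normalize_date(fixed[date_field])
        # Assumption: every negative count is a sign error, so abs() is safe.
        fixed[count_field] = str(abs(int(fixed[count_field])))
        cleaned.append(fixed)
    return cleaned


# Hypothetical sample; the real files have different columns and many more rows.
SAMPLE = "date,daily_vaccinations\n03/14/2021,1200\n03/15/2021,-45\n"
rows = clean_rows(csv.DictReader(io.StringIO(SAMPLE)), "date", "daily_vaccinations")
```

Before applying this to the real data, it is worth listing the negative rows first and confirming that each one is a sign error rather than a legitimate correction to a running total.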
Yelp Data Wrangling
Get yelp_academic_dataset_business.json, yelp_academic_dataset_user.json and yelp_academic_dataset_review.json from https://www.yelp.com/dataset.
To wrangle yelp_academic_dataset_review.json and yelp_academic_dataset_user.json, first run chunk.sh. It splits the original file into smaller files to speed up wrangling.
Usage of chunk.sh:
./chunk.sh {your_file_name}
When prompted, enter the number of rows to store in each file. After chunking, put all chunk files in a single directory and set the path variable in yelp_review.py and yelp_user.py to that directory. Then run yelp_review.py to create the CSV file for the Review table and yelp_user.py to create the CSV file for the User table.
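For readers who want to see the chunk-then-convert flow in one place, here is a rough Python sketch of the idea. It is not the code in chunk.sh, yelp_review.py, or yelp_user.py: the function names `chunk_file` and `chunks_to_csv`, the chunk file naming, and the choice of output fields are all assumptions made for illustration. It relies on the Yelp dataset files storing one JSON object per line.

```python
import csv
import json
from itertools import islice
from pathlib import Path


def chunk_file(source, out_dir, rows_per_file):
    """Split a line-delimited JSON file into files of at most rows_per_file lines."""
    source, out_dir = Path(source), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    chunks = []
    with source.open(encoding="utf-8") as src:
        index = 0
        while True:
            lines = list(islice(src, rows_per_file))  # read the next batch of lines
            if not lines:
                break
            chunk_path = out_dir / f"{source.stem}_{index:04d}.json"
            chunk_path.write_text("".join(lines), encoding="utf-8")
            chunks.append(chunk_path)
            index += 1
    return chunks


def chunks_to_csv(chunks, csv_path, fields):
    """Write the selected fields of every JSON line in the chunks to one CSV file."""
    with Path(csv_path).open("w", newline="", encoding="utf-8") as out:
        # extrasaction="ignore" drops JSON keys that are not in `fields`.
        writer = csv.DictWriter(out, fieldnames=fields, extrasaction="ignore")
        writer.writeheader()
        for chunk in chunks:
            with Path(chunk).open(encoding="utf-8") as handle:
                for line in handle:
                    if line.strip():
                        writer.writerow(json.loads(line))
```

A call such as `chunks_to_csv(chunk_file("yelp_academic_dataset_review.json", "chunks", 100000), "review.csv", ["review_id", "user_id", "business_id", "stars"])` would produce a single CSV; the field list here is only an example and should match the columns of your Review table.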
To wrangle yelp_academic_dataset_business.json, you only need to set the file path in both yelp_business.py and yelp_categories.py. yelp_business.py creates the CSV file for the Business table. yelp_categories.py creates two CSV files, one for the Categories table and one for the Business_Categories table.
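The split into a Categories table and a Business_Categories table can be pictured with the sketch below. It is an illustration under assumptions, not the logic of yelp_categories.py: it assumes each business record carries its categories as one comma-separated string (possibly null), and the function name `split_categories`, the integer category ids, and the tuple layouts are hypothetical.

```python
def split_categories(businesses):
    """Build (category_id, name) rows and (business_id, category_id) link rows."""
    category_ids = {}  # category name -> assigned id, in first-seen order
    links = []
    for business in businesses:
        raw = business.get("categories") or ""  # treat a null value as no categories
        for name in (part.strip() for part in raw.split(",")):
            if not name:
                continue
            # Reuse the id if the category was seen before, else assign the next one.
            category_id = category_ids.setdefault(name, len(category_ids) + 1)
            links.append((business["business_id"], category_id))
    categories = [(cid, name) for name, cid in category_ids.items()]
    return categories, links


# Hypothetical records standing in for lines of yelp_academic_dataset_business.json.
sample = [
    {"business_id": "b1", "categories": "Mexican, Burgers"},
    {"business_id": "b2", "categories": None},
    {"business_id": "b3", "categories": "Burgers"},
]
categories, links = split_categories(sample)
```

Keeping the link rows separate from the category names is what lets one business belong to several categories without repeating the category text in every row.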