2017 NHTS Data Competition

Improving Commute Time with Data Analysis

Authors: Bonny Nyaga, P.E. and Walter Yu, P.E.

Abstract

The average commute time within each U.S. census division has a large impact on its economy, productivity, infrastructure and environment. Longer commutes lead to lost wages for workers, additional wear on highway infrastructure and greater environmental impacts. This study therefore uses the NHTS dataset to evaluate commute patterns and to assess whether public transportation or additional transportation planning could reduce commute times.

Introduction

This study outlines the data, methods and results used to identify commute patterns. Specifically, it seeks to answer the following questions:

  1. Which census divisions have the most trips per household?
  2. What are the average commute distance and time within those divisions?
  3. Could public transportation or transportation planning reduce commute times?
  4. What are some recommendations for improving commute times based on demographic data?

NHTS Data Analysis

This study analyzes the households, trips and vehicles tables of the NHTS dataset to evaluate commute trends by census division, focusing on average commute distance and time.
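As a rough illustration of this kind of per-division summary, the sketch below groups a small, made-up trips table with Pandas. It is not taken from the notebook: the column names (`HOUSEID`, `CENSUS_D`, `TRPMILES`, `TRVLCMIN`) are assumed to match the 2017 NHTS public-use file and should be checked against the codebook, the function name `summarize_by_division` is hypothetical, and the values are invented. It also does not filter for commute trips or join the households and vehicles tables.

```python
import pandas as pd

# Toy stand-in for the NHTS trips table. All values are invented for
# illustration; column names are assumptions based on the 2017 NHTS files.
trips = pd.DataFrame({
    "HOUSEID": [1, 1, 2, 3, 3, 3],       # household identifier
    "CENSUS_D": [9, 9, 9, 5, 5, 5],      # census division code
    "TRPMILES": [10.0, 12.0, 8.0, 5.0, 7.0, 6.0],  # trip distance (miles)
    "TRVLCMIN": [20, 30, 25, 15, 20, 10],          # trip duration (minutes)
})


def summarize_by_division(trips: pd.DataFrame) -> pd.DataFrame:
    """Return trips per household and mean trip distance/time per division."""
    summary = trips.groupby("CENSUS_D").agg(
        households=("HOUSEID", "nunique"),  # distinct households with a trip
        trips=("HOUSEID", "count"),         # number of trip records
        mean_miles=("TRPMILES", "mean"),
        mean_minutes=("TRVLCMIN", "mean"),
    )
    summary["trips_per_household"] = summary["trips"] / summary["households"]
    # Divisions with the most trips per household come first.
    return summary.sort_values("trips_per_household", ascending=False)


summary = summarize_by_division(trips)
print(summary)
```

Because the household count here comes from the trips table alone, households that reported no trips are excluded; counting households from the households table instead would lower the trips-per-household figures.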

Tools and Process

The tools and process listed below were used to analyze data and provide recommendations:

  1. Jupyter Notebook - Exploratory data analysis and visualization were completed in this notebook.
  2. Python Modules - The modules listed below must be installed to run this notebook:
     • Pandas
     • NumPy
     • SciPy
     • Seaborn
     • Matplotlib
     • StatsModels
     • Scikit-Learn
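
Before opening the notebook, it can help to confirm that these modules are importable. The snippet below is a minimal sketch rather than part of the repository: `REQUIRED` and `missing_modules` are hypothetical names, and the mapping uses each package's usual import name (for example, Scikit-Learn is imported as `sklearn`).

```python
from importlib.util import find_spec

# Display name -> import name for the modules listed above.
REQUIRED = {
    "Pandas": "pandas",
    "NumPy": "numpy",
    "SciPy": "scipy",
    "Seaborn": "seaborn",
    "Matplotlib": "matplotlib",
    "StatsModels": "statsmodels",
    "Scikit-Learn": "sklearn",
}


def missing_modules(required=REQUIRED):
    """Return display names of modules that cannot be found for import."""
    return [
        name
        for name, import_name in required.items()
        if find_spec(import_name) is None  # None means the module is not installed
    ]


missing = missing_modules()
print("Missing modules:", ", ".join(missing) if missing else "none")
```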

Installation

Clone the repository, install the Python modules listed above, then open the notebook in Jupyter Notebook.