Skip to content

yangus1265/assignment

Repository files navigation

README

This repo contains the following:

  • assignment_1.ipynb: Jupyter notebook containing the data exploration, ETL, and modeling process related to problem 1 of the assignment.
  • assignment_2.ipynb: Jupyter notebook containing the pandas and SQL answers to question 2 of the assignment.
  • final_prediction.pickle: the modeling output that is used to create the predictions from problem 1.
  • run.sh: the shell script that you can run in order to get a prediction.

The process is run by the following:

                              ./run.sh input.csv output.csv

input.csv is the path of the input to the model. The model was trained from the following inputs:

                               |  Variables                |
                               |---------------------------|
                               |  Ticket number            |
                               |  Issue Date               |
                               |  Issue time               |
                               |  Meter Id                 |
                               |  Marked Tim               |
                               |  RP State Plate           |
                               |  Plate Expiry Date        |
                               |  VIN                      |
                               |  Make                     |
                               |  Body Style               |
                               |  Color                    |
                               |  Location                 |
                               |  Route                    |
                               |  Agency                   |
                               |  Violation code           |
                               |  Violation Description    |
                               |  Fine amount              |
                               |  Latitude                 |
                               |  Longitude                |

output.csv is the path of the output of the model. The model will output the following:

                               | not_top_25 |  top_25   |
                               | probability|probability|

the probability ranging from 0 to 1. An actual input sample and output sample is included: input.csv, output.csv

TODO:

  • create flask server allow for localized API call

About

No description or website provided.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published