Task Checklist

  • Get the data
  • Modify the data for analysis
  • Produce basic plots/graphs for understanding the data
  • Apply simple k-means methods
  • Apply hierarchical clustering
  • Apply expectation maximization (EM)
  • Apply a new method based on research
  • Test and Compare methods
  • Produce visualizations
  • Create Final Report
  • Create presentation

Project Plan

Week of 6/7

  • Change topic from sentiment analysis to stock market clustering and predictions.
  • Create informal project proposal.

Week of 6/14

  • Determine as a group which stocks we would like to perform our analysis on. Currently, we plan to analyze S&P 500 stocks.
  • Determine as a group what time periods we would like to look at in order to avoid outlier years.
  • Get all group members familiar with scikit-learn and R through individual exploration.
  • Gather all data for the chosen stocks and convert it into the format needed for analysis.

Week of 6/21

  • Produce visualization graphics using dummy data
  • Create and test models built with different algorithms

Week of 6/28

  • Create a visualization that demonstrates our results
  • If we get good results, experiment with predicting a stock's price from related stocks. This would be a form of supervised learning: restructure the data so that the prices of the other stocks in a cluster are the inputs and the price of our target stock is the value to predict.
  • Begin work on the project progress report.
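The prediction idea for this week could be sketched as below. This is only an illustration under stated assumptions: the tickers and numbers are made up, daily returns are used instead of raw prices, the cluster peers are collapsed into a single average, and a one-variable least-squares line stands in for whatever supervised model we end up choosing.

```python
def peer_average(peer_series):
    """Average the peers' values day by day into one input series."""
    return [sum(day) / len(day) for day in zip(*peer_series)]


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical daily returns for two peers from the same cluster.
peers = {
    "BBB": [0.010, -0.004, 0.006, 0.002],
    "CCC": [0.012, -0.006, 0.008, 0.000],
}
# Hypothetical daily returns of the stock we want to predict.
target = [0.011, -0.005, 0.008, 0.001]

inputs = peer_average(list(peers.values()))
slope, intercept = fit_line(inputs, target)

# Predict the target's return for a new day from its peers' returns.
new_day = peer_average([[0.004], [0.006]])[0]
prediction = slope * new_day + intercept
print(f"predicted return: {prediction:.4f}")
```

Averaging the peers keeps the sketch to one input variable; a real model would more likely use each peer as its own feature.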

Week of 7/5

  • Finish project progress report.
  • Attempt to use alternative algorithms to cluster the data.
  • Begin work on final project report.

Week of 7/12

  • Finish final project report.
  • Begin working on the project presentation.

Individual Tasks

Task 1: Gathering data using R or Python (everyone)

  • Gather data using R or Python
  • Save the data in the correct file formats for future analysis

Task 2: Determining the most important attributes to use and which machine learning techniques to implement (in short, data manipulation)

  • Analyze the importance of each attribute
  • Add or remove attributes
  • Determine which types of algorithms would work best

Task 3: Generating and testing models.

  • Design and create the optimal models using basic and advanced algorithms:
      • Support Vector Machines
      • K-Nearest Neighbor
      • Expectation Maximization
      • Density-Based Clustering
  • Test the methods on the data
  • Modify and optimize the methods based on the testing
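As one concrete instance of a basic method from the checklist, the sketch below clusters stocks with k-means on their daily returns. It is a minimal illustration, not our final pipeline: the tickers and prices are invented, the use of daily returns as features is an assumption, and the algorithm is written in plain Python so the example is self-contained (in practice we would more likely call scikit-learn's `KMeans`).

```python
def daily_returns(prices):
    """Convert a price series into simple day-over-day returns."""
    return [(b - a) / a for a, b in zip(prices, prices[1:])]


def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def initial_centroids(points, k):
    """Deterministic farthest-point seeding (an assumption; random seeding is also common)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(
            points, key=lambda p: min(squared_distance(p, c) for c in centroids)
        )
        centroids.append(list(farthest))
    return centroids


def kmeans(points, k, max_iterations=100):
    """Lloyd's algorithm: returns one cluster label per input point."""
    centroids = initial_centroids(points, k)
    labels = None
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Made-up closing prices for four hypothetical tickers.
prices = {
    "AAA": [100.0, 102.0, 101.0, 103.0, 102.0, 104.0],
    "BBB": [50.0, 51.0, 50.6, 51.5, 51.1, 52.0],
    "CCC": [100.0, 98.0, 99.0, 97.0, 98.0, 96.0],
    "DDD": [20.0, 19.6, 19.8, 19.4, 19.6, 19.2],
}
tickers = list(prices)
features = [daily_returns(prices[t]) for t in tickers]
clusters = dict(zip(tickers, kmeans(features, k=2)))
print(clusters)
```

Working with returns rather than raw prices keeps a $20 stock and a $100 stock comparable, which is why the sketch converts prices first.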

Task 4: Visualizing results

  • Finding trends in the data results
  • Creating charts and graphs to visualize the trends
  • Creating a network structure to represent similarities between different stocks
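One possible way to build such a network is to link two stocks whenever their returns are strongly correlated. The sketch below assumes Pearson correlation as the similarity measure and a cutoff of 0.8; both are illustrative choices, and the tickers and returns are made up.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def similarity_network(returns_by_ticker, threshold=0.8):
    """Adjacency sets: an edge joins tickers whose correlation meets the threshold."""
    graph = {ticker: set() for ticker in returns_by_ticker}
    for a, b in combinations(returns_by_ticker, 2):
        if pearson(returns_by_ticker[a], returns_by_ticker[b]) >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


# Hypothetical daily returns for three made-up tickers.
returns_by_ticker = {
    "AAA": [0.020, -0.010, 0.020, -0.010, 0.020],
    "BBB": [0.018, -0.008, 0.021, -0.012, 0.019],
    "CCC": [-0.020, 0.010, -0.020, 0.010, -0.020],
}
graph = similarity_network(returns_by_ticker)
for ticker, neighbors in graph.items():
    print(ticker, sorted(neighbors))
```

The adjacency sets can be handed to any graph-drawing tool for the actual visualization.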

Task 5: Writing the final report

  • Combine the visual results along with the concluding ideas to form a final report

Task 6: Create the Presentation

  • Use charts and graphs to present the trends found in our data and analysis results