The objective of this project was to apply the techniques learned throughout the Statistical Data Analysis course to extract information from data. The data I used is a multivariate time series. This report discusses how I approached the data and the methods used during the analysis and prediction phases. I used a normality test to check for normality, the Durbin-Watson test to check for autocorrelation in the residuals, the Granger causality test to check whether one variable helps predict another, and the Augmented Dickey-Fuller test to check for stationarity. Furthermore, I examined the features of the data using time series analysis and applied models such as VAR (estimated with OLS) and ARIMA to predict future values.
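To make one of these diagnostics concrete, here is a minimal sketch of the Durbin-Watson statistic in plain Python. It is not the code used in the project. The function name and the two toy residual series are illustrative assumptions. The statistic is the sum of squared successive differences of the residuals divided by their sum of squares. Values near 2 suggest no first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for a sequence of regression residuals."""
    # Numerator: squared differences between consecutive residuals.
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    # Denominator: sum of squared residuals.
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


# Toy residuals (illustrative only, not project data).
# Alternating signs: strong negative autocorrelation, statistic close to 4.
alternating = [1.0, -1.0] * 50
# Steady drift: strong positive autocorrelation, statistic close to 0.
trending = [float(i - 50) for i in range(101)]

dw_alternating = durbin_watson(alternating)
dw_trending = durbin_watson(trending)
print(dw_alternating, dw_trending)
```

In practice the residuals would come from a fitted model, such as each equation of the VAR, rather than from hand-made lists.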
The approach consisted of the following steps:
- Perform basic preprocessing, such as adding column names and converting the epochs to DateTime format
- Drop the columns that add no useful information to the data (such as experiment)
- Split the data into training and validation sets
- Identify missing values and verify the quality of the data
- Determine which modeling approaches are likely to yield a predictive function
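The data-preparation steps above can be sketched in plain Python as follows. This is a rough illustration under stated assumptions, not the project's actual code. The column names other than experiment, the sample rows, the helper names, the epochs being in seconds, and the 80/20 chronological split are all hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical raw rows: (epoch seconds, sensor reading, experiment id).
# Column names and values are illustrative, not taken from the project data.
COLUMNS = ["epoch", "reading", "experiment"]
raw_rows = [
    (1600000000, 0.51, 1),
    (1600000060, None, 1),
    (1600000120, 0.58, 1),
    (1600000180, 0.61, 1),
    (1600000240, 0.64, 1),
]


def preprocess(rows, columns, drop=("experiment",)):
    """Name the columns, convert epochs to datetimes, drop unused columns."""
    records = []
    for row in rows:
        record = dict(zip(columns, row))
        # Assumes epochs are in seconds; replaces "epoch" with "timestamp".
        record["timestamp"] = datetime.fromtimestamp(
            record.pop("epoch"), tz=timezone.utc
        )
        for name in drop:
            record.pop(name, None)
        records.append(record)
    return records


def count_missing(records):
    """Count missing (None) values per column."""
    counts = {}
    for record in records:
        for name, value in record.items():
            counts[name] = counts.get(name, 0) + (value is None)
    return counts


def chronological_split(records, train_fraction=0.8):
    """Split without shuffling, so validation rows are later than training rows."""
    ordered = sorted(records, key=lambda r: r["timestamp"])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


records = preprocess(raw_rows, COLUMNS)
missing = count_missing(records)
train, valid = chronological_split(records)
print(missing, len(train), len(valid))
```

Splitting by time rather than at random is a common choice for time series, because it keeps future observations out of the training set.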