Comparing Deep Learning and Statistical Models for Stock Price Prediction

This project explores stock price prediction using time series analysis, comparing the performance of Long Short-Term Memory (LSTM), a deep learning model, with AutoRegressive Integrated Moving Average (ARIMA), a statistical model. The primary focus is on forecasting HP Inc.'s future stock prices based on historical data. The objective is to highlight the strengths and weaknesses of both approaches in predicting stock trends.

Introduction

Predicting stock prices is a challenging task due to the inherent complexity and volatility of financial markets. Traditional statistical methods like ARIMA have been widely used for time series forecasting. However, with the advent of deep learning, models like LSTM have shown promise in capturing long-term dependencies and non-linear patterns in data.

In this project, we compare these two models—ARIMA and LSTM—to determine which approach offers better performance in predicting the stock prices of HP Inc. The analysis provides insights into the predictive power and limitations of each model.

Dataset

The dataset contains HP Inc.'s historical stock prices and is available in the Dataset folder of this repository as HPQ.csv.

Dataset Features:

Date
Open
High
Low
Close
Volume

Methodology

Data Preprocessing

Loading Data: The dataset is read using Pandas and checked for missing values and anomalies.
Data Cleaning:
- Handled missing values.
- Converted date columns to datetime objects.
Feature Selection: Focused on the 'Close' price for prediction as it represents the final price at which the stock is traded on a given day.
Normalization:
- For LSTM, data was normalized using MinMaxScaler to improve model performance.
- ARIMA, being a statistical model, required stationarity checks and differencing to remove trend and stabilize the mean.
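
The preprocessing steps above could look roughly like the following. This is a minimal sketch, not the notebook's actual code: the in-memory CSV stands in for HPQ.csv with made-up values, forward-filling is only one possible way to handle missing values, and the 80/20 chronological split is an assumption.

```python
import io

import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Tiny in-memory stand-in for HPQ.csv (values are made up for illustration).
csv_text = """Date,Open,High,Low,Close,Volume
2020-01-02,20.0,20.5,19.8,20.4,1000
2020-01-03,20.4,20.9,20.1,,1200
2020-01-06,20.6,21.0,20.3,20.8,900
2020-01-07,20.8,21.2,20.5,21.1,1100
2020-01-08,21.1,21.4,20.9,21.0,1050
"""

# Parse the Date column as datetime and use it as a sorted index.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])
df = df.sort_values("Date").set_index("Date")

# Assumed cleaning strategy: carry the last known close forward over gaps.
df["Close"] = df["Close"].ffill()

close = df[["Close"]]            # keep a 2-D shape for scikit-learn
split = int(len(close) * 0.8)    # assumed 80/20 chronological split
train, test = close.iloc[:split], close.iloc[split:]

# Fit the scaler on the training portion only, then reuse it on the test portion.
scaler = MinMaxScaler(feature_range=(0, 1))
train_scaled = scaler.fit_transform(train)
test_scaled = scaler.transform(test)
```

Fitting the scaler on the training rows only is a design choice that keeps information from the test period out of the training inputs; scaled test values may then fall slightly outside the 0 to 1 range.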

ARIMA Model

Stationarity Check:
- Used Augmented Dickey-Fuller (ADF) test to check for stationarity.
- Applied differencing to achieve stationarity if needed.
Model Fitting:
- Selected the p and q orders from the PACF and ACF plots, respectively, with d set by the number of differences needed to reach stationarity.
- Fitted the ARIMA model on the training data.
Forecasting:
- Generated forecasts and compared them against the actual values.

LSTM Model

Data Preparation:
- Created sequences of past stock prices to predict the next price.
- Split the data into training and testing sets.
Model Architecture:
- Used an LSTM layer followed by Dense layers.
- Configured with appropriate loss functions and optimizers.
Training:
- Trained the model on the training set and validated on the testing set.
Prediction:
- Made predictions and inverse transformed the results to the original scale.
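
The sequence-creation and splitting steps can be sketched with NumPy alone. The helper name `make_sequences`, the window length of 3, and the 75/25 split are hypothetical choices for illustration, and the evenly spaced input stands in for the MinMax-scaled closing prices.

```python
import numpy as np


def make_sequences(series, window):
    """Return (X, y): each X[i] holds `window` past values, y[i] is the next value."""
    series = np.asarray(series, dtype=float)
    X = np.stack([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    # Add a trailing feature axis: (samples, timesteps, features=1).
    return X[..., np.newaxis], y


# Stand-in for the scaled 'Close' series (illustrative values only).
scaled = np.linspace(0.0, 1.0, 11)
X, y = make_sequences(scaled, window=3)

# Chronological split: earlier windows for training, later ones for testing.
split = int(len(X) * 0.75)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

print(X.shape, y.shape)
```

The three-dimensional shape (samples, timesteps, features) is the input layout a Keras LSTM layer expects. Predictions made on scaled data can be mapped back to prices with the fitted scaler's `inverse_transform`.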

Evaluation Metrics

Both models were evaluated using the following metrics:

Mean Absolute Error (MAE): Measures the average magnitude of errors in a set of predictions.
Root Mean Squared Error (RMSE): Penalizes larger errors more significantly than MAE.
R-squared (R2 Score): Indicates the proportion of variance in the actual prices that is explained by the model's predictions.
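
For reference, the three metrics can be computed directly from their definitions. The sketch below uses NumPy and made-up numbers, not results from this project.

```python
import numpy as np


def mae(y_true, y_pred):
    """Mean absolute error: average size of the errors, in price units."""
    return float(np.mean(np.abs(y_true - y_pred)))


def rmse(y_true, y_pred):
    """Root mean squared error: squaring weights large errors more heavily."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def r2(y_true, y_pred):
    """R-squared: 1 minus residual sum of squares over total sum of squares."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Made-up actual and predicted prices for illustration.
y_true = np.array([10.0, 12.0, 14.0, 16.0])
y_pred = np.array([11.0, 12.0, 13.0, 18.0])

print(mae(y_true, y_pred))   # 1.0
print(rmse(y_true, y_pred))  # about 1.2247
print(r2(y_true, y_pred))    # about 0.7
```

scikit-learn provides `mean_absolute_error`, `mean_squared_error`, and `r2_score` in `sklearn.metrics` for the same calculations. Note that R-squared can be negative when a model predicts worse than the mean of the actual values.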

Results and Discussion

ARIMA Model:
- Strengths: Simplicity, interpretability, and good performance on linear data.
- Weaknesses: Struggles with non-linear patterns and requires stationarity.
LSTM Model:
- Strengths: Captures complex, non-linear relationships and long-term dependencies.
- Weaknesses: Requires more computational resources and time to train.

Performance Comparison:

The LSTM model showed superior performance in capturing non-linear trends and provided better predictive accuracy compared to the ARIMA model.
However, ARIMA's simplicity and faster computation make it suitable for quick, linear trend analysis.

Conclusion

This project demonstrates the comparative strengths and weaknesses of LSTM and ARIMA models in stock price prediction. While LSTM outperforms ARIMA in terms of predictive accuracy, ARIMA's simplicity and interpretability make it a viable option for certain applications. The choice of model depends on the specific requirements of the forecasting task.

How to Run

Clone the Repository:

git clone https://github.com/mohiuddin-khan-shiam/Comparing-Deep-Learning-and-Statistical-Models-for-Stock-Price-Prediction.git
cd Comparing-Deep-Learning-and-Statistical-Models-for-Stock-Price-Prediction

Install Dependencies:
```
pip install -r requirements.txt
```
Run the Notebook: Open LSTM_vs_Arima.ipynb in Jupyter Notebook or Google Colab and run all cells.

Dependencies

Python 3.x
numpy
pandas
matplotlib
statsmodels
scikit-learn
tensorflow/keras

License

This project is licensed under the MIT License.

For any questions or feedback, feel free to reach out via GitHub Issues.
