This repository contains a Jupyter notebook that implements and tunes several machine learning models on a dataset. The notebook covers the following models:

- Linear Regression Model: Predicts a continuous outcome variable (the dependent variable) from one or more predictor variables (the independent variables).
- Linear Regression Model Optimized using RFE (Recursive Feature Elimination): RFE is a feature selection method that repeatedly fits the model and removes the weakest feature (or features) until the specified number of features is reached.
- Linear Regression Model Optimized using SVR (Support Vector Regression): SVR applies the principles of Support Vector Machines to a regression problem. It reuses the margin concept: predictions that fall within a tolerance band around the target are not penalized.
- Random Forest: A versatile method that handles both regression and classification tasks. It is an ensemble learning method in which many decision trees are trained on random subsets of the data and their predictions are combined.
- Random Forest Optimized using Grid Search: Grid Search evaluates a predefined grid of hyperparameter values for the Random Forest model and keeps the best-scoring combination.
- k-Nearest Neighbors (k-NN): A simple and intuitive model that predicts the target of a new instance from the targets of its 'k' closest instances in the feature space.
- k-NN Optimized using Grid Search: Grid Search evaluates a predefined grid of hyperparameter values for the k-NN model and keeps the best-scoring combination.
- Support Vector Machines (SVM): SVMs can model non-linear relationships using the kernel trick, and they work well in high-dimensional spaces.
- SVM Optimized using Grid Search: Grid Search evaluates a predefined grid of hyperparameter values for the SVM model and keeps the best-scoring combination.
- XGBoost: An implementation of gradient boosted decision trees designed for speed and performance.
- XGBoost Optimized using Grid Search: Grid Search evaluates a predefined grid of hyperparameter values for the XGBoost model and keeps the best-scoring combination.
- Neural Network Regression (NNR): Uses a neural network for regression tasks. It can model complex, non-linear relationships.
- NNR Optimized using Grid Search: Grid Search evaluates a predefined grid of hyperparameter values for the NNR model and keeps the best-scoring combination.
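
To make the two tuning techniques in the list concrete, here is a minimal sketch of RFE and Grid Search using scikit-learn. It is not taken from the notebook: the synthetic data from `make_regression`, the number of features to keep, and the hyperparameter grid are all assumptions chosen for illustration, and the notebook's own dataset and settings may differ.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFE
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data (assumption): 10 features, only 4 of them informative.
X, y = make_regression(
    n_samples=200, n_features=10, n_informative=4, noise=5.0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Baseline: plain linear regression on all features.
baseline = LinearRegression().fit(X_train, y_train)
baseline_r2 = baseline.score(X_test, y_test)

# RFE: refit the model repeatedly, dropping the weakest feature each round,
# until 4 features remain (4 is an illustrative choice).
rfe = RFE(estimator=LinearRegression(), n_features_to_select=4)
rfe.fit(X_train, y_train)
n_kept = int(rfe.support_.sum())
rfe_r2 = rfe.score(X_test, y_test)

# Grid Search: try every combination in a small, hypothetical grid and keep
# the one with the best cross-validated R^2.
param_grid = {"n_estimators": [25, 50], "max_depth": [3, None]}
search = GridSearchCV(
    RandomForestRegressor(random_state=0), param_grid, cv=3, scoring="r2"
)
search.fit(X_train, y_train)
best_params = search.best_params_
forest_r2 = search.score(X_test, y_test)

print(f"Linear regression R^2: {baseline_r2:.3f}")
print(f"RFE kept {n_kept} features, R^2: {rfe_r2:.3f}")
print(f"Best forest params: {best_params}, R^2: {forest_r2:.3f}")
```

The same `GridSearchCV` pattern applies to the other tuned models in the list; only the estimator and the grid change.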

To run the notebook:

- Clone this repository.
- Install the libraries listed in `requirements.txt`.
- Run the Jupyter notebook.

Requirements:

- Python 3.7+
- Jupyter
- scikit-learn
- pandas
- numpy
- matplotlib

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

License: MIT