Create Python scripts to process, visualize, and model accelerometer and gyroscope data, and build a machine learning model that can classify barbell exercises and count repetitions.
Create a virtual environment:

```cmd
py -m venv venv
```

Activate the virtual environment:

- For Command Prompt:

  ```cmd
  C:\Users\User\Machine-Learning-Fitness-Tracker> venv\Scripts\activate.bat
  (venv) C:\Users\User\Machine-Learning-Fitness-Tracker>
  ```

- For Git Bash:

  ```bash
  User@User MINGW64 ~/Machine-Learning-Fitness-Tracker (main)
  $ source venv/Scripts/activate
  (venv)
  User@User MINGW64 ~/Machine-Learning-Fitness-Tracker (main)
  $
  ```

Deactivate the virtual environment:

- For Command Prompt:

  ```cmd
  (venv) C:\Users\User\Machine-Learning-Fitness-Tracker> venv\Scripts\deactivate.bat
  C:\Users\User\Machine-Learning-Fitness-Tracker>
  ```

- For Git Bash:

  ```bash
  (venv)
  User@User MINGW64 ~/Machine-Learning-Fitness-Tracker (main)
  $ deactivate
  User@User MINGW64 ~/Machine-Learning-Fitness-Tracker (main)
  $
  ```
What was actually installed:

```bash
pip install numpy
pip install pandas
pip install ipykernel==6.17.1
pip install ipython==8.7.0
pip install jupyter-client==7.4.7
pip install jupyter-core==5.1.0
pip install matplotlib
# "math" needs no pip install; it ships with the Python standard library
pip install scipy                 # scipy, not the similarly named "spicy" package
pip install scikit-learn-intelex
pip install seaborn
```
What should be installed:

```bash
pip install numpy==1.23.5
pip install pandas==1.5.2
pip install ipykernel==6.17.1
pip install ipython==8.7.0
pip install jupyter-client==7.4.7
pip install jupyter-core==5.1.0
pip install matplotlib==3.6.2
```
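The same pins can also be consolidated into a `requirements.txt` (a minimal sketch using only the versions listed above) and installed in one step with `pip install -r requirements.txt`:

```text
numpy==1.23.5
pandas==1.5.2
ipykernel==6.17.1
ipython==8.7.0
jupyter-client==7.4.7
jupyter-core==5.1.0
matplotlib==3.6.2
```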
Installing the current versions of all pip packages:

```bash
pip install numpy
pip install pandas
pip install ipykernel
pip install ipython
pip install jupyter-client
pip install jupyter-core
pip install matplotlib
```
Data was collected from 5 participants: A, B, C, D, and E.
Data was collected from 01/11/2019 (2019/01/11) to 01/16/2019 (2019/01/16).
Equipment used was MbientLab's wristband sensor research kit.
- The wristband mimics the placement and orientation of a watch while allowing for controlled experiments. Data was collected using the default settings of the sensors: accelerometer at 12.500 Hz and gyroscope at 25.000 Hz.
`Acc_y`
- Acceleration in the vertical direction, up and down.
`Acc_x`
- Acceleration in the horizontal direction, left to right (west to east).
`Acc_z`
- Acceleration in the horizontal direction, front to back (north to south).
- Data for this project was collected in `.csv` format.
- Understanding the CSV files: each filename encodes the measurement, participant, exercise, and intensity.
- Raw data can be found in the following folder for this project: `/data/raw/MetaMotion`
- Create a Python file that imports the raw data from the `.csv` files and turns it into a dataset that can be exported and used throughout the project.
- The file for this step can be found at `/src/data/make_dataset.py`

Steps within `make_dataset.py` go as follows:

- Read a single CSV file
- List all data in `data/raw/MetaMotion`
- Extract features from the filename
- Read all files
- Work with datetimes
- Turn it into a function
- Merge datasets
- Resample data (frequency conversion)
- Export the dataset; this project uses pickle (`.pkl`) files to hold dataset files. The pickle (`.pkl`) file for this step can be found at `/data/interim/01_data_processed.pkl` (a minimal sketch of the flow is shown after this list).
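A minimal sketch of this flow, assuming the MetaMotion filename pattern (participant, exercise, and intensity separated by dashes) and an `epoch (ms)` timestamp column in the exported CSVs; both are assumptions that may need adjusting for your files:

```python
# Minimal sketch of make_dataset.py (assumed filename/column conventions).
from glob import glob

import pandas as pd

files = glob("../../data/raw/MetaMotion/*.csv")
acc_df = pd.DataFrame()

for f in files:
    if "Accelerometer" not in f:
        continue  # the same idea applies to the gyroscope files
    name = f.replace("\\", "/").split("/")[-1]
    participant = name.split("-")[0]  # e.g. "A"
    label = name.split("-")[1]        # e.g. "bench"
    df = pd.read_csv(f)
    df["participant"] = participant
    df["label"] = label
    # Work with datetimes: index on the epoch timestamp in milliseconds.
    df.index = pd.to_datetime(df["epoch (ms)"], unit="ms")
    acc_df = pd.concat([acc_df, df])

# Resample to a fixed frequency (200 ms here) and export as a pickle file.
# (The full script also carries the label/participant columns through the
# resample; this sketch keeps only the numeric sensor columns.)
processed = acc_df.select_dtypes("number").resample("200ms").mean().dropna()
processed.to_pickle("../../data/interim/01_data_processed.pkl")
```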
- Do not forget to add an `__init__.py` file for `make_dataset.py`. The `__init__.py` for this project can be found at `/src/data/__init__.py`
- Create plot settings that match what is needed to plot the data from the pickle (`.pkl`) file created in Step 1, found at `./data/interim/01_data_processed.pkl`. The `plot_settings.py` can be found at `/src/visualization/plot_settings.py` (one possible set of defaults is sketched below).
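A possible `plot_settings.py`, assuming global matplotlib defaults are wanted for every figure; the specific style and figure size are example values, not the project's required ones:

```python
# plot_settings.py - shared matplotlib defaults (example values).
import matplotlib as mpl

mpl.style.use("fivethirtyeight")          # consistent look across figures
mpl.rcParams["figure.figsize"] = (20, 5)  # wide plots suit time-series data
mpl.rcParams["figure.dpi"] = 100
mpl.rcParams["lines.linewidth"] = 2
```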
- Write a Python script that creates figures based on the previously created pickle (`.pkl`) file from Step 1, found at `/data/interim/01_data_processed.pkl`. The `visualize.py` can be found at `/src/visualization/visualize.py`

Steps within `visualize.py` go as follows:

- Load data
- Plot single columns
- Plot all exercises
- Adjust plot settings
- Compare medium vs. heavy sets
- Compare participants
- Plot multiple axes
- Create a loop to plot all combinations per sensor
- Combine plots in one figure
- Loop over all combinations and export for both sensor data types (a sketch of this export loop is shown after this list). The created graphs can be found as `.png` files in `/reports/figures/`
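A sketch of that final export loop, assuming the sensor columns are named `acc_x` through `gyr_z` as in the earlier steps:

```python
# Export one combined accelerometer + gyroscope figure per (label, participant).
import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_pickle("../../data/interim/01_data_processed.pkl")

for label in df["label"].unique():
    for participant in df["participant"].unique():
        combined = (
            df.query(f"label == '{label}' and participant == '{participant}'")
            .reset_index(drop=True)
        )
        if combined.empty:
            continue  # not every participant performed every exercise
        fig, ax = plt.subplots(nrows=2, sharex=True, figsize=(20, 10))
        combined[["acc_x", "acc_y", "acc_z"]].plot(ax=ax[0])
        combined[["gyr_x", "gyr_y", "gyr_z"]].plot(ax=ax[1])
        ax[1].set_xlabel("samples")
        fig.savefig(f"../../reports/figures/{label} ({participant}).png")
        plt.close(fig)
```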
- Do not forget to add an `__init__.py` file; it marks the folder as a package, which helps the `visualize.py` and `plot_settings.py` files import from each other. The `__init__.py` for this project can be found at `/src/visualization/__init__.py`
- Write a Python script that removes the outliers from the existing sensor data found in the pickle (`.pkl`) file from Step 1, found at `/data/interim/01_data_processed.pkl`. Label this Python file `remove_outliers.py`, which can be found at `/src/features/remove_outliers.py`

Steps within `remove_outliers.py` go as follows:

- Load data
- Plot outliers
  - `plot_binary_outliers()`
- Interquartile range (distribution based)
  - Insert the IQR function `mark_outliers_iqr()`
- Chauvenet's criterion (distribution based)
  - `mark_outliers_chauvenet()`
- Local outlier factor (distance based)
  - Insert the Local Outlier Factor (LOF) function `mark_outliers_lof()`
- Check outliers grouped by label
- Choose a method and deal with the outliers
- Export the new dataframe
- Export the new dataset, with outliers removed, using pickle (`.pkl`) files to hold dataset files. The pickle (`.pkl`) file for this step can be found at `/data/interim/02_outliers_removed_chauvenets.pkl` (a compact sketch of the Chauvenet function is shown after this list).
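A compact sketch of Chauvenet's criterion, the distribution-based method this project's exported filename suggests was chosen: a value is rejected when the expected number of observations at least as extreme, under a fitted normal distribution, falls below one half. The column naming convention is an assumption:

```python
# Chauvenet's criterion: flag values whose expected frequency under a
# fitted normal distribution is below half an observation.
import numpy as np
import pandas as pd
from scipy.special import erfc

def mark_outliers_chauvenet(dataset: pd.DataFrame, col: str) -> pd.DataFrame:
    df = dataset.copy()
    mean, std, n = df[col].mean(), df[col].std(), len(df)
    # Two-sided tail probability of a deviation at least this large.
    prob = erfc(np.abs(df[col] - mean) / (std * np.sqrt(2)))
    df[col + "_outlier"] = prob * n < 0.5  # Chauvenet rejection rule
    return df
```

Flagged values can then be set to `NaN` and filled later by the interpolation step in `build_features.py`.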
- Do not forget to add an `__init__.py` file; it marks the folder as a package, which helps the `remove_outliers.py` file. The `__init__.py` for this project can be found at `/src/features/__init__.py`
- Import from GitHub, or copy, the following Python files, as they hold functions we will utilize in the following steps:
  - `/src/features/DataTransformation.py`
  - `/src/features/FrequencyAbstraction.py`
  - `/src/features/TemporalAbstraction.py`
- Now we will write a Python file that first filters out subtle noise (not outliers) and identifies the parts of the data that explain most of the variance, and then adds numerical, temporal, frequency, and cluster features. Label this file `build_features.py`. For this project, `build_features.py` can be found at `/src/features/build_features.py`

Steps within `build_features.py` go as follows:

- Load data
- Deal with missing values (imputation)
  - Interpolation will fill in the gaps where data is missing.
- Calculate set duration
- Butterworth low-pass filter
  - `LowPassFilter()`
- Principal component analysis (PCA)
  - `PrincipalComponentAnalysis()`
- Sum of squares attributes
- Temporal abstraction
  - `NumericalAbstraction()`
- Frequency features (Fourier transformation)
  - `FourierTransformation()`
- Visualize results
- Deal with overlapping windows
- Clustering (K-means)
  - `KMeans()`
- Export the new dataset, noting the clusters created, using pickle (`.pkl`) files to hold dataset files. The pickle (`.pkl`) file for this step can be found at `/data/interim/03_data_features.pkl` (sketches of the low-pass and sum-of-squares steps are shown after this list).
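Sketches of two of these steps, assuming the data was resampled at 200 ms (a sampling frequency of 5 Hz) and that the cutoff is tuned by eye; both values are assumptions:

```python
# Butterworth low-pass filter (via scipy) plus the orientation-independent
# sum-of-squares magnitude attributes.
import pandas as pd
from scipy.signal import butter, filtfilt

def low_pass_filter(df: pd.DataFrame, col: str, fs: float = 5.0,
                    cutoff: float = 1.3, order: int = 5) -> pd.DataFrame:
    # Normalized cutoff: fraction of the Nyquist frequency (fs / 2).
    b, a = butter(order, cutoff / (0.5 * fs), btype="low")
    df[col + "_lowpass"] = filtfilt(b, a, df[col])  # zero-phase filtering
    return df

def add_square_attributes(df: pd.DataFrame) -> pd.DataFrame:
    # Magnitudes are impartial to device orientation.
    df["acc_r"] = (df["acc_x"] ** 2 + df["acc_y"] ** 2 + df["acc_z"] ** 2) ** 0.5
    df["gyr_r"] = (df["gyr_x"] ** 2 + df["gyr_y"] ** 2 + df["gyr_z"] ** 2) ** 0.5
    return df
```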
- No need to create an `__init__.py` file for this folder because we did so in Step 4.
- Import from GitHub, or copy, the following Python file, as it holds functions we will utilize in the following steps:
  - `/src/models/LearningAlgorithms.py`
- Now we will write a Python file that uses the data to train a model that predicts which of the 6 exercises we have been analyzing thus far is being performed. Label this file `train_model.py`. For this project, `train_model.py` can be found at `/src/models/train_model.py`

Steps within `train_model.py` go as follows:

- Set up
  - Plot settings
  - Import the data frame: `df = pd.read_pickle("../../data/interim/03_data_features.pkl")`
- Create a training and test set (a sketch of this set-up is shown after this list)
- Split feature subsets
  - Basic features
  - Square features
  - PCA features
  - Time features
  - Frequency features
  - Cluster features
- Perform forward feature selection using a simple decision tree
  - Use `ClassificationAlgorithms()`
- Grid search for the best hyperparameters and model selection
  - Import the grid search code
  - Create a grouped bar plot to compare the results
- Select the best model and evaluate the results
  - Start with one of the training models from `ClassificationAlgorithms()`
  - Define and create the confusion matrix (`cm`)
- Select train and test data based on participant
- Use the best model again and evaluate the results
- Try a simpler model with the selected features
- Find out which feature set and which model work best together for which exercise, based on the results from the confusion matrix.
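A sketch of the set-up and split, assuming the column names carried over from the earlier steps (`participant`, `category`, `set`, `label`):

```python
# train_model.py set-up: load the engineered features and make a
# stratified train/test split so every exercise appears in both sets.
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_pickle("../../data/interim/03_data_features.pkl")
df_train = df.drop(columns=["participant", "category", "set"], errors="ignore")

X = df_train.drop(columns=["label"])
y = df_train["label"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)
```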
- Now we will write a Python file that counts the repetitions in each set. Label this file `count_repetitions.py`. For this project, `count_repetitions.py` can be found at `/src/features/count_repetitions.py`

Steps within `count_repetitions.py` go as follows:

- Set up
  - Plot settings
- Load data
- Split data
- Visualize data to identify patterns
- Configure `LowPassFilter`
- Apply and tweak `LowPassFilter`
- Create a function to count repetitions (a sketch is shown after this list)
- Create a benchmark dataframe
- Evaluate the results
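A sketch of the counting idea: smooth one signal with the low-pass filter and count its local maxima, one peak per repetition. The sampling frequency and cutoff are tuning assumptions that vary per exercise:

```python
# Count repetitions as local maxima of a low-pass-filtered signal.
import numpy as np
import pandas as pd
from scipy.signal import argrelextrema, butter, filtfilt

def count_reps(signal: pd.Series, fs: float = 5.0, cutoff: float = 0.4) -> int:
    b, a = butter(5, cutoff / (0.5 * fs), btype="low")
    smooth = filtfilt(b, a, signal.values)
    peaks = argrelextrema(smooth, np.greater)[0]  # indices of local maxima
    return len(peaks)
```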