Fine-Tuning Pretrained Image Classification Model with AWS SageMaker and TensorFlow

Project developed for the AWS Machine Learning Engineer Scholarship offered by Udacity (2023)

In recent years, deep learning has transformed computer vision through its ability to classify images accurately. One of the most popular techniques for image classification is the convolutional neural network (CNN), which has shown excellent results compared with other approaches such as fully connected neural networks. However, training these models from scratch can be computationally intensive and time-consuming. Transfer learning, which starts from a model pretrained on a large dataset and adapts it to a new task, overcomes this and has become increasingly popular.

This project uses Amazon Web Services (AWS) SageMaker and TensorFlow to fine-tune a pretrained model for binary image classification. The dataset used in this project can be found at https://www.kaggle.com/datasets/deepcontractor/is-that-santa-image-classification. In addition, SageMaker Debugger was used to measure the performance of the training job, monitor system resource usage, and analyze framework metrics.

Source: DALL-E

Project overview

Note: the model was trained and tested using custom training and inference scripts.
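The training script itself is not reproduced in this README. As a rough illustration of what fine-tuning a pretrained model for binary classification can look like in TensorFlow, here is a minimal sketch that follows the TensorFlow transfer learning tutorial listed in the references. The MobileNetV2 base, the 160x160 input size, the dropout rate, and the learning rate are assumptions chosen for illustration, not the settings used in this project.

```python
def build_transfer_model(image_size=(160, 160), weights="imagenet", dropout=0.2):
    """Build a binary classifier on top of a frozen pretrained base.

    All defaults are illustrative assumptions, not this project's settings.
    """
    # Imported here so the module can be loaded without TensorFlow installed.
    import tensorflow as tf

    # Pretrained feature extractor without its ImageNet classification head.
    base = tf.keras.applications.MobileNetV2(
        input_shape=image_size + (3,), include_top=False, weights=weights
    )
    base.trainable = False  # freeze the pretrained weights

    inputs = tf.keras.Input(shape=image_size + (3,))
    # MobileNetV2 expects pixel values scaled to [-1, 1].
    x = tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1)(inputs)
    x = base(x, training=False)  # keep batch-norm layers in inference mode
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    x = tf.keras.layers.Dropout(dropout)(x)
    outputs = tf.keras.layers.Dense(1)(x)  # one logit for the binary decision

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
        loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
        metrics=[tf.keras.metrics.BinaryAccuracy(threshold=0.0)],
    )
    return model
```

Calling build_transfer_model() downloads the ImageNet weights on first use; pass weights=None to build the same architecture with random weights. Only the final Dense layer is trainable in this sketch, which is what keeps training cheap compared with training from scratch.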

Project Features

  • AWS SageMaker
  • AWS SageMaker Debugger
  • TensorFlow Framework version 2.9
  • AWS S3
  • SageMaker Hyperparameter Tuning
  • SageMaker Endpoints
  • SageMaker Profiler
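These pieces are usually wired together through the SageMaker Python SDK. The sketch below shows one way a TensorFlow estimator could be configured for framework version 2.9. The entry point name train.py, the instance type, and the hyperparameter names are hypothetical placeholders, and the IAM role must be your own.

```python
def estimator_kwargs(role, epochs=5, learning_rate=1e-4):
    """Collect the arguments for a SageMaker TensorFlow estimator.

    entry_point, instance_type and the hyperparameter names are
    illustrative assumptions, not values taken from this project.
    """
    return {
        "entry_point": "train.py",        # hypothetical training script name
        "role": role,                      # IAM role ARN with SageMaker access
        "instance_count": 1,
        "instance_type": "ml.m5.xlarge",   # assumed instance type
        "framework_version": "2.9",        # matches the feature list above
        "py_version": "py39",
        "hyperparameters": {"epochs": epochs, "learning_rate": learning_rate},
    }


def make_estimator(role, **overrides):
    """Create the estimator; requires the sagemaker package and AWS credentials."""
    from sagemaker.tensorflow import TensorFlow  # imported lazily

    return TensorFlow(**{**estimator_kwargs(role), **overrides})
```

With credentials configured, make_estimator(role).fit({"training": "s3://{bucket-name}/datasets/"}) would start a training job that reads the uploaded dataset. Keeping the arguments in a plain dictionary makes them easy to inspect or override before any AWS call is made.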

Dataset

The IS THAT SANTA? (Image Classification) dataset consists of 1230 images of Santa Claus and random images. This dataset is structured as follows:

[Image: dataset directory structure]

For more information see: IS THAT SANTA? (Image Classification)
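After downloading, it can be useful to confirm the folder layout and the image count yourself. The following is a small sketch that counts image files per folder. The function name and the set of suffixes are illustrative choices, and no particular folder names are assumed.

```python
from collections import Counter
from pathlib import Path

# File suffixes treated as images (an assumption; extend as needed).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images_by_folder(root):
    """Return {folder path relative to root: number of image files in it}."""
    root = Path(root)
    counts = Counter()
    for path in root.rglob("*"):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            counts[path.parent.relative_to(root).as_posix()] += 1
    return dict(counts)
```

For example, sum(count_images_by_folder("is_that_santa").values()) gives the total number of images found, which can be compared with the 1230 images stated above.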

Setup

AWS SageMaker

Notebook environment (Kernel)

Image: TensorFlow 2.10.0 Python 3.9 CPU optimized
Instance type: ml.t3.medium

Kaggle

Before we can access and download the Kaggle dataset, it is necessary to have a Kaggle account and a Kaggle API token (https://www.kaggle.com/account). Next, upload the kaggle.json file to AWS SageMaker Studio and run the commands below to move it to the ~/.kaggle directory, where the Kaggle CLI looks for it:

Move file to ~/.kaggle

mkdir -p ~/.kaggle
cp kaggle.json ~/.kaggle/
chmod 600 ~/.kaggle/kaggle.json
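If you prefer to do this step from a notebook cell instead of a terminal, the same three operations can be sketched in Python with the standard library. The function name install_kaggle_token and its home parameter are hypothetical and exist only for this example.

```python
import shutil
from pathlib import Path


def install_kaggle_token(token_path="kaggle.json", home=None):
    """Copy the API token to <home>/.kaggle/kaggle.json with mode 600."""
    home = Path(home) if home is not None else Path.home()
    target_dir = home / ".kaggle"
    target_dir.mkdir(parents=True, exist_ok=True)  # like mkdir -p
    target = target_dir / "kaggle.json"
    shutil.copyfile(token_path, target)            # like cp
    target.chmod(0o600)                            # owner read/write only
    return target
```

The restrictive permissions matter because the Kaggle CLI warns when the token file is readable by other users.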

The commands below download the dataset and unzip it:

kaggle datasets download -d deepcontractor/is-that-santa-image-classification
unzip is-that-santa-image-classification.zip

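This project converts the images from .jpg to .jpeg before training. Note that both extensions denote the same JPEG format, and TensorFlow decodes JPEG data under either extension, so the image format itself does not change. For this step, we can use a bash script that uses ImageMagick (https://imagemagick.org/index.php).

To install ImageMagick, run the following commands, developed by ARolek (https://gist.github.com/ARolek/9199329), in a terminal: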

Download the most recent package

wget http://www.imagemagick.org/download/ImageMagick.tar.gz

Uncompress the package

tar -vxf ImageMagick.tar.gz

Install the devel packages for png, jpg, and tiff. These are dependencies of ImageMagick

sudo yum -y install libpng-devel libjpeg-devel libtiff-devel

Configure ImageMagick without X11. This is a server without a display (headless), so we don't need X11. The tarball unpacks into a versioned directory, hence the wildcard

cd ImageMagick-*
./configure --without-x
make && make install

Now we can use a bash script in a terminal to convert the images:

./convert_jpg_to_jpeg.sh -d -r is_that_santa/

Note: the dataset folder was manually renamed to is_that_santa. The -d flag of convert_jpg_to_jpeg.sh deletes the original images, and the -r flag converts images recursively.
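The contents of convert_jpg_to_jpeg.sh are not shown here. As a rough stand-in, the sketch below mirrors the -d and -r options in Python. It is an assumption-laden simplification: it only copies each file under the new .jpeg extension and does not re-encode the image the way ImageMagick would. The function and parameter names are hypothetical.

```python
import shutil
from pathlib import Path


def jpg_to_jpeg(root, delete_original=False, recursive=False):
    """Copy every .jpg under root to a sibling .jpeg file.

    delete_original mirrors -d and recursive mirrors -r. This does not
    re-encode the image data; it only changes the file extension.
    """
    root = Path(root)
    candidates = root.rglob("*") if recursive else root.glob("*")
    converted = []
    # sorted() materializes the listing before any files are created.
    for src in sorted(candidates):
        if not (src.is_file() and src.suffix.lower() == ".jpg"):
            continue
        dst = src.with_suffix(".jpeg")
        shutil.copy2(src, dst)
        if delete_original:
            src.unlink()
        converted.append(dst)
    return converted
```

jpg_to_jpeg("is_that_santa", delete_original=True, recursive=True) would correspond to the -d -r invocation above, under the assumptions stated.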

Now we can upload the files to AWS S3:

aws s3 cp is_that_santa s3://{bucket-name}/datasets/ --recursive > /dev/null

Note: replace {bucket-name} with your own bucket name. The --recursive flag is required to copy a directory. The > /dev/null redirect is optional and only suppresses the per-file output.
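The same upload can be sketched in Python with boto3, which is convenient inside a notebook. The helper names plan_uploads and upload_directory are hypothetical. The key layout is intended to match what the recursive copy above produces, with each file placed under the datasets/ prefix at its path relative to the local folder.

```python
from pathlib import Path


def plan_uploads(local_dir, prefix="datasets"):
    """Return (local path, S3 key) pairs for every file under local_dir."""
    local_dir = Path(local_dir)
    pairs = []
    for path in sorted(local_dir.rglob("*")):
        if path.is_file():
            key = f"{prefix}/{path.relative_to(local_dir).as_posix()}"
            pairs.append((path, key))
    return pairs


def upload_directory(local_dir, bucket, prefix="datasets"):
    """Upload the planned files; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so plan_uploads works without it

    s3 = boto3.client("s3")
    for path, key in plan_uploads(local_dir, prefix):
        s3.upload_file(str(path), bucket, key)
```

Calling plan_uploads("is_that_santa") first lets you review the keys before anything is sent, and upload_directory("is_that_santa", "{bucket-name}") then performs the transfer.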

Python requirements and install

Requirements

tensorflow==2.10.1
smdebug==1.0.12
kaggle==1.5.12

Install

pip install -r requirements.txt

Or use the Makefile:

make install
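To confirm that the environment matches the pinned versions after installation, a short check like the one below can help. It is a sketch using only the standard library. The function names are hypothetical, and it handles only simple name==version lines like the ones listed above.

```python
from importlib import metadata


def parse_pins(lines):
    """Parse 'name==version' lines into a {name: version} dictionary."""
    pins = {}
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip()] = version.strip()
    return pins


def check_pins(pins, get_version=metadata.version):
    """Return {name: (wanted, installed or None, matches)} for each pin."""
    report = {}
    for name, wanted in pins.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None
        report[name] = (wanted, installed, installed == wanted)
    return report
```

For example, check_pins(parse_pins(open("requirements.txt"))) reports which of the three packages are installed at the expected versions.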

Model prediction: Prediction vs Actual

Debugger and Profiler outputs

For more details, see: https://github.com/mathewsrc/Fine-Tuning-Pretrained-Image-Classification-Model-with-AWS-SageMaker-and-TensorFlow/blob/master/train_and_deploy.ipynb

References

https://www.tensorflow.org/tutorials/images/transfer_learning

https://www.tensorflow.org/tutorials/quickstart/advanced

https://docs.aws.amazon.com/sagemaker/latest/dg/model-access-training-data.html

https://github.com/aws/amazon-sagemaker-examples/blob/main/hyperparameter_tuning/

https://github.com/awslabs/sagemaker-debugger/blob/master/examples/tensorflow2/scripts/tf_keras_gradienttape.py

https://github.com/awslabs/sagemaker-debugger/blob/master/docs/tensorflow.md

https://github.com/aws/sagemaker-python-sdk/blob/master/doc/amazon_sagemaker_debugger.rst

https://docs.aws.amazon.com/sagemaker/latest/dg/debugger-built-in-rules.html

https://github.com/awslabs/sagemaker-debugger/blob/master/docs/api.md#tensorflow-specific-hook-api

https://docs.aws.amazon.com/sagemaker/latest/dg/debugger-enable-tensorboard-summaries.html

https://docs.aws.amazon.com/sagemaker/latest/dg/debugger-analyze-data.html

https://docs.aws.amazon.com/sagemaker/latest/dg/debugger-access-data-profiling-default-plot.html

https://sagemaker.readthedocs.io/en/stable/frameworks/tensorflow/deploying_tensorflow_serving.html
