A deep learning project for cotton plant disease detection using TensorFlow.
It focuses mainly on diseases that occur on leaves. However, more research is being done on diseases that occur on the stem, flowers, buds, and bolls.
The diseases identified by this model are:
- Diseases caused by aphids
- Diseases caused by army worms
- Bacterial Blight
- Powdery Mildew
- Target Spot
The dataset used in this project contains images of all five diseases listed above, together with images of healthy leaves for comparison with the diseased ones.
Below is an example of a healthy cotton plant's leaf:
The batch size is set to 32, the image height to 180 pixels, and the image width to 180 pixels.
The data is split into training and validation sets:
- the training set receives 80% of the data, and
- the validation set receives 20% of the data.
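The loading step described above can be sketched as follows. This is a minimal sketch, not the project's actual code: it assumes the images sit in one sub-folder per class and that tf.keras.utils.image_dataset_from_directory is used for loading. The seed value and the load_datasets helper name are illustrative assumptions.

```python
BATCH_SIZE = 32
IMG_HEIGHT = 180
IMG_WIDTH = 180
VALIDATION_SPLIT = 0.2  # 20% validation, which leaves 80% for training
SEED = 123  # assumed value; both subsets must share the same seed


def load_datasets(data_dir):
    """Return (train_ds, val_ds) from a directory with one sub-folder per class."""
    # Imported here so the constants above can be reused without loading TensorFlow.
    import tensorflow as tf

    common = dict(
        validation_split=VALIDATION_SPLIT,
        seed=SEED,
        image_size=(IMG_HEIGHT, IMG_WIDTH),
        batch_size=BATCH_SIZE,
    )
    train_ds = tf.keras.utils.image_dataset_from_directory(
        data_dir, subset="training", **common
    )
    val_ds = tf.keras.utils.image_dataset_from_directory(
        data_dir, subset="validation", **common
    )
    return train_ds, val_ds
```

Calling load_datasets with the dataset folder returns the two splits, and train_ds.class_names then lists the class folders that were found.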
The dataset is divided into six classes: one for each of the five diseases and one for healthy leaves.
These classes are Aphids, Army Worms, Bacterial Blight, Healthy leaf, Powdery Mildew, and Target Spot.
The image below shows the classes of the dataset:
Below are some images from the training dataset
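The dataset is configured for performance with two methods:
- data.cache(), which keeps the images in memory after they are loaded from disk during the first epoch, and
- data.prefetch(), which overlaps data preprocessing with model execution during training.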
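The two calls above can be chained on a tf.data.Dataset. The sketch below assumes tf.data.AUTOTUNE is used as the prefetch buffer size, which the original does not state; the configure_for_performance helper name is illustrative.

```python
def configure_for_performance(dataset):
    """Cache the dataset, then prefetch batches while the model is training."""
    # Imported lazily so this helper can be defined without loading TensorFlow.
    import tensorflow as tf

    # AUTOTUNE lets tf.data choose the prefetch buffer size at runtime (assumption).
    return dataset.cache().prefetch(buffer_size=tf.data.AUTOTUNE)
```

The same helper would typically be applied to both the training and the validation dataset.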
The RGB channel values are standardized from the [0, 255] range to the [0, 1] range using tf.keras.layers.Rescaling.
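The arithmetic behind this step is a division by 255. The plain-Python sketch below illustrates it on a single pixel; it mirrors what a tf.keras.layers.Rescaling(1./255) layer computes but is not the project's code, and the rescale helper is a hypothetical name.

```python
def rescale(pixel_values):
    """Map 8-bit channel values (0..255) to floats in the [0, 1] range."""
    return [value / 255.0 for value in pixel_values]


rgb_pixel = [0, 128, 255]  # one example pixel: red, green, blue channels
scaled_pixel = rescale(rgb_pixel)
print(scaled_pixel)
```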
A Keras model is created and compiled. Below is the summary of the model:
The model is then trained for 10 epochs, as shown below:
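The original model definition is not reproduced in this text. As a rough sketch of the create, compile, and train steps, the code below assumes a small convolutional network in the style of the standard TensorFlow image classification tutorial. The layer sizes, the Adam optimizer, and the build_model and train helper names are assumptions, not the project's confirmed architecture.

```python
NUM_CLASSES = 6
EPOCHS = 10
IMG_HEIGHT, IMG_WIDTH = 180, 180


def build_model(num_classes=NUM_CLASSES):
    """Build and compile a small CNN classifier (assumed architecture)."""
    # Imported lazily so the constants above can be reused without TensorFlow.
    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential([
        keras.Input(shape=(IMG_HEIGHT, IMG_WIDTH, 3)),
        layers.Rescaling(1.0 / 255),  # scale pixels to [0, 1]
        layers.Conv2D(16, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(32, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(num_classes),  # raw logits, one per class
    ])
    model.compile(
        optimizer="adam",
        loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"],
    )
    return model


def train(model, train_ds, val_ds, epochs=EPOCHS):
    """Fit the model and return the Keras History object."""
    return model.fit(train_ds, validation_data=val_ds, epochs=epochs)
```

Calling build_model().summary() prints the layer table, and the History object returned by train holds the per-epoch accuracy and loss values used for plotting.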
The results are poor: validation accuracy is only 0.6170 even though training accuracy reaches 0.9895.
Accuracy and loss for the training and validation sets are plotted below:
The plots show that training accuracy is high while validation accuracy is much lower. Loss follows the same pattern: training loss is well below validation loss.
This large gap between training and validation results indicates overfitting: the model fits the training images closely but does not generalize well to images it has not seen.
Two methods are used to reduce overfitting:
- Data augmentation: this creates modified copies of existing images to artificially increase the size of the training set.
- Dropout: this is a layer that randomly sets input units to 0 at a given rate at each step during training, which helps prevent overfitting.
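To make the dropout mechanism concrete, here is a plain-Python illustration. It is not the Keras implementation; it only mimics the documented behaviour of the Keras Dropout layer during training, where each unit is zeroed with probability rate and the surviving units are scaled by 1 / (1 - rate). The dropout function and the example values are hypothetical.

```python
import random


def dropout(values, rate, rng):
    """Zero each value with probability `rate`; scale survivors by 1 / (1 - rate)."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in the range [0, 1)")
    keep_scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]


rng = random.Random(0)  # fixed seed so repeated runs drop the same units
activations = [1.0, 2.0, 3.0, 4.0]  # made-up layer inputs
dropped = dropout(activations, rate=0.5, rng=rng)
print(dropped)
```

Because a different random subset of units is removed at every training step, the network cannot rely on any single unit, which is why the technique reduces overfitting.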
Below is an example of augmented images:
The code snippet below shows a new model with a dropout layer
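The original snippet is not reproduced in this text. A minimal sketch of such a model, assuming Keras preprocessing layers for augmentation and a single Dropout layer, could look like the code below. The specific augmentations, the 0.2 dropout rate, the layer sizes, and the build_regularized_model name are all assumptions.

```python
NUM_CLASSES = 6
IMG_HEIGHT, IMG_WIDTH = 180, 180
DROPOUT_RATE = 0.2  # assumed value; the original rate is not stated


def build_regularized_model(num_classes=NUM_CLASSES, dropout_rate=DROPOUT_RATE):
    """CNN with data augmentation and dropout (assumed architecture)."""
    # Imported lazily so the constants above can be reused without TensorFlow.
    from tensorflow import keras
    from tensorflow.keras import layers

    # Random transformations; these layers are only active during training.
    data_augmentation = keras.Sequential([
        layers.RandomFlip("horizontal"),
        layers.RandomRotation(0.1),
        layers.RandomZoom(0.1),
    ])

    model = keras.Sequential([
        keras.Input(shape=(IMG_HEIGHT, IMG_WIDTH, 3)),
        data_augmentation,
        layers.Rescaling(1.0 / 255),
        layers.Conv2D(16, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(32, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(),
        layers.Dropout(dropout_rate),  # randomly zeroes units during training
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(num_classes),  # raw logits, one per class
    ])
    model.compile(
        optimizer="adam",
        loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"],
    )
    return model
```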
The new model generalizes better. The training accuracy is 80% and the validation accuracy is 70%.
In the plots of accuracy and loss, the training and validation curves are now closer to each other, indicating that the model fits better, as shown in the image below.
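The improvement can be quantified as the gap between training and validation accuracy. The short calculation below uses only the figures reported in this document; the generalization_gap helper is an illustrative name.

```python
def generalization_gap(train_accuracy, val_accuracy):
    """Difference between training and validation accuracy."""
    return train_accuracy - val_accuracy


# Accuracies reported above for the first model and for the regularized model.
first_gap = generalization_gap(0.9895, 0.6170)
second_gap = generalization_gap(0.80, 0.70)

print(f"first model gap:  {first_gap:.4f}")
print(f"second model gap: {second_gap:.4f}")
```

The gap shrinks from roughly 0.37 to roughly 0.10, and validation accuracy itself rises from about 0.62 to 0.70.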
When a new image is given to the model, it predicts the image's class with a high degree of confidence.
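A confidence score is usually obtained by applying softmax to the model's raw outputs. The plain-Python sketch below shows that step in isolation. The logits are made-up numbers rather than real model output, and the class order is an assumption based on the class list given earlier.

```python
import math

# Assumed class order; the real order depends on the dataset's folder names.
CLASS_NAMES = [
    "Aphids", "Army Worms", "Bacterial Blight",
    "Healthy leaf", "Powdery Mildew", "Target Spot",
]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_prediction(logits, class_names=CLASS_NAMES):
    """Return the most likely class name and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]


example_logits = [0.3, -1.2, 4.1, 0.8, -0.5, 1.0]  # hypothetical model output
label, confidence = top_prediction(example_logits)
print(f"{label}: {100 * confidence:.1f}% confidence")
```

With a trained Keras model, the logits would come from model.predict on a batch containing the resized image.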
The model is saved and served in production with TensorFlow Serving running in Docker.
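Once the container is running, a client can query the model over TensorFlow Serving's REST API. The sketch below builds such a request with the standard library only. The model name cotton_disease and the localhost address are hypothetical; port 8501 is the REST port used by the TensorFlow Serving Docker image.

```python
import json
import urllib.request

MODEL_NAME = "cotton_disease"  # hypothetical model name
HOST = "http://localhost:8501"  # assumed address of the serving container


def predict_url(model_name=MODEL_NAME, host=HOST):
    """URL of the REST predict endpoint for the given model."""
    return f"{host}/v1/models/{model_name}:predict"


def build_payload(images):
    """Encode a batch of images (nested lists, each 180 x 180 x 3) as JSON bytes."""
    return json.dumps({"instances": images}).encode("utf-8")


def request_prediction(images):
    """Send the batch to the server and return its list of predictions."""
    request = urllib.request.Request(
        predict_url(),
        data=build_payload(images),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["predictions"]
```

The request_prediction function needs a running server, so it is defined here but not called.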
Many diseases affect different crops. In this project I focused on those that affect cotton plants, specifically their leaves. The model learns to classify images of five diseases that affect cotton leaves, alongside healthy leaves, and can then detect which of those classes a new image belongs to. I conclude that it is feasible to train a deep learning model to detect different types of crop diseases when enough training data is available.