Image and video classification practice with OpenCV and YOLOv3.

jaszmine/openCV4-practice


openCV4-practice opencvLogo

Image and video classification practice with OpenCV.

Frameworks: DNN, Caffe, Darknet
Libraries/Modules: cmake, numpy, OpenCV's config, and dlib
Algorithms: YOLOv3

Note: Check Reference Videos and Model config Release


Contents


Concepts

  • AI vs. ML vs. DL
  • Training Networks
    • Hidden Layers
    • Weights
    • Loss Function
    • Backpropagation
  • Pre-trained Networks

Back to top


Deep Learning Applications

  • Image Classification
  • Self-driving cars
  • Handwriting transcription
  • Speech recognition
  • Language translation

Back to top


OpenCV Overview

  • An open-source computer vision and machine learning software library
  • Applications
    • Facial recognition
    • Object identification
    • Human action classification
    • Camera movement tracking
  • Written natively in C++, with wrappers for Python and Java
  • No framework-specific limitations
  • Uses its own internal representation of models, which makes the code easier to optimize
  • Has its own deep learning implementation - minimum external dependencies
  • Uses BGR color format (instead of RGB)
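
The BGR ordering matters when an image loaded by OpenCV is handed to a library that expects RGB. A small NumPy sketch of the channel swap follows; the helper name `bgr_to_rgb` is made up for this example, and `cv2.cvtColor(image, cv2.COLOR_BGR2RGB)` does the same job inside OpenCV.

```python
import numpy as np


def bgr_to_rgb(image):
    """Reverse the channel axis of an H x W x 3 image (BGR <-> RGB)."""
    return image[..., ::-1]


# A 1x2 image in BGR order: one pure-blue pixel and one pure-red pixel.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)
rgb = bgr_to_rgb(bgr)

print(rgb[0, 0])  # the blue pixel, now in RGB order: [  0   0 255]
```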

Back to top


OpenCV's DNN

  • Deep Neural Network Module
  • NOT an entire deep learning framework
  • Inference:
    • When only a forward pass occurs (no backpropagation, so the weights are not updated and nothing is learned)
    • Engine example: input -> pretrained model -> result
    • Simplifies the code: with no training step, a GPU is not required
  • OpenCV 4's DNN module supports:
    • Caffe
    • TensorFlow
    • Darknet
    • ONNX format

Back to top


DNN Process

  • Load pre-trained models from other DL frameworks
  • Pre-process images using blobFromImages()
  • Pass blobs through loaded pre-trained model to get output predictions (blob -> model -> inference)
  • Read the Model
    • cv2.dnn.readNetFromCaffe(prototxt, caffeModel)
    • loads models and weights
  • Create a Four-Dimensional Blob
    • blob = cv2.dnn.blobFromImage(image, [scalefactor], [size], [mean], [swapRB], [crop], [ddepth])
  • Input the Blob into the Network
    • net.setInput(blob)
  • Forward pass through the Network
    • outp = net.forward()
    • produces an output prediction after a forward pass
  • Summary of steps
  1. images
  2. blobFromImage()
  3. Blob
  4. Trained Model
  5. Inference
  • Returns: 4D Tensor(NCHW) - # of images, # of channels, height, width
blobFromImage() parameters:

| Parameter | Description |
|---|---|
| image | Input image (1, 3, or 4 channels) |
| scalefactor | Multiplier for image values |
| size | Spatial size for output image |
| mean | Scalar with mean values that are subtracted from the channels |
| swapRB | Flag that swaps the first and last channels (e.g. BGR to RGB) |
| crop | Flag to crop image after resize |
| ddepth | Depth of output blob (CV_32F or CV_8U) |
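
To show what these parameters do to the pixel values, here is a NumPy approximation of the blob arithmetic. It is a sketch only: it skips `size`, `crop`, and `ddepth` entirely, and the name `blob_from_image_sketch` is made up for this example. Use `cv2.dnn.blobFromImage()` in real code.

```python
import numpy as np


def blob_from_image_sketch(image, scalefactor=1.0, mean=(0.0, 0.0, 0.0), swap_rb=False):
    """Approximate blobFromImage() for an H x W x 3 image, without resize or crop."""
    pixels = image.astype(np.float32)
    if swap_rb:
        # Swap the first and last channels (BGR <-> RGB).
        pixels = pixels[..., ::-1]
    # Subtract the per-channel mean first, then apply the scale factor.
    pixels = (pixels - np.asarray(mean, dtype=np.float32)) * scalefactor
    # Reorder H x W x C to C x H x W and add the batch axis N in front.
    return pixels.transpose(2, 0, 1)[np.newaxis, ...]


image = np.full((2, 3, 3), 10, dtype=np.uint8)
image[..., 0] = 100  # make the blue channel stand out (BGR order)

blob = blob_from_image_sketch(image, scalefactor=0.5, mean=(1, 2, 3), swap_rb=True)
print(blob.shape)  # (1, 3, 2, 3): N, C, H, W
```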

Back to top


Setup/Installation

  • Install Python and Anaconda
  • Setup Virtual Environment (In Anaconda Terminal)
    • Create: conda create --name ocv4 python=3.6
    • Activate: activate ocv4
    • Install cmake: pip install cmake
    • Install numpy: pip install numpy
    • Install OpenCV contrib module: pip install opencv-contrib-python==4.0.1.24
    • Install dlib: conda install -c conda-forge dlib or pip install dlib
  • Check if everything installed properly
    • Switch to python: python (command line should now start with >>>)
    • import numpy
    • import cv2
    • If neither import raises an error, everything installed correctly :)
  • Deep Learning Frameworks used
    • OpenCV
    • Caffe
    • Darknet
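
The import check from the steps above can also be scripted. The snippet below is a small sketch using only the standard library; the function name `check_modules` is made up for this example, and `cv2` and `dlib` are the import names of the OpenCV and dlib packages.

```python
import importlib.util


def check_modules(names):
    """Map each top-level module name to True if it can be found, else False."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_modules(["numpy", "cv2", "dlib"]).items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

Run it inside the activated `ocv4` environment so that it checks the packages installed there.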

Back to top


File description, commands, & outputs

Descriptions

  • 02_01: Displays the 'devon.jpg' image & 3 intensity/grayscale channels
  • 02_02: Displays the 'devon.jpg' image
  • 02_03: Runs the 'shore.mov' video
  • 04_01: Returns first few entries of 'synset_words.txt' file
  • 04_02: Classification & Probability in an image
  • 04_03: Classification & Probability in a video
  • 04_04: Classification for an image & video using YOLOv3 w/ confThreshold=0.5
  • 04_05: Classification for an image & video using YOLOv3 w/ confThreshold=0.4
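
For 04_01 and 04_02, the sketch below shows one way to turn 'synset_words.txt' lines and a list of class probabilities into readable results. It assumes the usual ImageNet layout of that file (a WordNet ID, a space, then comma-separated names); the helper names `parse_synsets` and `top_k` are made up here and may not match the scripts in this repo.

```python
def parse_synsets(lines):
    """Turn lines like 'n01440764 tench, Tinca tinca' into short class names."""
    labels = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Drop the WordNet ID, then keep the first of the comma-separated names.
        _, _, names = line.partition(" ")
        labels.append(names.split(",")[0])
    return labels


def top_k(probabilities, labels, k=3):
    """Return the k most probable (label, probability) pairs, best first."""
    order = sorted(range(len(probabilities)), key=lambda i: probabilities[i], reverse=True)
    return [(labels[i], float(probabilities[i])) for i in order[:k]]


sample = ["n01440764 tench, Tinca tinca", "n01443537 goldfish, Carassius auratus"]
labels = parse_synsets(sample)
print(top_k([0.2, 0.8], labels, k=1))  # [('goldfish', 0.8)]
```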

04_01

  • command: python image.py
  • output:

04_05 fruitOutput

04_02

  • command: python image.py
  • output:

04_05 fruitOutput

04_03

  • using dnn module as an inference engine for a video file
  • command: python video.py
  • output:

04_03 shoreVideo

04_04 and 04_05 - images

  • passing an image through the network using YOLOv3 (an object detection algorithm) with confidence thresholds of 0.5 (04_04) and 0.4 (04_05)
  • command: python yolo.py --image ../images/fruit.jpg
  • 04_04 output:

04_05 fruitOutput

  • 04_05 output:

04_05 fruitOutput
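
The effect of the confidence threshold can be sketched with a few made-up detections. This assumes the row layout that OpenCV's DNN module returns for YOLOv3 (`[cx, cy, w, h, objectness, class scores...]`); the function name `filter_detections` and the sample numbers are illustrative, and the actual `yolo.py` may filter differently.

```python
import numpy as np


def filter_detections(detections, conf_threshold):
    """Keep (class_id, confidence) for rows whose best class score beats the threshold."""
    kept = []
    for row in detections:
        scores = row[5:]  # class scores follow the box and objectness values
        class_id = int(np.argmax(scores))
        confidence = float(scores[class_id])
        if confidence > conf_threshold:
            kept.append((class_id, confidence))
    return kept


# Three made-up detections with two classes each.
detections = np.array([
    [0.5, 0.5, 0.2, 0.3, 0.9, 0.05, 0.85],
    [0.3, 0.6, 0.1, 0.1, 0.6, 0.45, 0.10],
    [0.7, 0.2, 0.1, 0.2, 0.4, 0.20, 0.10],
])

print(len(filter_detections(detections, 0.5)))  # 1
print(len(filter_detections(detections, 0.4)))  # 2
```

Lowering the threshold from 0.5 to 0.4 keeps the second detection, which is why 04_05 can report more objects than 04_04 on the same input.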

04_05 - video

  • passing a video through the network using YOLOv3 w/ confidence threshold = 0.4
  • command: python yolo.py --video ../images/restaurant.mov
  • output:

Back to top


Resources
