Fotobooth is a Python script running on a Raspberry Pi that controls a Canon 1000D DSLR through gphoto2. Users trigger the camera with a footswitch.
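A minimal sketch of this trigger loop is shown below. The GPIO pin, the gphoto2 command line and the output filename are assumptions for illustration, not the exact code used in the script.

```python
# Sketch: capture a photo via gphoto2 whenever the footswitch is pressed.
# Pin number and filename pattern are hypothetical.
import subprocess
import RPi.GPIO as GPIO

FOOTSWITCH_PIN = 17  # hypothetical BCM pin the footswitch is wired to

GPIO.setmode(GPIO.BCM)
GPIO.setup(FOOTSWITCH_PIN, GPIO.IN, pull_up_down=GPIO.PUD_UP)

try:
    while True:
        # Wait for the footswitch to pull the pin low, then take a photo.
        GPIO.wait_for_edge(FOOTSWITCH_PIN, GPIO.FALLING)
        subprocess.run(
            ["gphoto2", "--capture-image-and-download",
             "--filename", "capture.%Y%m%d-%H%M%S.jpg"],
            check=True,
        )
finally:
    GPIO.cleanup()
```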
Files for the background lights and nerdfacts sign can be found here.
The fotobooth was part of an educational project at the Leipzig University of Applied Sciences (HTWK Leipzig). The folder cvml contains all scripts and notebooks developed during the project. The whole project relies heavily on OpenCV for Python.
The cvml project consists of two separate parts:
- hand detection and gesture recognition (MediaPipe + TensorFlow Lite)
- face (and object) detection (OpenCV cascade classifier)
The project uses the MediaPipe framework from Google to detect hand landmarks in an image. It returns a 1x21x2 array containing the positions of the finger joints and finger tips in the image.
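The following sketch shows one way to obtain that array with the MediaPipe Python API; the input image path is a placeholder.

```python
# Sketch: extract 21 hand landmarks (x, y) from an image with MediaPipe Hands.
import cv2
import mediapipe as mp
import numpy as np

mp_hands = mp.solutions.hands

image = cv2.imread("hand.jpg")  # hypothetical input image
with mp_hands.Hands(static_image_mode=True, max_num_hands=1) as hands:
    # MediaPipe expects RGB input, OpenCV loads BGR.
    results = hands.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

if results.multi_hand_landmarks:
    hand = results.multi_hand_landmarks[0]
    # Normalized (x, y) coordinates of the 21 joints/tips, shaped 1x21x2.
    landmarks = np.array([[lm.x, lm.y] for lm in hand.landmark],
                         dtype=np.float32).reshape(1, 21, 2)
    print(landmarks.shape)  # (1, 21, 2)
```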
The landmarks detected by MediaPipe are processed by a TensorFlow Lite model to recognize the gestures "thumbs up" and "thumbs down". The model was found on this blog:
https://techvidvan.com/tutorials/hand-gesture-recognition-tensorflow-opencv/
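Inference on the landmark array can be sketched with the TensorFlow Lite interpreter as below; the model path, input shape and class ordering are assumptions and should be checked against the actual model.

```python
# Sketch: classify the MediaPipe landmark array with the converted TFLite model.
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="handgestures.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# 'landmarks' is the 1x21x2 float32 array produced by MediaPipe (see above);
# a zero array stands in here. The real model may expect a different shape,
# see input_details[0]["shape"].
landmarks = np.zeros((1, 21, 2), dtype=np.float32)
interpreter.set_tensor(input_details[0]["index"], landmarks)
interpreter.invoke()
scores = interpreter.get_tensor(output_details[0]["index"])
print("predicted class id:", int(np.argmax(scores)))
```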
and was converted from saved_model.pb to handgestures.tflite using the TensorFlow Lite converter:
https://www.tensorflow.org/lite/models/convert/convert_models#python_api
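The conversion step can be sketched with the Python API linked above; the SavedModel directory name is a placeholder.

```python
# Sketch: convert the downloaded SavedModel to a .tflite file (paths assumed).
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")
tflite_model = converter.convert()

with open("handgestures.tflite", "wb") as f:
    f.write(tflite_model)
```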
To post-process the images taken in the fotobooth, a cascade classifier is used to detect faces:
face_cas = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
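A minimal usage sketch for this classifier follows; the file names and detection parameters are illustrative assumptions.

```python
# Sketch: detect faces in a captured photo and mark them with rectangles.
import cv2

face_cas = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')

image = cv2.imread("capture.jpg")  # hypothetical photobooth image
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
faces = face_cas.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5)

for (x, y, w, h) in faces:
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite("capture_faces.jpg", image)
```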
While it gives acceptable results on normal images without accessories, the results for people wearing big glasses left room for improvement. Therefore, a custom cascade classifier was trained to detect the "Big Glasses" in images. The application used to train the classifier is:
The following image of big glasses was detected with big_glasses_classifier_5 using these parameters:
- sample resolution: 150x150
- max_scale = 1.2
- min_neigh = 5
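A hedged sketch of applying the custom classifier with these parameters is shown below; the XML filename is assumed for illustration.

```python
# Sketch: run the custom "big glasses" cascade with the parameters listed above.
import cv2

glasses_cas = cv2.CascadeClassifier("big_glasses_classifier_5.xml")  # hypothetical filename

image = cv2.imread("capture.jpg")  # hypothetical photobooth image
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
glasses = glasses_cas.detectMultiScale(gray, scaleFactor=1.2, minNeighbors=5)

for (x, y, w, h) in glasses:
    cv2.rectangle(image, (x, y), (x + w, y + h), (255, 0, 0), 2)
```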