The coin-vision project is a computer vision application that identifies and extracts one or more coins from images. It uses an object detection model for initial identification and a separate classification model for coin classification.
To reduce training complexity and follow software development best practices, the code has a modular architecture that separates concerns into distinct functions for coin detection, filtering, and result handling. Two separate models detect and classify the coins. An off-the-shelf YOLOv8 model with no additional training serves as the backbone for object detection, and filters compensate for the lack of training. MobileNetV2 then classifies the detected items. This classification model was trained with only a few dozen training samples per class, so filters are applied again to disqualify items that are classified with low confidence and mark them as class "None".
Using YOLOv8, the code first detects circular or oval shapes that correspond to coins in various image formats. Coins are extracted in 240x240 boxes for further classification.
Filters are applied to the generated boxes: first, duplicate boxes are removed, then boxes containing objects that are not oval or round are filtered out.
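The two filters above can be sketched as follows. This is a minimal illustration and not the project's actual implementation: the function names, the `(x1, y1, x2, y2)` box format, the IoU threshold of 0.8, and the use of a bounding-box aspect ratio as a stand-in for a real roundness test are all assumptions.

```python
from typing import List, Tuple

# Assumed box format: (x1, y1, x2, y2) corner coordinates in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def remove_duplicates(boxes: List[Box], iou_threshold: float = 0.8) -> List[Box]:
    """Keep a box only if it does not heavily overlap a box already kept."""
    kept: List[Box] = []
    for box in boxes:
        if all(iou(box, k) < iou_threshold for k in kept):
            kept.append(box)
    return kept


def is_roughly_round(box: Box, min_aspect: float = 0.75) -> bool:
    """Proxy for roundness: the short side must be close to the long side."""
    w, h = box[2] - box[0], box[3] - box[1]
    if w <= 0 or h <= 0:
        return False
    return min(w, h) / max(w, h) >= min_aspect


def filter_boxes(boxes: List[Box]) -> List[Box]:
    """Apply the duplicate filter first, then the shape filter."""
    return [b for b in remove_duplicates(boxes) if is_roughly_round(b)]
```

If the boxes are sorted by detection confidence before filtering, the highest-confidence box of each overlapping group is the one that survives.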
With the remaining boxes, the code runs classification using MobileNetV2. Each coin is classified as one of 8 classes of Israeli NIS coins. If a coin cannot be identified with a probability that meets the threshold defined in COIN_CLASSIFICATION_PROB_THRESHOLD, the coin is classified as class NONE.
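The thresholding step might look like the sketch below. Only the name COIN_CLASSIFICATION_PROB_THRESHOLD comes from the project; its value of 0.6, the `pick_class` helper, the placeholder class names, and the choice to reject probabilities strictly below the threshold are assumptions made for illustration.

```python
from typing import Sequence, Tuple

# Assumed value for illustration; the project defines its own setting.
COIN_CLASSIFICATION_PROB_THRESHOLD = 0.6
NONE_CLASS = "NONE"


def pick_class(
    probs: Sequence[float],
    class_names: Sequence[str],
    threshold: float = COIN_CLASSIFICATION_PROB_THRESHOLD,
) -> Tuple[str, float]:
    """Return the most probable class, or NONE when its probability is too low."""
    if len(probs) != len(class_names):
        raise ValueError("probs and class_names must have the same length")
    best = max(range(len(probs)), key=lambda i: probs[i])
    if probs[best] < threshold:
        return NONE_CLASS, probs[best]
    return class_names[best], probs[best]


# Placeholder class names, not the project's real labels.
names = ["class_a", "class_b", "class_c"]
confident = pick_class([0.10, 0.85, 0.05], names)   # accepted as class_b
uncertain = pick_class([0.40, 0.35, 0.25], names)   # falls back to NONE
```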
Finally, the code calculates the total value of the coins found in the image and outputs it.
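Summing the value can be sketched as below. The label names and the mapping are hypothetical: they cover the six coin denominations currently in circulation, whereas the project uses eight classes whose names are not listed here. Values are kept as whole agorot (1 NIS = 100 agorot) to avoid floating-point rounding.

```python
from typing import Dict, Iterable

# Hypothetical labels mapped to their value in agorot.
COIN_VALUES_AGOROT: Dict[str, int] = {
    "10_agorot": 10,
    "half_shekel": 50,
    "1_shekel": 100,
    "2_shekel": 200,
    "5_shekel": 500,
    "10_shekel": 1000,
}


def total_agorot(labels: Iterable[str]) -> int:
    """Sum the coin values; unknown labels such as NONE count as zero."""
    return sum(COIN_VALUES_AGOROT.get(label, 0) for label in labels)


def format_nis(agorot: int) -> str:
    """Render an agorot amount as shekels with two decimal places."""
    return f"{agorot // 100}.{agorot % 100:02d} NIS"


labels = ["10_shekel", "half_shekel", "NONE", "10_agorot"]
total = total_agorot(labels)
summary = format_nis(total)
```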
Below are the results of running detection on multi-coin images.
Ensure you have Python installed, then install the required packages using requirements.txt:
pip install -r requirements.txt
Confirm all dependencies are installed correctly:
python -m pip check
Run the main script using the following command-line options to perform specific actions:
python main.py --action <action>
- detect: Detect coins in the input images and save the results.
- label: Label detected coin images for further processing or training.
- train: Train and evaluate a machine learning model (e.g., MobileNetV2) using labeled coin data.
- test_gpu: Test whether your GPU is being utilized correctly for computations.
- run: Calculate the total value of coins in images.
- Detect coins in the raw images:
  python main.py --action detect
- Label the detected coins for classification:
  python main.py --action label
- Train the model using labeled coin data:
  python main.py --action train
- Calculate the total value of coins in labeled multi-coin images:
  python main.py --action run
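For readers curious how an `--action` flag like the one above can be wired up, here is a minimal sketch using the standard library's `argparse`. The five action names come from this README; the handler functions, the `ACTIONS` table, and the `main` signature are placeholders and do not reflect the project's real main.py.

```python
import argparse
from typing import Callable, Dict, List, Optional


# Placeholder handlers; the real project would call its own pipeline code.
def detect() -> str:
    return "detect"


def label() -> str:
    return "label"


def train() -> str:
    return "train"


def test_gpu() -> str:
    return "test_gpu"


def run() -> str:
    return "run"


ACTIONS: Dict[str, Callable[[], str]] = {
    "detect": detect,
    "label": label,
    "train": train,
    "test_gpu": test_gpu,
    "run": run,
}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="coin-vision command-line entry point")
    # choices makes argparse reject unknown actions with a usage message.
    parser.add_argument("--action", required=True, choices=sorted(ACTIONS),
                        help="which pipeline step to execute")
    return parser


def main(argv: Optional[List[str]] = None) -> str:
    """Parse the arguments (sys.argv when argv is None) and dispatch."""
    args = build_parser().parse_args(argv)
    return ACTIONS[args.action]()
```

Calling `main(["--action", "detect"])` dispatches to the `detect` placeholder, while an unknown action makes `argparse` exit with a usage error.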
- Ensure the configuration paths (config.RAW_IMAGES_FOLDER, config.DETECTION_RESULTS_FOLDER, etc.) are set up correctly in your project configuration file.
- GPU support is optional but highly recommended for faster detection and processing performance.
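A quick way to check the configured paths before running an action is sketched below. The names RAW_IMAGES_FOLDER and DETECTION_RESULTS_FOLDER come from the note above, but the folder locations assigned to them and the `missing_folders` helper are assumptions for illustration; consult the project's own configuration file for the real values.

```python
from pathlib import Path
from typing import Dict, List

# Assumed locations, loosely based on the data/ layout of the repository.
DATA_ROOT = Path("data")
RAW_IMAGES_FOLDER = DATA_ROOT / "raw"
DETECTION_RESULTS_FOLDER = DATA_ROOT / "interim"


def missing_folders(folders: Dict[str, Path]) -> List[str]:
    """Return the names of configured folders that do not exist on disk."""
    return [name for name, path in folders.items() if not path.is_dir()]


configured = {
    "RAW_IMAGES_FOLDER": RAW_IMAGES_FOLDER,
    "DETECTION_RESULTS_FOLDER": DETECTION_RESULTS_FOLDER,
}
problems = missing_folders(configured)
if problems:
    print("Missing folders:", ", ".join(problems))
```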
├── LICENSE            <- Open-source license if one is chosen
├── Makefile           <- Makefile with convenience commands like `make data` or `make train`
├── README.md          <- The top-level README for developers using this project.
├── data
│   ├── external       <- Data from third party sources.
│   ├── interim        <- Output of internal training and work
│   ├── processed      <- The final images to be used for training and testing
│   └── raw            <- The original uploaded images, raw
│
├── docs               <- A default mkdocs project; see www.mkdocs.org for details
│
├── models             <- Trained and serialized models, model predictions, or model summaries
│
├── notebooks          <- Jupyter notebooks. Naming convention is a number (for ordering),
│                         the creator's initials, and a short `-` delimited description, e.g.
│                         `1.0-jqp-initial-data-exploration`.
│
├── pyproject.toml     <- Project configuration file with package metadata for
│                         coin_vision and configuration for tools like black
│
├── references         <- Data dictionaries, manuals, and all other explanatory materials.
│
├── reports            <- Generated analysis as HTML, PDF, LaTeX, etc.
│   └── figures        <- Generated graphics and figures to be used in reporting
│
├── requirements.txt   <- The requirements file for reproducing the analysis environment, e.g.
│                         generated with `pip freeze > requirements.txt`
│
├── setup.cfg          <- Configuration file for flake8
│
└── coin_vision        <- Source code for use in this project.
    │
    ├── __init__.py    <- Makes coin_vision a Python module
    │
    ├── config.py      <- Store useful variables and configuration
    │
    ├── dataset.py     <- Scripts to download or generate data
    │
    ├── features.py    <- Code to create features for modeling
    │
    ├── modeling
    │   ├── __init__.py
    │   ├── predict.py <- Code to run model inference with trained models
    │   └── train.py   <- Code to train models
    │
    └── plots.py       <- Code to create visualizations