YOLO8 support #50

Merged
merged 28 commits into from
Jul 30, 2024

@Eldies Eldies commented Jul 25, 2024

Summary

Adds support for YOLOv8 formats: detection, segmentation, pose, and oriented bounding boxes.

How to test

Checklist

License

  • I submit my code changes under the same MIT License that covers the project.
    Feel free to contact the maintainers if that's a concern.
  • I have updated the license header for each file (see an example below)
# Copyright (C) 2022 CVAT.ai Corporation
#
# SPDX-License-Identifier: MIT

Summary by CodeRabbit

  • New Features

    • Introduced classes for YOLOv8 format converters and importers, enhancing support for various annotation types including segmentation and pose estimation.
    • Added YAML configuration files for YOLO datasets, streamlining dataset setup and organization.
    • Created new test utility functions and classes to improve testing processes.
  • Bug Fixes

    • Improved error handling in annotation processing to provide clearer messages when encountering issues.
  • Documentation

    • Added configuration files to specify dataset paths and class mappings for training purposes, aiding in model integration.
  • Tests

    • Comprehensive suite of unit tests for validating YOLO format functionality, ensuring robustness of dataset conversions and imports.

coderabbitai bot commented Jul 25, 2024

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

The recent updates significantly enhance the YOLO format support within the Datumaro framework. New converter and extractor classes have been introduced, allowing for improved handling of complex annotation types, including segmentation and pose estimation. The updates also streamline the dataset management process through YAML configuration files, enhancing flexibility and clarity. Overall, these changes bolster the framework's capabilities, making it easier to work with various YOLO formats and ensuring robust data handling for machine learning tasks.

Changes

| Files | Change Summary |
|---|---|
| datumaro/plugins/yolo_format/converter.py, datumaro/plugins/yolo_format/extractor.py, datumaro/plugins/yolo_format/importer.py, datumaro/plugins/yolo_format/format.py | Introduced new classes for YOLOv8 converters, extractors, and importers, enhancing support for multiple annotation types and YAML configurations. Updated methods for better modularity and error handling. |
| datumaro/util/meta_file_util.py | Enhanced the save_meta_file function to manage additional metadata for label and point categories. |
| tests/unit/data_formats/test_yolo_format.py | Added comprehensive unit tests for YOLO converters and importers, ensuring functionality and error handling across different YOLO formats. |
| tests/assets/yolo_dataset/, tests/conftest.py, tests/utils/assets.py, tests/utils/test_utils.py | Introduced various YAML configuration files for YOLO datasets and added utility functions and test helpers for improved testing framework integration. |

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Yolo8Importer
    participant Yolo8Extractor
    participant Dataset

    User->>Yolo8Importer: Import YOLOv8 dataset
    Yolo8Importer->>Yolo8Extractor: Detect and validate dataset files
    Yolo8Extractor->>Dataset: Load annotations and media
    Dataset->>User: Provide imported dataset

🐇 In the meadow, hop and play,
New features sprout, bright as day!
With YAML paths and formats fair,
YOLO's gifts are everywhere!
Let's dance and cheer, oh what delight,
For datasets now shine so bright! ✨


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 2a4d9db and b8b74c1.

Files ignored due to path filters (9)
  • tests/assets/yolo_dataset/yolo/obj_train_data/1.jpg is excluded by !**/*.jpg
  • tests/assets/yolo_dataset/yolo8/images/train/1.jpg is excluded by !**/*.jpg
  • tests/assets/yolo_dataset/yolo8_obb/images/train/1.jpg is excluded by !**/*.jpg
  • tests/assets/yolo_dataset/yolo8_pose/images/train/1.jpg is excluded by !**/*.jpg
  • tests/assets/yolo_dataset/yolo8_pose_two_values_per_point/images/train/1.jpg is excluded by !**/*.jpg
  • tests/assets/yolo_dataset/yolo8_segmentation/images/train/1.jpg is excluded by !**/*.jpg
  • tests/assets/yolo_dataset/yolo8_with_list_of_imgs/images/train/1.jpg is excluded by !**/*.jpg
  • tests/assets/yolo_dataset/yolo8_with_list_of_names/images/train/1.jpg is excluded by !**/*.jpg
  • tests/assets/yolo_dataset/yolo8_with_subset_txt/images/train/1.jpg is excluded by !**/*.jpg
Files selected for processing (27)
  • datumaro/plugins/yolo_format/converter.py (6 hunks)
  • datumaro/plugins/yolo_format/extractor.py (9 hunks)
  • datumaro/plugins/yolo_format/format.py (1 hunks)
  • datumaro/plugins/yolo_format/importer.py (1 hunks)
  • datumaro/util/meta_file_util.py (1 hunks)
  • tests/assets/yolo_dataset/yolo8/data.yaml (1 hunks)
  • tests/assets/yolo_dataset/yolo8/labels/train/1.txt (1 hunks)
  • tests/assets/yolo_dataset/yolo8_obb/data.yaml (1 hunks)
  • tests/assets/yolo_dataset/yolo8_obb/labels/train/1.txt (1 hunks)
  • tests/assets/yolo_dataset/yolo8_pose/data.yaml (1 hunks)
  • tests/assets/yolo_dataset/yolo8_pose/labels/train/1.txt (1 hunks)
  • tests/assets/yolo_dataset/yolo8_pose_two_values_per_point/data.yaml (1 hunks)
  • tests/assets/yolo_dataset/yolo8_pose_two_values_per_point/labels/train/1.txt (1 hunks)
  • tests/assets/yolo_dataset/yolo8_segmentation/data.yaml (1 hunks)
  • tests/assets/yolo_dataset/yolo8_segmentation/labels/train/1.txt (1 hunks)
  • tests/assets/yolo_dataset/yolo8_with_list_of_imgs/data.yaml (1 hunks)
  • tests/assets/yolo_dataset/yolo8_with_list_of_imgs/labels/train/1.txt (1 hunks)
  • tests/assets/yolo_dataset/yolo8_with_list_of_names/data.yaml (1 hunks)
  • tests/assets/yolo_dataset/yolo8_with_list_of_names/labels/train/1.txt (1 hunks)
  • tests/assets/yolo_dataset/yolo8_with_subset_txt/data.yaml (1 hunks)
  • tests/assets/yolo_dataset/yolo8_with_subset_txt/labels/train/1.txt (1 hunks)
  • tests/assets/yolo_dataset/yolo8_with_subset_txt/train.txt (1 hunks)
  • tests/conftest.py (2 hunks)
  • tests/unit/data_formats/__init__.py (1 hunks)
  • tests/unit/data_formats/test_yolo_format.py (1 hunks)
  • tests/utils/assets.py (1 hunks)
  • tests/utils/test_utils.py (1 hunks)
Files skipped from review due to trivial changes (16)
  • tests/assets/yolo_dataset/yolo8/data.yaml
  • tests/assets/yolo_dataset/yolo8/labels/train/1.txt
  • tests/assets/yolo_dataset/yolo8_obb/data.yaml
  • tests/assets/yolo_dataset/yolo8_obb/labels/train/1.txt
  • tests/assets/yolo_dataset/yolo8_pose/data.yaml
  • tests/assets/yolo_dataset/yolo8_pose/labels/train/1.txt
  • tests/assets/yolo_dataset/yolo8_pose_two_values_per_point/data.yaml
  • tests/assets/yolo_dataset/yolo8_pose_two_values_per_point/labels/train/1.txt
  • tests/assets/yolo_dataset/yolo8_segmentation/data.yaml
  • tests/assets/yolo_dataset/yolo8_with_list_of_imgs/data.yaml
  • tests/assets/yolo_dataset/yolo8_with_list_of_imgs/labels/train/1.txt
  • tests/assets/yolo_dataset/yolo8_with_list_of_names/data.yaml
  • tests/assets/yolo_dataset/yolo8_with_subset_txt/data.yaml
  • tests/assets/yolo_dataset/yolo8_with_subset_txt/train.txt
  • tests/unit/data_formats/__init__.py
  • tests/utils/assets.py
Additional comments not posted (62)
tests/assets/yolo_dataset/yolo8_with_list_of_names/labels/train/1.txt (2)

1-1: Class ID format is correct.

The class ID is correctly formatted as an integer.


2-2: Bounding box coordinates format is correct.

The bounding box coordinates are correctly formatted and normalized.

tests/assets/yolo_dataset/yolo8_with_subset_txt/labels/train/1.txt (2)

1-1: Class ID format is correct.

The class ID is correctly formatted as an integer.


2-2: Bounding box coordinates format is correct.

The bounding box coordinates are correctly formatted and normalized.

tests/assets/yolo_dataset/yolo8_segmentation/labels/train/1.txt (2)

1-1: Class ID format is correct.

The class ID is correctly formatted as an integer.


2-2: Polygon coordinates format is correct.

The polygon coordinates are correctly formatted and normalized.

datumaro/plugins/yolo_format/format.py (3)

10-11: LGTM! The added attributes enhance file management.

The addition of LABELS_EXT and SUBSET_LIST_EXT attributes standardizes the file extensions for labels and subset lists.


14-21: LGTM! The extended configurability and organization are beneficial.

The added configuration keys (path, kpt_shape, flip_idx) and folder names (IMAGES_FOLDER_NAME, LABELS_FOLDER_NAME) enhance the configurability and organization of YOLOv8-specific data.


24-25: LGTM! The attribute enhances pose estimation capabilities.

The addition of KPT_SHAPE_FIELD_NAME for keypoint shape representation is beneficial for pose estimation tasks.

tests/utils/test_utils.py (6)

9-12: LGTM! The class helps transition from unittest to pytest.

The TestCaseHelper class is a practical approach to minimize migration work from unittest to pytest.


14-15: LGTM! The method mimics assertTrue from unittest.TestCase.

The assertTrue method correctly replicates the behavior of unittest.TestCase.assertTrue.


17-18: LGTM! The method mimics assertFalse from unittest.TestCase.

The assertFalse method correctly replicates the behavior of unittest.TestCase.assertFalse.


20-21: LGTM! The method mimics assertEqual from unittest.TestCase.

The assertEqual method correctly replicates the behavior of unittest.TestCase.assertEqual.


23-27: LGTM! The method mimics assertListEqual from unittest.TestCase.

The assertListEqual method correctly replicates the behavior of unittest.TestCase.assertListEqual.


29-30: LGTM! The method mimics fail from unittest.TestCase.

The fail method correctly replicates the behavior of unittest.TestCase.fail.
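
To make the comments above concrete, here is a minimal sketch of what a unittest-style helper built on plain `assert` statements might look like. It is an illustration only: the method bodies are assumptions, and the actual `TestCaseHelper` in `tests/utils/test_utils.py` may differ.

```python
class TestCaseHelper:
    """Hypothetical sketch: unittest-like assertions usable from pytest tests."""

    def assertTrue(self, expr, msg=None):
        # Fails when the expression is falsy, like unittest's assertTrue.
        assert expr, msg

    def assertFalse(self, expr, msg=None):
        # Fails when the expression is truthy.
        assert not expr, msg

    def assertEqual(self, first, second, msg=None):
        # Plain equality check; pytest rewrites the assert for a readable diff.
        assert first == second, msg

    def assertListEqual(self, first, second, msg=None):
        # Both arguments must be lists and must compare equal.
        assert isinstance(first, list), msg
        assert isinstance(second, list), msg
        assert first == second, msg

    def fail(self, msg=None):
        # Unconditional failure.
        raise AssertionError(msg)
```

With such a helper, tests written against the `unittest.TestCase` API can keep calling `helper.assertEqual(...)` while running under pytest.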

tests/conftest.py (2)

26-29: LGTM! The fixture enhances test setup.

The test_dir fixture uses the TestDir context manager to create a temporary directory for tests, improving test setup and teardown.


32-34: LGTM! The fixture improves test management.

The helper_tc fixture provides access to TestCaseHelper for all tests within a class, improving test management and execution.

datumaro/util/meta_file_util.py (2)

53-55: Ensure consistency in label categories.

The addition of label_categories is correct. However, ensure that all labels have a corresponding parent category. If not, consider handling cases where label.parent might be None.


56-60: Handle potential missing attributes in point categories.

The handling of point_categories is correct. However, ensure that all cat.labels and cat.joints are present and valid. Consider adding checks or handling cases where these attributes might be missing or malformed.

datumaro/plugins/yolo_format/importer.py (8)

24-25: Ensure the presence of obj.data file.

The detect method correctly requires the presence of the obj.data file. Ensure that this file is always present in the expected format.


28-29: Recursive search for sources.

The find_sources method correctly performs a recursive search for sources with the .data extension. Ensure that this extension is consistently used for YOLO datasets.


36-47: Ensure correct YAML parsing and field validation.

The detect method correctly requires the presence of the data.yaml file and checks for the absence of the KPT_SHAPE_FIELD_NAME field. Ensure that the YAML file is always in the expected format and handle potential parsing errors gracefully.


49-51: Recursive search for sources.

The find_sources method correctly performs a recursive search for sources with the .yaml extension. Ensure that this extension is consistently used for YOLOv8 datasets.


54-55: Correctly set the extractor.

The EXTRACTOR is correctly set to Yolo8SegmentationExtractor. Ensure that this extractor is implemented correctly and handles segmentation data as expected.


58-59: Correctly set the extractor.

The EXTRACTOR is correctly set to Yolo8ObbExtractor. Ensure that this extractor is implemented correctly and handles oriented bounding box data as expected.


62-63: Correctly set the extractor.

The EXTRACTOR is correctly set to Yolo8PoseExtractor. Ensure that this extractor is implemented correctly and handles pose estimation data as expected.


66-77: Ensure correct YAML parsing and field validation for pose estimation.

The overridden detect method correctly requires the presence of the data.yaml file and checks for the presence of the KPT_SHAPE_FIELD_NAME field. Ensure that the YAML file is always in the expected format and handle potential parsing errors gracefully.
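
The importer comments above say that the pose format is recognized by the presence of the `kpt_shape` field in `data.yaml`. As a rough sketch, assuming the config has already been parsed into a dict and that pose label lines follow the usual YOLOv8 layout (class, box, then two or three values per keypoint), detection and parsing could look like this. The function names are hypothetical and not taken from the PR.

```python
def is_pose_config(config: dict) -> bool:
    # A dataset is treated as a pose dataset when kpt_shape is present.
    return "kpt_shape" in config


def parse_pose_line(line: str, kpt_shape):
    """Parse 'cls cx cy w h x1 y1 [v1] x2 y2 [v2] ...' into Python values."""
    n_points, values_per_point = kpt_shape
    if values_per_point not in (2, 3):
        raise ValueError("kpt_shape must have 2 or 3 values per point")

    parts = line.split()
    expected = 1 + 4 + n_points * values_per_point
    if len(parts) != expected:
        raise ValueError(f"expected {expected} fields, got {len(parts)}")

    label_id = int(parts[0])
    bbox = tuple(float(v) for v in parts[1:5])  # cx, cy, w, h (normalized)
    raw = [float(v) for v in parts[5:]]

    keypoints = []
    for i in range(n_points):
        chunk = raw[i * values_per_point:(i + 1) * values_per_point]
        # Visibility is only present when there are three values per point.
        visibility = int(chunk[2]) if values_per_point == 3 else None
        keypoints.append((chunk[0], chunk[1], visibility))
    return label_id, bbox, keypoints
```

The two-values-per-point branch corresponds to the `yolo8_pose_two_values_per_point` test asset listed above.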

datumaro/plugins/yolo_format/converter.py (15)

113-115: Ensure subset naming consistency.

The apply method correctly handles subset naming. Ensure that subset names are consistent and do not clash with reserved keywords.


119-122: Create subset folders.

The apply method correctly creates subset folders for images and annotations. Ensure that these directories are always created successfully and handle potential errors gracefully.


127-134: Export media and annotations.

The apply method correctly exports media and annotations. Ensure that the exported files are in the expected format and handle potential errors gracefully.


143-149: Save configuration files.

The _save_config_files method correctly saves configuration files. Ensure that the configuration files are always saved successfully and handle potential errors gracefully.


163-173: Create subset list file.

The _make_subset_list_file method correctly creates a subset list file. Ensure that the file is always created successfully and handle potential errors gracefully.


175-192: Export media.

The _export_media method correctly exports media. Ensure that the exported media files are in the expected format and handle potential errors gracefully.


196-212: Export item annotations.

The _export_item_annotation method correctly exports item annotations. Ensure that the exported annotations are in the expected format and handle potential errors gracefully.


216-225: Create annotation line.

The _make_annotation_line method correctly creates annotation lines for bounding boxes. Ensure that the annotation lines are always created successfully and handle potential errors gracefully.


267-278: Save configuration files for YOLOv8.

The _save_config_files method correctly saves configuration files for YOLOv8. Ensure that the configuration files are always saved successfully and handle potential errors gracefully.


279-281: Create image subset folder for YOLOv8.

The _make_image_subset_folder method correctly creates image subset folders for YOLOv8. Ensure that these directories are always created successfully and handle potential errors gracefully.


283-285: Create annotation subset folder for YOLOv8.

The _make_annotation_subset_folder method correctly creates annotation subset folders for YOLOv8. Ensure that these directories are always created successfully and handle potential errors gracefully.


290-302: Create annotation line for segmentation.

The _make_annotation_line method correctly creates annotation lines for segmentation annotations. Ensure that the annotation lines are always created successfully and handle potential errors gracefully.


307-313: Create annotation line for oriented bounding boxes.

The _make_annotation_line method correctly creates annotation lines for oriented bounding box annotations. Ensure that the annotation lines are always created successfully and handle potential errors gracefully.


317-353: Save configuration files for pose estimation.

The _save_config_files method correctly saves configuration files for pose estimation. Ensure that the configuration files are always saved successfully and handle potential errors gracefully.


354-380: Create annotation line for pose estimation.

The _make_annotation_line method correctly creates annotation lines for pose estimation annotations. Ensure that the annotation lines are always created successfully and handle potential errors gracefully.
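
The converter comments above describe `_make_annotation_line` only in general terms. As a rough sketch, assuming pixel-space boxes given as top-left x, y, width, height and polygons given as lists of (x, y) pixel pairs, normalized annotation lines could be built as follows. The function names and the six-decimal formatting are illustrative assumptions, not Datumaro's actual implementation.

```python
def make_bbox_line(label_id, x, y, w, h, img_w, img_h):
    # Detection format: class, box center and box size, all normalized to [0, 1].
    cx = (x + w / 2) / img_w
    cy = (y + h / 2) / img_h
    return f"{label_id} {cx:.6f} {cy:.6f} {w / img_w:.6f} {h / img_h:.6f}"


def make_polygon_line(label_id, points, img_w, img_h):
    # Segmentation format: class followed by normalized x y pairs.
    coords = []
    for px, py in points:
        coords.append(f"{px / img_w:.6f}")
        coords.append(f"{py / img_h:.6f}")
    return " ".join([str(label_id), *coords])
```

An oriented bounding box line has the same shape as a polygon line with exactly four points.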

datumaro/plugins/yolo_format/extractor.py (10)

8-18: LGTM! New imports are necessary.

The new imports for math, yaml, and cached_property are necessary for the new functionality and refactoring.


Line range hint 52-117:
LGTM! Refactored configuration management improves flexibility.

The _config property now uses a YAML configuration file, enhancing flexibility in configuration parsing and error handling.


122-129: LGTM! New method _iterate_over_image_paths improves modularity.

The method helps in iterating over image paths, improving code modularity and readability.


246-252: LGTM! Refactored _parse_annotations method enhances error reporting.

The refactoring modularizes annotation loading logic and improves error reporting.


253-279: LGTM! New method _load_one_annotation improves clarity.

The method modularizes the logic for loading individual annotations, improving code clarity and maintainability.


Line range hint 280-302:
LGTM! Refactored _load_categories method enhances category management.

The refactoring accommodates different formats of category definitions, enhancing the ability to parse and manage category data.


318-389: LGTM! New class Yolo8Extractor enhances framework capabilities.

The class introduces support for YOLOv8 formats, enhancing the framework's capabilities.


391-419: LGTM! New class Yolo8SegmentationExtractor enhances segmentation support.

The class introduces support for segmentation annotations, enhancing the framework's capabilities.


421-459: LGTM! New class Yolo8ObbExtractor enhances oriented bounding box support.

The class introduces support for oriented bounding boxes, enhancing the framework's capabilities.


461-601: LGTM! New class Yolo8PoseExtractor enhances pose estimation support.

The class introduces support for pose estimation annotations, enhancing the framework's capabilities.
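
The test assets listed earlier pair `images/train/1.jpg` with `labels/train/1.txt`. A minimal sketch of that mapping, assuming the annotation file is found by swapping the last `images` path component for `labels` and the extension for `.txt`, is shown below. `label_path_for` is a hypothetical helper, and the extractor in this PR may locate labels differently.

```python
from pathlib import PurePosixPath


def label_path_for(image_path: str) -> str:
    parts = list(PurePosixPath(image_path).parts)
    # Replace the last "images" component so nested folders keep working.
    for i in range(len(parts) - 1, -1, -1):
        if parts[i] == "images":
            parts[i] = "labels"
            break
    else:
        raise ValueError(f"no 'images' component in {image_path!r}")
    return str(PurePosixPath(*parts).with_suffix(".txt"))
```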

tests/unit/data_formats/test_yolo_format.py (10)

1-61: LGTM! New imports are necessary.

The new imports for yaml, Yolo8Converter, Yolo8ObbConverter, Yolo8PoseConverter, Yolo8SegmentationConverter, Yolo8Extractor, Yolo8ObbExtractor, Yolo8PoseExtractor, Yolo8SegmentationExtractor, Yolo8Importer, Yolo8ObbImporter, Yolo8PoseImporter, Yolo8SegmentationImporter are necessary for the new functionality and tests.


359-433: LGTM! New class Yolo8ConverterTest enhances test coverage.

The class introduces tests for the YOLOv8 converter, enhancing test coverage.


435-480: LGTM! New class Yolo8SegmentationConverterTest enhances test coverage.

The class introduces tests for the YOLOv8 segmentation converter, enhancing test coverage.


524-550: LGTM! New class Yolo8ObbConverterTest enhances test coverage.

The class introduces tests for the YOLOv8 oriented bounding box converter, enhancing test coverage.


552-719: LGTM! New class Yolo8PoseConverterTest enhances test coverage.

The class introduces tests for the YOLOv8 pose converter, enhancing test coverage.


803-821: LGTM! New class Yolo8ImporterTest enhances test coverage.

The class introduces tests for the YOLOv8 importer, enhancing test coverage.


823-845: LGTM! New class Yolo8SegmentationImporterTest enhances test coverage.

The class introduces tests for the YOLOv8 segmentation importer, enhancing test coverage.


847-867: LGTM! New class Yolo8ObbImporterTest enhances test coverage.

The class introduces tests for the YOLOv8 oriented bounding box importer, enhancing test coverage.


869-926: LGTM! New class Yolo8PoseImporterTest enhances test coverage.

The class introduces tests for the YOLOv8 pose importer, enhancing test coverage.


1046-1066: LGTM! New class Yolo8ExtractorTest enhances test coverage.

The class introduces tests for the YOLOv8 extractor, enhancing test coverage.

tests/conftest.py (outdated, resolved)
@zhiltsov-max zhiltsov-max mentioned this pull request Jul 26, 2024
7 tasks
│ │ ├── image2.jpg
│ │ ├── image3.jpg
│ │ └── ...
│ ├── valid/ # directory with images for validation subset

val?

Author

No, data.yaml specifies where the images of each subset are located, and the folder does not have to have the same name as the subset. I tried to explain this more clearly in the docs.

The standard name for the validation subset is "val". Nobody calls it "valid" as far as I know. @zhiltsov-max, do you have an opinion on this?
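
The author's point in this thread is that `data.yaml` decides where a subset's images live, independently of the subset's name. A small sketch of that idea follows, with the config written as a Python dict instead of YAML. The folder name `images/validation_pics`, the class names, and the helper `subset_image_dir` are all hypothetical.

```python
from pathlib import PurePosixPath

# Stand-in for a parsed data.yaml: the subset key and its folder name differ.
config = {
    "path": ".",
    "train": "images/train",
    "val": "images/validation_pics",
    "names": {0: "cat", 1: "dog"},
}


def subset_image_dir(config: dict, subset: str, dataset_root: str) -> str:
    # "path" is resolved relative to the dataset root, the subset entry
    # relative to "path".
    root = PurePosixPath(dataset_root) / config.get("path", ".")
    return str(root / config[subset])
```

Here the `val` subset resolves to a folder that is not called `val`, which is the situation the author describes.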

│ │ ├── image2.txt
│ │ ├── image3.txt
│ │ └── ...
│ ├── valid/ # directory with annotations for validation subset

val?

```

Files in directories `labels/train/` and `labels/valid/` should
contain information about labels for images in `images/train` and `images/valid` respectively.

images/val?

...
```

For **Oriented Bounding Box** and **Segmentation** it contains coordinates

I would separate OBB and Segmentation because a polygon can have more than 4 points. Does the original documentation also join them?

Author

separated them
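
The reviewer's distinction can be illustrated with a short parsing sketch: an oriented bounding box line carries exactly four points, while a segmentation line carries three or more. The code assumes the usual YOLOv8 text layout (class id followed by normalized x y pairs); the function names are hypothetical and not taken from the PR.

```python
def parse_polygon_line(line: str, min_points: int = 3):
    """Parse 'cls x1 y1 x2 y2 ...' into (label_id, [(x, y), ...])."""
    parts = line.split()
    label_id = int(parts[0])
    coords = [float(v) for v in parts[1:]]
    if len(coords) % 2 != 0:
        raise ValueError("odd number of coordinates")
    points = list(zip(coords[0::2], coords[1::2]))
    if len(points) < min_points:
        raise ValueError(f"expected at least {min_points} points")
    return label_id, points


def parse_obb_line(line: str):
    # An oriented box is a polygon restricted to exactly four corners.
    label_id, points = parse_polygon_line(line, min_points=4)
    if len(points) != 4:
        raise ValueError("an oriented bounding box needs exactly 4 points")
    return label_id, points
```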

@@ -0,0 +1,30 @@
# Copyright (C) 2019-2024 Intel Corporation

incorrect header.

@zhiltsov-max, what is the copyright policy inside Datumaro?

Author

I copy-pasted the code, along with the header, from upstream: https://github.com/openvinotoolkit/datumaro/blob/develop/tests/utils/test_utils.py#L467
I am not sure it would be right to change the header.

@@ -0,0 +1,1223 @@
import copy

Let's add copyright into every file.

Author

added copyrights

@@ -7,3 +7,21 @@ class YoloPath:
    DEFAULT_SUBSET_NAME = "train"
    SUBSET_NAMES = ["train", "valid"]

I also see "valid" here. It looks strange. Is it a typo?

SUBSET_LIST_EXT = ".txt"


class Yolo8Path(YoloPath):

Official name is YOLOv8 (Yolo version 8).

It may not look important, but I would like users to land here when they are looking for YOLOv8 support. It is critical for search engine optimization.

Let's use v8 instead of 8 everywhere.

Author

Would it be enough to have YOLOv8 in the docs, or would it be better to also rename all the formats, like yolo8_segmentation -> yolo_v8_segmentation?

I would rename it to yolov8_segmentation.
I would also recommend refactoring the code to be consistent.

Author

renamed YOLO8 to YOLOv8 everywhere

@@ -0,0 +1,324 @@
---
title: 'YOLO8'

YOLOv8

Author

fixed

class Yolo8OrientedBoxesExtractor(Yolo8Extractor):
    @staticmethod
    def _check_is_rectangle(p1, p2, p3, p4):
        p12_angle = math.atan2(p2[0] - p1[0], p2[1] - p1[1])

It is probably possible to do that without the math library.

  1. My way is to check that opposite sides are equal and then check a^2 + b^2 == c^2.
  2. On the internet I found another way: find the center of mass and calculate the distance from it to each point. The distances should all be equal.

Because the check will run many times, I would recommend choosing the fastest implementation. Can you compare them?
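
For reference, option 1 from the list above can be sketched without `math` by working with squared lengths, which avoids square roots. This is an illustration under assumptions, not code from the PR: the name `is_rectangle_sides` and the tolerance are hypothetical, and the extra comparison of the two diagonals is added because equal opposite sides plus one right angle can also be satisfied by a self-crossing quadrilateral.

```python
def is_rectangle_sides(p1, p2, p3, p4, tol=1e-6):
    def dist2(a, b):
        # Squared Euclidean distance; no sqrt needed for comparisons.
        return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

    a2, b2 = dist2(p1, p2), dist2(p2, p3)
    c2, d2 = dist2(p3, p4), dist2(p4, p1)
    diag13, diag24 = dist2(p1, p3), dist2(p2, p4)

    # Opposite sides must have equal length.
    if abs(a2 - c2) > tol or abs(b2 - d2) > tol:
        return False
    # Pythagoras on triangle p1-p2-p3: right angle at p2.
    if abs(a2 + b2 - diag13) > tol:
        return False
    # Equal diagonals rule out the self-crossing case.
    return abs(diag13 - diag24) <= tol
```

The tolerance is applied to squared quantities, so it would need tuning for the coordinate scale in use.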

Author

Compared my implementation

def _check_is_rectangle_atan2(p1, p2, p3, p4):
    p12_angle = math.atan2(p2[0] - p1[0], p2[1] - p1[1])
    p23_angle = math.atan2(p3[0] - p2[0], p3[1] - p2[1])
    p43_angle = math.atan2(p3[0] - p4[0], p3[1] - p4[1])
    p14_angle = math.atan2(p4[0] - p1[0], p4[1] - p1[1])

    if abs(p12_angle - p43_angle) > 0.001 or abs(p23_angle - p14_angle) > 0.001:
        return False
    return True

with your second option. I have not tried option 1 because it needs more power and square-root operations, so it should be slower than option 2.

def _check_is_rectangle_2(p1, p2, p3, p4):
    center_x = (p1[0] + p2[0] + p3[0] + p4[0]) / 4
    center_y = (p1[1] + p2[1] + p3[1] + p4[1]) / 4

    distances_squared = [
        (x - center_x) ** 2 + (y - center_y) ** 2
        for x, y in (p1, p2, p3, p4)
    ]

    if math.sqrt(max(distances_squared)) - math.sqrt(min(distances_squared)) > 0.1:
        return False
    return True

Results:

%%timeit
_check_is_rectangle_atan2(f_p1, f_p2, f_p3, f_p4)

475 ns ± 6.64 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

%%timeit
_check_is_rectangle_2(f_p1, f_p2, f_p3, f_p4)

1.52 μs ± 9.1 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

I tried to do it without a list; it did not help much.
I have no idea why the atan2 implementation is faster.

Author

The best I managed is that:

def _check_is_rectangle_vector(p1, p2, p3, p4):
    # check vectors p1->p2 and p3->p4 sum to 0
    if abs(p2[0] - p1[0] + p4[0] - p3[0]) > 0.001 or abs(p2[1] - p1[1] + p4[1] - p3[1]) > 0.001:
        return False
    # check vectors p2->p3 and p4->p1 sum to 0
    if abs(p3[0] - p2[0] + p1[0] - p4[0]) > 0.001 or abs(p3[1] - p2[1] + p1[1] - p4[1]) > 0.001:
        return False
    return True

338 ns ± 5.1 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

If the points do not form a rectangle, it runs even faster. But this code is also harder to understand, so I am not sure it would be justified to use it instead of the atan2 version.
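
One observation on the vector version: both of its conditions reduce to p1 + p3 = p2 + p4, so it confirms that the four points form a parallelogram. If a strict rectangle test were wanted, a dot-product check on two adjacent sides could be added. The sketch below is an assumption about what such a check might look like, not part of the PR; `is_rectangle_strict` and its tolerance are hypothetical.

```python
def is_rectangle_strict(p1, p2, p3, p4, tol=1e-3):
    # Parallelogram test: the diagonals share a midpoint, i.e. p1 + p3 == p2 + p4.
    if (abs(p1[0] + p3[0] - p2[0] - p4[0]) > tol
            or abs(p1[1] + p3[1] - p2[1] - p4[1]) > tol):
        return False
    # Right-angle test: sides p1->p2 and p2->p3 must be perpendicular.
    dot = ((p2[0] - p1[0]) * (p3[0] - p2[0])
           + (p2[1] - p1[1]) * (p3[1] - p2[1]))
    return abs(dot) <= tol
```

A sheared parallelogram such as (0, 0), (4, 0), (5, 2), (1, 2) passes the first test but fails the second.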

        config_file=None,
        **kwargs,
    ) -> None:
        super().__init__(*args, **kwargs)

@nmanovic nmanovic Jul 29, 2024

@Eldies, do we have a test for a custom config name for the extractor?

Author


```bash
datum create
datum import --format yolo8 <path/to/dataset> # for Detection dataset

Suggested change
- datum import --format yolo8 <path/to/dataset> # for Detection dataset
+ datum import --format yolov8 <path/to/dataset> # for Detection dataset

See yolov8.org (this is how Ultralytics now spells the name).

sonarcloud bot commented Jul 30, 2024

@nmanovic nmanovic merged commit c7bc106 into develop Jul 30, 2024
19 checks passed
@nmanovic nmanovic deleted the dl/yolo8 branch July 30, 2024 07:56