Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Match image by name (instead of id) on CVAT upload #1807

Merged
merged 2 commits into from
Jul 18, 2020

Conversation

artemisart
Copy link
Contributor

Motivation and context

Currently, the cvat annotation import (upload) matches annotations to frames by id, so this works only on very specific cases like backups for the same task. When moving annotations between tasks, if the order doesn't match exactly or if there are more frames, or (in our case) if we generate cvat annotations for pre-annotation, nothing matched and it's useless.
This PR just uses the same matching as for coco files, so it works both for old use-cases like backup (if the ID matched, then the filenames will match) but also pre-annotation and others.

How has this been tested?

Manually, uploaded some files.

Checklist

License

  • I submit my code changes under the same MIT License that covers the project.
    Feel free to contact the maintainers if that's a concern.
  • I have updated the license header for each file (see an example below)
# Copyright (C) 2020 Intel Corporation
#
# SPDX-License-Identifier: MIT

@artemisart artemisart requested a review from zhiltsov-max as a code owner June 26, 2020 08:21
@coveralls
Copy link

coveralls commented Jun 26, 2020

Pull Request Test Coverage Report for Build 6493

  • 3 of 3 (100.0%) changed or added relevant lines in 1 file are covered.
  • 699 unchanged lines in 28 files lost coverage.
  • Overall coverage decreased (-0.2%) to 64.918%

Files with Coverage Reduction New Missed Lines %
cvat/apps/dataset_manager/formats/mask.py 1 89.29%
cvat/apps/dataset_manager/formats/mot.py 1 94.74%
datumaro/datumaro/plugins/datumaro_format/converter.py 2 98.16%
datumaro/datumaro/plugins/tf_detection_api_format/extractor.py 2 88.68%
datumaro/datumaro/plugins/coco_format/importer.py 3 79.49%
cvat/apps/dataset_manager/formats/pascal_voc.py 4 60.0%
datumaro/datumaro/cli/util/project.py 4 26.92%
datumaro/datumaro/plugins/yolo_format/converter.py 4 89.72%
datumaro/datumaro/util/init.py 4 80.85%
datumaro/datumaro/components/converter.py 5 88.46%
Totals Coverage Status
Change from base Build 6185: -0.2%
Covered Lines: 11034
Relevant Lines: 16590

💛 - Coveralls

@@ -432,7 +432,7 @@ def load(file_object, annotations):
)
elif el.tag == 'image':
image_is_opened = True
frame_id = int(el.attrib['id'])
frame_id = annotations.match_frame(el.attrib['name'])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should use https://github.com/opencv/cvat/blob/develop/cvat/apps/dataset_manager/bindings.py#L571 here, which will fall back to ids and other things in the case image name did not match. This can happen, for example, in the case when the task was created from files on a shared storage, as image names in the task will include their absolute paths, in which case the new variant doesn't work. This problem would require analysis of the whole input file, but ids can simplify things.

Another simple variant - use the name, otherwise use the id and check if basenames match, if there is one provided.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, I thought match_frame already matched on the basename and not the full path but not sure, and I didn't follow the updates of the importer.

Copy link
Contributor

@zhiltsov-max zhiltsov-max Jun 26, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can find an example of a DataseItem for these purposes here: https://github.com/opencv/cvat/blob/develop/cvat/apps/dataset_manager/formats/yolo.py#L40

I think, the approach in the last line above will just fit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, any updates?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@nmanovic
Copy link
Contributor

@artemisart , thanks for the great contribution. I personally want to accept it. The only problem that it is necessary to fix a component from @zhiltsov-max. Are you going to do that?

@artemisart
Copy link
Contributor Author

Yes, sorry I didn't have the time to update and test yet but it should be good now.

Copy link
Contributor

@nmanovic nmanovic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@artemisart , Thanks for the PR. Great contribution!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants