Contributor

@alaradirik alaradirik commented Jan 23, 2023

What does this PR do?

Fixes the post_process_instance_segmentation method of MaskFormerImageProcessor. This issue mainly affects Mask2Former as it uses MaskFormerImageProcessor and there aren't any MaskFormer models trained on instance segmentation datasets.

Unlike in panoptic segmentation post-processing, the final score of each binary mask proposal is calculated by multiplying the mask proposal score by the class score. The mask_threshold and overlap_mask_area_threshold arguments are no longer needed; I can either add a deprecation warning for them or leave them as they are for now.
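
For illustration, here is a minimal pure-Python sketch of the scoring rule described above. It assumes the mask proposal score is the mean sigmoid probability over the pixels the proposal marks as foreground (probability above 0.5). That definition, the function name final_instance_score, and the eps guard are assumptions made for this example and are not taken from the PR diff.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def final_instance_score(class_score, mask_logits, eps=1e-6):
    """Combine a class score with a mask proposal score (hypothetical helper).

    class_score: probability of the predicted class for this proposal.
    mask_logits: 2D list of raw mask logits for the same proposal.
    """
    probs = [sigmoid(v) for row in mask_logits for v in row]
    # Assumption: foreground pixels are those with probability above 0.5.
    foreground = [p for p in probs if p > 0.5]
    # Assumption: the mask proposal score is the mean foreground probability;
    # eps avoids a division by zero when the mask is empty.
    mask_score = sum(foreground) / (len(foreground) + eps)
    return class_score * mask_score


# Toy 2x2 proposal: the top row is confidently foreground.
logits = [[4.0, 3.0], [-5.0, -6.0]]
score = final_instance_score(0.9, logits)
```

Under these assumptions, a proposal with a confident class but a blurry mask gets a lower final score than one with a confident class and a sharp mask, and an empty mask scores zero.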

Post-processed results of the mask2former-swin-small-coco-instance model inference:
[image: post-processed instance segmentation results]

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • [x] Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

HuggingFaceDocBuilderDev commented Jan 23, 2023

The documentation is not available anymore as the PR was closed or merged.

Collaborator

@sgugger sgugger left a comment

Thanks for the fix!

Contributor

@NielsRogge NielsRogge left a comment

Thanks for fixing! Should we add a corresponding test for it, which verifies the postprocessed results?

@alaradirik
Contributor Author

> Thanks for fixing! Should we add a corresponding test for it, which verifies the postprocessed results?

I added a test, but unlike MaskFormer, Mask2Former outputs segmentation maps of shape (96, 96) instead of the preprocessed input size, for efficiency. The mask logits are scaled to the preprocessed image size during post-processing (the same holds for semantic and panoptic segmentation), even if no target_sizes is passed. I think it'd be better to add a separate image processor for Mask2Former, as its post-processing requires this additional scaling.
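
To make the extra scaling step concrete, here is a rough pure-Python sketch of upsampling low-resolution mask logits to a larger size. Bilinear interpolation with the half-pixel convention is an assumption on my part (the thread does not say which mode is used), and resize_bilinear is a hypothetical helper rather than anything in the library. A 2x2 grid stands in for the (96, 96) maps to keep it short.

```python
def resize_bilinear(logits, out_h, out_w):
    """Upsample a 2D list of logits with bilinear interpolation (illustrative)."""
    in_h, in_w = len(logits), len(logits[0])

    def source_coords(dst, in_size, out_size):
        # Map an output pixel centre into input space (half-pixel convention),
        # then clamp so neighbours stay inside the input grid.
        src = (dst + 0.5) * in_size / out_size - 0.5
        src = min(max(src, 0.0), in_size - 1)
        lo = int(src)
        hi = min(lo + 1, in_size - 1)
        return lo, hi, src - lo

    out = []
    for y in range(out_h):
        y0, y1, wy = source_coords(y, in_h, out_h)
        row = []
        for x in range(out_w):
            x0, x1, wx = source_coords(x, in_w, out_w)
            # Interpolate along x on the two neighbouring rows, then along y.
            top = logits[y0][x0] * (1 - wx) + logits[y0][x1] * wx
            bottom = logits[y1][x0] * (1 - wx) + logits[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


# Toy stand-in for a low-resolution mask logit map.
low_res = [[0.0, 2.0], [4.0, 6.0]]
upscaled = resize_bilinear(low_res, 4, 4)
```

Because this resize has to happen before thresholding and scoring regardless of target_sizes, it is a step the MaskFormer post-processing path does not need.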

What do you think @NielsRogge @sgugger?

@NielsRogge
Contributor

If postprocessing is different, then it indeed requires its own image processor class.

@alaradirik alaradirik merged commit f424b09 into huggingface:main Jan 24, 2023
sgugger pushed a commit that referenced this pull request Jan 24, 2023
* fix instance segmentation post processing

* add Mask2FormerImageProcessor