Skip to content
This repository was archived by the owner on Oct 9, 2023. It is now read-only.

Fix inference for instance segmentation #857

Merged
merged 9 commits into from
Oct 13, 2021
Merged

Conversation

SeanNaren
Copy link
Contributor

@SeanNaren SeanNaren commented Oct 12, 2021

What does this PR do?

Fixes #840

It seems we were missing parsing logic for instance segmentation. On top of that, it seems our testing suite misses actually testing this for some reason.

I noticed that Icevision doesn't support collating images of different sizes in inference, so I've added a guard in case.

Before submitting

  • Was this discussed/approved via a Github issue? (no need for typos and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests? [not needed for typos/docs]
  • Did you verify new and existing tests pass locally with your changes?
  • If you made a notable change (that affects users), did you update the CHANGELOG?

PR review

  • Is this pull request ready for review? (if not, please submit in draft mode)

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

@codecov
Copy link

codecov bot commented Oct 12, 2021

Codecov Report

Merging #857 (02687e5) into master (9e618ed) will increase coverage by 0.01%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #857      +/-   ##
==========================================
+ Coverage   88.95%   88.96%   +0.01%     
==========================================
  Files         226      226              
  Lines       12441    12454      +13     
==========================================
+ Hits        11067    11080      +13     
  Misses       1374     1374              
Flag Coverage Δ
unittests 88.96% <100.00%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
flash/core/integrations/icevision/data.py 86.27% <ø> (ø)
flash/image/instance_segmentation/data.py 89.74% <100.00%> (+1.50%) ⬆️
flash/image/instance_segmentation/model.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9e618ed...02687e5. Read the comment docs.

@SeanNaren
Copy link
Contributor Author

SeanNaren commented Oct 12, 2021

Thanks @ethanwharris and @karthikrangasai! Overall the gaps in my understanding and wrong direction lead me to this conclusion:

So to be clear, the solution to this issue would be to ensure that the data-pipeline once trained is serialized and stored within the checkpoint, and de-serialized when loading?

EDIT: this is the logic that prevents the data pipeline from being saved in the model: https://github.com/PyTorchLightning/lightning-flash/blob/master/flash/core/model.py#L755-L760

@tchaton
Copy link
Contributor

tchaton commented Oct 12, 2021

Thanks @ethanwharris and @karthikrangasai! Overall the gaps in my understanding and wrong direction lead me to this conclusion:

So to be clear, the solution to this issue would be to ensure that the data-pipeline once trained is serialized and stored within the checkpoint, and de-serialized when loading?

EDIT: this is the logic that prevents the data pipeline from being saved in the model: https://github.com/PyTorchLightning/lightning-flash/blob/master/flash/core/model.py#L755-L760

Yes, the goal is to enable to dump the entire DataPipeline. However, sometimes it is not pickable and we need to add support for it.

Best,
T.C

@SeanNaren
Copy link
Contributor Author

The code has been modified to override the on_load_checkpoint function, re-creating the data pipeline if necessary (I think it is in all cases, but added the guard in case):

https://github.com/PyTorchLightning/lightning-flash/pull/857/files#diff-7841dacfc0a21c162cc8d4177ad101d93c09306a36aa34ea9c025470fb91096dR100-R110

Still need to track down why there is only 2 output returns and not 3!

Copy link
Contributor

@tchaton tchaton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM !

@SeanNaren
Copy link
Contributor Author

Now just one issue to figure out:

  • The output from the collate_fn function is always 2 elements. I can see from Flash that we format the dataset to have 3 samples, and in the internal collate_fn we return a batch of 3 elements. This I think then gets formatted by IceVision internally. I'll continue investigating what happens internally causing the number of elements to go from 3 to 2. cc @tchaton

@SeanNaren
Copy link
Contributor Author

SeanNaren commented Oct 12, 2021

After digging through a huge rabbit hole I think I've remedied the issue by specifying a custom PostProcess object. The Default uncollate was doing crazy things to the BaseRecord, and simply returning the output seems to be the right way to handle this.

@kurtjcu would you be able to help test this to see if this fixes your issue?

@SeanNaren SeanNaren merged commit e0fbf5c into master Oct 13, 2021
@SeanNaren SeanNaren deleted the fix/icevision_inference branch October 13, 2021 11:16
@aswanthkrishna
Copy link

getting same error with semantic segmentation. how can i solve?

@ethanwharris
Copy link
Collaborator

@aswanthkrishna Sorry you're experiencing issues. Could you open a seperate issue with a reproducible script showing the error you're getting?

Thanks 😃

@kurtjcu
Copy link

kurtjcu commented Nov 18, 2021

appears to be the same issue with object detection, replicated is the same way, using flash and icevision installed from master on github

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Loading/saving/serve checkpoints in InstanceSegmentation example is broken
6 participants