Add TFData2VecVision for semantic segmentation #17271
sgugger
left a comment
Thanks a lot for your PR! Note that on the pyramid pooling class, even if we change the PyTorch class to not subclass ModuleList anymore, it will still need to keep the same weight names, otherwise compatibility with any checkpoint on the Hub will be broken.
Absolutely. |
|
@Rocketknight1 a gentle ping 👀 |
|
Ah, I'm sorry! Will review it by tomorrow. |
|
Hi, I just took a look over this! I suspect the issue with the tests is something like a layer name collision when saving. In h5 files, weights are saved as 'datasets', so this error is telling us that the weights are not uniquely named: the same 'dataset' name is being written twice during saving, which means two layers share the same name. |
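A quick way to check for the kind of collision described above is to count how often each weight name occurs. The following is a minimal sketch, not code from this PR: the helper `find_duplicate_names` and the sample names are hypothetical, and it assumes the names have already been collected as strings (for a Keras model, for example via `[w.name for w in model.weights]`).

```python
from collections import Counter


def find_duplicate_names(weight_names):
    """Return the names that occur more than once, in sorted order."""
    counts = Counter(weight_names)
    return sorted(name for name, n in counts.items() if n > 1)


# Hypothetical weight names, for illustration only.
names = [
    "decode_head/conv_a/kernel:0",
    "decode_head/conv_a/bias:0",
    "decode_head/conv_a/kernel:0",  # collides with the first entry
    "decode_head/conv_b/kernel:0",
]
duplicates = find_duplicate_names(names)
print(duplicates)
```

Any name reported here would be written to the same h5 'dataset' twice, so the layers that own those weights are the ones to rename.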
|
Yes, I suspected something similar but couldn't figure out where the duplicate is coming from. Do you have any suggestions? |
|
I suspect the issue is most likely related to the implementation of AdaptiveAvgPool I wrote - the practice of precomputing a constant sparse matrix like that is non-standard, and TF might be trying to save that Tensor somehow. Can you try replacing it with a 'dummy' layer that has the same output shape and seeing if the error goes away? If so, I can work on a different implementation for the layer - I have some ideas that I think will improve performance a lot, and they might also resolve the problem too. |
Sure. I will do it and get back. |
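For readers unfamiliar with the layer under discussion, here is a minimal pure-Python sketch of what adaptive average pooling computes in one dimension. It is not the implementation from this PR: the function name is hypothetical, and the bin boundaries (floor for the start, ceiling for the end) are an assumption based on the convention PyTorch uses for its adaptive pooling layers.

```python
def adaptive_avg_pool_1d(values, output_size):
    """Average `values` into `output_size` bins that together cover the input."""
    n = len(values)
    pooled = []
    for i in range(output_size):
        # Bin i spans [floor(i * n / out), ceil((i + 1) * n / out)).
        start = (i * n) // output_size
        end = -((-(i + 1) * n) // output_size)  # integer ceiling division
        window = values[start:end]
        pooled.append(sum(window) / len(window))
    return pooled


pooled = adaptive_avg_pool_1d([1.0, 2.0, 3.0, 4.0, 5.0], 3)
print(pooled)
```

Because every output is a fixed weighted average of the inputs, the whole operation is linear for a given input and output size, which is why it can be expressed as multiplication by a constant matrix. A 'dummy' stand-in only needs to reproduce the output shape, not these values.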
|
@sayakpaul I used post-mortem debugging to isolate this - just add this to Then run the tests with From there, I can tell that the offending array has name |
There are multiple 1x1 convs, yes. |
Could you elaborate a bit more here? I have added the |
|
@sayakpaul I stepped up to the frame of |
|
@Rocketknight1 I looked into the layers with It still didn't resolve the issue. The only potential suspect I could find is the following. There are two layers namely Thoughts? |
|
Update: With @Rocketknight1's help, I was able to resolve the current test failure (commit here). But I have run into two more failures, which I am currently discussing with @Rocketknight1. He's on vacation. Once he gets back, I will hopefully be able to report back with updates. |
Fix/tf data2vec seg
Rocketknight1
left a comment
With tests passing now, I'm happy to approve this!
|
|
||
| This model was contributed by [edugp](https://huggingface.co/edugp) and [patrickvonplaten](https://huggingface.co/patrickvonplaten). | ||
| [sayakpaul](https://github.com/sayakpaul) contributed Data2Vec for vision in TensorFlow. | ||
| [sayakpaul](https://github.com/sayakpaul) and [Rocketknight1](https://github.com/Rocketknight1) contributed Data2Vec for vision in TensorFlow. |
You did almost all of it!
* feat: initial implementation of data2vec segmentation model in TF.
* chore: minor corrections to make the segmenter work.
* chore: removed unncessary files.
* chore: add tests and other modifications.
* fix: loss computation for segmentation.
* chore: remove unused variable.
* chore: formatting.
* added a dummy adaptive pooling layer.
* removed unnecessary file.
* potentially add identifiers to layer names.
* fix: layer naming.
* chore: removed unnecessary print.
* Skipping unneeded test
* chore: add logging to debug tolerance.
* fix: segmentation tests for tfdata2vecvision
* chore: make style.
* fix: layer names, assertion to be resolved.
* Bumping test tolerance a bit
* chore: bump the tol in PT test.

Co-authored-by: matt <[email protected]>
This PR introduces TFData2VecVisionForSemanticSegmentation, which takes the TFData2VecVisionMainLayer and appends the necessary layers for performing semantic segmentation along with loss computation (first one in this line?).

Notes
* TFData2VecVisionForSemanticSegmentation class is introduced to tests/models/test_modeling_tf_data2vec_vision.py. Without that class, the test runs as expected. I would appreciate any help.
* nn.ModuleList. It is currently leading to a few idiosyncrasies on the TF side (mainly related to naming of the layers). Once that is sorted out we can again revisit this TFData2VecVisionForSemanticSegmentation class and make the amends if needed. Happy to take the charge then.
* RUN_SLOW=1 python -m pytest tests/models/data2vec/test_modeling_tf_data2vec_vision.py.

Here's the trace of the errors from running tests:
Additionally, here's a little code for testing the segmentation class:
@Rocketknight1 @sgugger