Are new models planned to be added? #2707
This request has come up often. Just linking all of them for reference. archived - update issue instead
Edit by @datumbox: I shamelessly edited your comment and moved your fantastic up-to-date list on the issue for greater visibility. Reply by @oke-aditya: I was actually going to suggest to do the same 😃 A generalized guideline for adding models is being added in |
Hi, to complement @oke-aditya's great answer, we will be adding more models to torchvision, including EfficientNets and MobileNetV3. The current limitation is that we would like to ensure that we can reproduce the pretrained model using the training scripts from |
I hope to add the Mish activation function. |
@songyuc There is a closed feature request on PyTorch for adding Mish. You can comment over there for increased visibility so that Mish can be considered for addition in the future. Link to the issue - pytorch/pytorch#25584 |
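For anyone following along, here is a minimal sketch of what Mish computes, using the published definition `mish(x) = x * tanh(softplus(x))`. It is plain Python on scalars for illustration only, not the PyTorch implementation; the function names `softplus` and `mish` are just names chosen for this sketch.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable softplus: log(1 + exp(x))."""
    # max(x, 0) + log1p(exp(-|x|)) avoids overflow for large |x|.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    """Mish activation on a scalar: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


if __name__ == "__main__":
    for value in (-2.0, 0.0, 1.0, 3.0):
        print(value, mish(value))
```

The function is smooth, passes through zero at `x = 0`, and approaches the identity for large positive inputs.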
First, thanks for your great work. |
Hi @WZMIAOMIAO, the Swish activation function has been added to PyTorch (not torchvision) as |
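As a quick reference, Swish is defined as `x * sigmoid(beta * x)`, with `beta = 1` being the commonly used form. The sketch below is plain Python on scalars and only illustrates the formula; the names `sigmoid` and `swish` and the `beta` parameter are assumptions of this sketch, not PyTorch API.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic sigmoid."""
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    # For negative x, exp(x) cannot overflow.
    e = math.exp(x)
    return e / (1.0 + e)


def swish(x: float, beta: float = 1.0) -> float:
    """Swish activation on a scalar: x * sigmoid(beta * x)."""
    return x * sigmoid(beta * x)


if __name__ == "__main__":
    for value in (-2.0, 0.0, 1.0, 3.0):
        print(value, swish(value))
```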
@oke-aditya Thank you for your reply. I've seen MobileNetv3 in the torchvision repository. When will EfficientNet, RegNet and NFNet be added? |
@stwinata Thanks for offering. Which models do you have in mind to contribute? The process of model contribution has been a bit problematic (mainly due to the training bit) and we still haven't figured out all the details. But depending on the proposal, we might be able to work something out. :) |
@datumbox thanks for the quick reply! I am interested in DETR or EfficientDet. I was thinking that for a first commit DETR might be easier, since we can use DETR's original repo for reference and may be able to load its weights for preliminary validation.
Perhaps we can also try to determine a canonical pipeline for model contribution through this experience and document it so that others can contribute easily in the future 😃 ! |
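One way the "load weights for preliminary validation" idea could start is by checking that the parameter names and shapes of a reference checkpoint line up with the new implementation before attempting to load anything. The sketch below is a hypothetical helper in plain Python: `compare_checkpoints` and the parameter names in the example are made up for illustration and are not taken from the DETR repo or torchvision.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def compare_checkpoints(
    reference: Dict[str, Shape], candidate: Dict[str, Shape]
) -> Dict[str, List[str]]:
    """Report parameter-name and shape differences between two models.

    Both arguments map parameter names to shapes, e.g. as collected
    from a checkpoint and from a freshly constructed model.
    """
    missing = sorted(k for k in reference if k not in candidate)
    unexpected = sorted(k for k in candidate if k not in reference)
    mismatched = sorted(
        k for k in reference if k in candidate and reference[k] != candidate[k]
    )
    return {"missing": missing, "unexpected": unexpected, "mismatched": mismatched}


if __name__ == "__main__":
    # Hypothetical parameter names and shapes, for illustration only.
    reference = {"backbone.conv1.weight": (64, 3, 7, 7), "head.bias": (92,)}
    candidate = {"backbone.conv1.weight": (64, 3, 7, 7), "head.bias": (91,)}
    print(compare_checkpoints(reference, candidate))
```

An empty report would be a reasonable precondition before comparing outputs of the two implementations on the same inputs.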
@datumbox Does this come down to lack of GPU resources? Or is it due to the need to validate that it can properly train? |
@stwinata DETR sounds like a good addition to me. Since @fmassa is one of the main authors, I will let him have the final say on this. Contributing models is tricky because:
Happy to discuss more and see if it's worth doing this now. |
@datumbox These comments make sense 😃
Yeah I agree, some might even say that getting the models to be "useful", i.e. reproducing the paper results, is the fun bit 😃
I think this way we can ease the load on PyTorch/Vision maintainers and make PRs much more concrete and useful. Perhaps we can also have a simple util script that tests trained candidate implementations on various benchmarks (this might be another feature request 😄).
I also agree with this. Moreover, I think these days GPU resources, either at home or through AWS and GCP, are getting ubiquitous enough for contributors to do the training themselves 😃 |
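To make the "simple util script" idea a bit more concrete, here is a rough sketch of the metric such a script might report for classification models: top-k accuracy over a batch of per-class scores. It is plain Python and purely illustrative; `topk_accuracy` is a hypothetical helper, not an existing torchvision utility, and a real script would also need data loading and model inference.

```python
from typing import Sequence


def topk_accuracy(
    scores: Sequence[Sequence[float]], targets: Sequence[int], k: int = 1
) -> float:
    """Fraction of samples whose target class is among the k highest scores."""
    if len(scores) != len(targets):
        raise ValueError("scores and targets must have the same length")
    if not scores:
        raise ValueError("at least one sample is required")
    correct = 0
    for row, target in zip(scores, targets):
        # Class indices sorted by descending score.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if target in ranked[:k]:
            correct += 1
    return correct / len(scores)


if __name__ == "__main__":
    scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
    targets = [1, 1, 2]
    print(topk_accuracy(scores, targets, k=1))
    print(topk_accuracy(scores, targets, k=2))
```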
@stwinata Thanks for the comments. I think we agree. Below I write a few thoughts on the potential process we could adopt. The minimum to merge such a contribution is:
Note that there are details here related to the code quality etc, but these are rules that apply in all PRs. For someone who would be interested in adding a model, here are a few important considerations:
The above is a very big ask, I think. But if an OSS contributor is willing to give it a try despite the above adversities, then we would be happy to pair up and help. This should happen in a coordinated way to:
@fmassa let me know your thoughts on this as well. |
I am aiming at adding FCOS to torchvision. |
@santhoshnumberone I think DINO is more practical, since users can train for fewer epochs to get a good mAP. |
What I meant was: could any of you check whether facebookresearch/dino and "DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection" are the same? I feel they are different. |
There are many variants of DETR, e.g. deformable DETR and modulated DETR. A simple search would give these results: https://github.com/search?q=DETR Let's start off by including vanilla DETR :) |
Hi, MobileOne, introduced by Apple, is interesting. The mobile-vision team implemented it at facebookresearch/mobile-vision#91. Is there any plan to support it here? |
@zhiqwang MobileOne wasn't in our shortlist, but we can certainly keep an eye on it if it builds momentum. |
Any small list for Semantic Segmentation models? Maybe a tentative one
I can try U2Net. Maybe it's an easy Model |
Yesss BiFPN is very popular. Maybe once we initiate EfficientDet it would get added. |
Will you try to add EfficientDet? |
I'm pretty noob when it comes to implementing models. But maybe I will give it a shot after I add a few easy models. |
@datumbox Please Pin this issue. |
@oke-aditya Can you add Unet? |
Will need to discuss with the maintainers, and I think U2-Net will be more helpful. U-Net I'm not sure. I think I can implement them. |
The maintainers are not responsive to issues. Once you open a PR, they will talk with you. |
Hi @talregev, it's nice to see that you're excited about torchvision models. I'm going to ask you to be a little more patient here. @datumbox is on well-deserved holidays and I'm sure he'll get back to you as soon as he can.
Rest assured that we are responsive to issues. Furthermore, like in most projects, we don't encourage opening PRs prior to opening issues in order to leave time and space for discussing the requested feature. As a side note:
isn't the best way to engage with open source projects. We always welcome suggestions and feature requests, but in order for us to help you best, we usually need a bit more details on what is requested, and why it would be useful to you. Also, while a gentle ping can sometimes be appropriate, just at-ing people without context or form might not get you the outcome that you're looking for. |
@NicolasHug Thank you for your nice suggestion. Can you pin this issue? |
@talregev Apologies for the delayed response. I was on my annual leave. As others pointed out, U-Net is a bit old now (released in 2015) and there are quite a few good community implementations already. If we were to add more models, we would probably prioritize transformer approaches that yield better results. We don't have immediate plans for this though, as our focus this half will be videos. Concerning pinning the issue, we have quite a few pinned ones already, so I'm not sure this will increase the visibility. There are at least 4-5 more tickets like this for losses, operators, data augmentations etc. Perhaps the solution here is to pin the issue with our H2 roadmap once finalized and then link to this issue from there. |
I would also like to have a BiFPN neck. |
Hey, I was currently working with SegFormer, TransUnet, UNETR (although I am working with medical imaging datasets solely now). All three of them are relatively new models (early 2021) but they have shown good results and also have a fair amount of citations. Edit: MaskFormer and DPT could also be good additions I believe. |
Hi. While these models are new, they look to be specialized in medical image segmentation. Are they also valid on general datasets such as Pascal VOC or COCO? Or is there any valid performance measurement on these standard datasets? |
UNETR and TransUnet, yes. SegFormer, MaskFormer and DPT support Cityscapes, ADE20k, COCO, etc. |
Hello everyone,
As you can see, it outperforms VGGNet and many other architectures. It also outperforms ResNet18 (11M vs 5M) and some MobileNet variants, and achieves high accuracy despite being very simple. Compared to architectures such as DenseNet, it performs very well with a fraction of the memory usage. It runs much faster on older GTX cards, but still performs decently on new hardware as well. I believe an efficient yet simple architecture that uses basic operators and provides decent performance would be a good addition to the diversity of models in the PyTorch repository. Here is the link to our PyTorch implementation of SimpleNet: https://github.com/Coderx7/SimpleNet_Pytorch I'd be delighted to answer any questions. |
There's new discussion about adding YOLO in this issue. |
🚀 Feature
Adding new models to the models section.
Motivation
Many new models have been proposed in recent years and do not exist in the models module.
For example, the EfficientNets appear to provide 8 models of different complexities that outperform everything else that exists at each complexity level.
Pitch
See Contributing to Torchvision - Models for guidance on adding new models.
Add pre-trained weights for the following variants: