Addition of Swin, SwinV2 to the SwAV self-supervised models
Motivation and Pitch
Swin and SwinV2 vision transformers are currently among the top-performing models on multiple computer vision tasks, but, like all transformers, training them requires a large number of samples.
I think a self-supervised approach such as SwAV would benefit those models and help make these vision transformers usable in low-resource tasks.
I am very eager to add it myself.
SwAV is already implemented in Lightning Bolts. You will have to inherit the SwAV class and override its init_model method to train Swin transformers. FYI, I recommend you check out DINO for training vision transformers; SwAV is not the best candidate.
But when I override init_model I will also need to modify the Swin architecture by:
- splitting the default forward function into forward_backbone and forward_head,
- adding a new forward that calls both of them,
- adding the prototype layer and the projection-head conditions in the model's init.
So I added the modified Swin in swin_swav.py and, instead of overriding init_model, I added the new models as arch options to choose from, for ease of use.
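For illustration, the split described above can be sketched as a backbone-agnostic wrapper in plain PyTorch. This is a hypothetical sketch, not the actual pl_bolts or swin_swav.py code; the class and parameter names here are illustrative, and the dimensions follow the defaults from the SwAV paper:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SwAVBackboneWrapper(nn.Module):
    """Hypothetical sketch of the forward_backbone / forward_head split.

    Wraps any feature-extracting backbone (e.g. a Swin model with its
    classification head removed) with a SwAV-style projection head and
    prototype layer.
    """

    def __init__(self, backbone, feat_dim, hidden_mlp=2048,
                 output_dim=128, num_prototypes=3000):
        super().__init__()
        self.backbone = backbone
        # MLP projection head, as in the SwAV paper
        self.projection_head = nn.Sequential(
            nn.Linear(feat_dim, hidden_mlp),
            nn.BatchNorm1d(hidden_mlp),
            nn.ReLU(inplace=True),
            nn.Linear(hidden_mlp, output_dim),
        )
        # prototype layer: bias-free linear map onto the prototype vectors
        self.prototypes = nn.Linear(output_dim, num_prototypes, bias=False)

    def forward_backbone(self, x):
        # backbone features only (no head)
        return self.backbone(x)

    def forward_head(self, feats):
        # project, L2-normalize, then score against the prototypes
        z = self.projection_head(feats)
        z = F.normalize(z, dim=1, p=2)
        return z, self.prototypes(z)

    def forward(self, x):
        # the new forward simply chains the two halves
        return self.forward_head(self.forward_backbone(x))
```

With this split, SwAV-style training code can call forward_backbone and forward_head separately for multi-crop batches while forward keeps the usual single-call interface, which is the motivation for the three modifications listed above.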