This repository contains a PyTorch implementation of mixout. Mixout regularizes learning so as to minimize the deviation of a model's parameters from a set of target parameters, such as pretrained weights. For a more detailed description, see "Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models".
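To make the idea concrete, here is a minimal sketch of the mixout operation as I understand it from the paper. It is not the code shipped in this repository: the function name `mixout_sketch` and its signature are hypothetical and chosen only for illustration. Each parameter is kept with probability `1 - p` and otherwise replaced by the corresponding target parameter, and the result is then shifted and rescaled so that its expectation equals the original parameter.

```python
import torch


def mixout_sketch(weight, target, p=0.5, training=True):
    """Illustrative mixout (hypothetical helper, not this repository's API).

    weight: current parameters (floating-point tensor)
    target: target parameters with the same shape, e.g. pretrained weights
    p:      probability of replacing an element of `weight` by `target`
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError(f"p must be in [0, 1], got {p}")
    if not training or p == 0.0:
        # Like dropout, the operation is the identity outside of training.
        return weight
    if p == 1.0:
        # Every element is replaced by the target.
        return target

    # keep == 1 means "keep the current parameter"; drawn with prob. 1 - p.
    keep = torch.bernoulli(torch.full_like(weight, 1.0 - p))
    mixed = keep * weight + (1.0 - keep) * target

    # Shift and rescale so that the expected output equals `weight`.
    return (mixed - p * target) / (1.0 - p)
```

With `target` set to all zeros, this sketch reduces to ordinary inverted dropout applied to `weight`, which is one way to see mixout as a generalization of dropout.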
There is example code showing how to apply mixout to a model. The repository also provides a functional version of mixout, similar to torch.nn.functional.dropout, and a module version. The module version is quite different from torch.nn.Dropout, so I highly recommend reading the example code before using it.
Thanks to Michael Wilson, there is also an example of applying mixout to a pretrained model from Hugging Face. Because of how Hugging Face models are structured, this example works slightly differently from the example code mentioned above.
To make the library easier to use, Vadim turned this repository into a package and added a figure of mixout so that the concept is quicker to grasp. He also added type annotations, using the typing library, to make clear what input types the library expects.
Cheolhyoung Lee, Kyunghyun Cho, and Wanmo Kang, Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models, International Conference on Learning Representations (2020).
Stephen Roller also implemented mixout in a gist. His implementation is actually mixconnect, which is similar to dropconnect and is also introduced in the mixout paper. Unlike my implementation, his MixWrapper can wrap most torch.nn.Module instances, so you do not need to write your own mixed module such as MixLinear. If you do not need to customize mixout, his code is convenient to use.