This repository contains code to benchmark novel stochastic gradient descent algorithms on the CIFAR10 dataset.
If you would like your algorithm to be included, open an issue here.
Requirements: Python 3.6+, PyTorch 1.3+, tqdm
Supported optimizers (a minimal usage sketch follows the list):
- Stochastic Gradient Descent with Momentum (SGDM)
- Stochastic Gradient Descent with Aggregated Momentum (SGD_aggmo) [arXiv]
- Stochastic Gradient Descent with Momentum and Learning Rate Dropout (SGD_LRD) [arXiv]
- Adam: A method for stochastic optimization (ADAM) [arXiv]
- Adam with Learning Rate Dropout (ADAM_LRD) [arXiv]
- RMSProp [Lecture Notes]
- RMSProp with Learning Rate Dropout [arXiv]
- RAdam: On the Variance of the Adaptive Learning Rate and Beyond [arXiv]
- RAdam with Learning Rate Dropout [arXiv]
- AdaBound: Adaptive Gradient Methods with Dynamic Bound of Learning Rate [ICLR2019]
- AdamW: Decoupled Weight Decay Regularization [arXiv]
- Coolmomentum: Stochastic Optimization by Langevin Dynamics with Simulated Annealing [Nature Scientific Reports]
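For orientation, here is a minimal sketch of how the optimizers that have stock PyTorch implementations (SGDM, ADAM, RMSProp, AdamW) are typically constructed and stepped. This is not the repository's training script; the model, batch shapes, and hyperparameters are illustrative assumptions only.

```python
# Minimal sketch, assuming CIFAR10-shaped tensors (N, 3, 32, 32);
# the model and hyperparameters are placeholders, not the benchmark's settings.
import torch
import torch.nn as nn
import torch.optim as optim

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))  # placeholder classifier

# Built-in PyTorch counterparts of some of the optimizers listed above
optimizers = {
    "SGDM": optim.SGD(model.parameters(), lr=0.1, momentum=0.9),
    "ADAM": optim.Adam(model.parameters(), lr=1e-3),
    "RMSProp": optim.RMSprop(model.parameters(), lr=1e-3),
    "AdamW": optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-2),
}

criterion = nn.CrossEntropyLoss()
inputs = torch.randn(8, 3, 32, 32)    # dummy CIFAR10-sized batch
targets = torch.randint(0, 10, (8,))  # dummy labels

optimizer = optimizers["SGDM"]        # pick one optimizer and take a single step
optimizer.zero_grad()
loss = criterion(model(inputs), targets)
loss.backward()
optimizer.step()
```

The remaining optimizers (AggMo, the LRD variants, RAdam, AdaBound, Coolmomentum) use the custom implementations in this repository rather than stock PyTorch classes.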
More details of all runs can be found here.