The project aims to analyze three variants of Stochastic Gradient Descent:
- SARAH
- SpiderBoost
- SNVRG
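
All three are variance-reduction methods built around a recursive stochastic gradient estimator. As a quick orientation (our summary, not taken from the report): SARAH and SpiderBoost both use the update below, differing mainly in their step-size choices, while SNVRG nests several such reference points:

$$
v_t = \nabla f_{i_t}(w_t) - \nabla f_{i_t}(w_{t-1}) + v_{t-1},
\qquad
w_{t+1} = w_t - \eta\, v_t
$$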
To avoid dependencies on other libraries, everything is implemented in Python using only NumPy.
The implementation can be found in the Jupyter notebook, while the PDF file contains all the conclusions and the analysis performed.
All the code is written to be readable and is commented, to make it easier to understand.
All the algorithms share the same structure (see the sketch after this list):
- a first part, which initializes the arrays used to track various quantities, later used for the plots
- a second part, the "loops", which controls the flow of the algorithm (epochs, iterations, mini-batches)
- the algorithm itself
- the tracking part, where the arrays that track gradients, points, and so on are updated

In every algorithm these parts are separated by a blank line, so they are easy to tell apart (and, as a consequence, the implementations are somewhat redundant).
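
To make this structure concrete, here is a minimal sketch of how one variant (SARAH) might look when organized this way. It is illustrative only: the least-squares objective, function name, and parameters are assumptions, not the notebook's actual code.

```python
import numpy as np

def sarah(X, y, eta=0.01, epochs=10, inner_steps=50):
    # Minimal SARAH sketch on a least-squares objective: f_i(w) = 0.5 * (x_i @ w - y_i)^2
    n, d = X.shape
    w = np.zeros(d)

    # 1) initialization of the arrays used for tracking (later used for the plots)
    grad_norms, points = [], []

    # 2) the "loops" that control the flow of the algorithm (epochs, iterations)
    for _ in range(epochs):
        # 3) the algorithm itself: full gradient at the snapshot, then recursive updates
        v = X.T @ (X @ w - y) / n
        w_prev, w = w, w - eta * v
        for _ in range(inner_steps):
            i = np.random.randint(n)
            v = X[i] * (X[i] @ w - y[i]) - X[i] * (X[i] @ w_prev - y[i]) + v
            w_prev, w = w, w - eta * v

            # 4) tracking: update the arrays that record gradients, points, and so on
            grad_norms.append(np.linalg.norm(v))
            points.append(w.copy())
    return w, grad_norms, points

# Example usage on random data
X, y = np.random.randn(200, 5), np.random.randn(200)
w, grad_norms, points = sarah(X, y)
```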
The code provided in the Jupyter notebook loads the datasets from the local filesystem.
For this to work, the datasets must be placed in specific folders with specific names; the convention is the same for all of them:
1. Create a folder called `datasets` in the same folder as the Jupyter notebook.
2. Create the following folders inside the new `datasets` folder:
   - `bio-or-not` for the "Bio or not" dataset
   - `fish-or-not` for the "Fish or not" dataset
   - `cat-vs-dog` for the "Cats vs Dogs" dataset
3. Inside each folder created at point 2, create two folders, whose names are the first and the last word of the parent folder's name. The final result should be the following:

   ```
   path/to/project
   ├── datasets
   │   ├── cat-vs-dog
   │   │   ├── cat
   │   │   └── dog
   │   ├── bio-or-not
   │   │   ├── bio
   │   │   └── not
   │   └── fish-or-not
   │       ├── fish
   │       └── not
   └── notebook.ipynb
   ```

4. Inside the "leaf" folders, place the desired images.
5. In the first snippet, set `DOWNLOAD_AND_STORE = True`. This tells the code to read the images from the filesystem, a process that takes at least 2-3 minutes; at the end, a serialized and lighter version is stored in the dataset folder. On all the following runs, set `DOWNLOAD_AND_STORE = False` so that the serialized version is loaded instead, which takes just a second.
WARNING: some images may be corrupted, and the code might not run in that case (it will, however, display in the output the name of the file causing problems).
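
For reference, the load-or-cache logic behind the flag might look like the following minimal sketch. The flag name comes from the text above, but the cache file name, image size, and folder handling are assumptions; here a corrupted file is skipped after printing its name, whereas the notebook may stop instead.

```python
import numpy as np
from pathlib import Path
from PIL import Image

DOWNLOAD_AND_STORE = True                      # flag from the first snippet
DATASET_DIR = Path("datasets") / "cat-vs-dog"  # hypothetical dataset folder
CACHE_FILE = DATASET_DIR / "cache.npz"         # hypothetical serialized version
IMG_SIZE = (64, 64)                            # assumed target size

if DOWNLOAD_AND_STORE:
    # Slow path: walk the leaf folders and process every image (2-3 minutes)
    images, labels = [], []
    for label, class_dir in enumerate(sorted(p for p in DATASET_DIR.iterdir() if p.is_dir())):
        for img_path in class_dir.iterdir():
            try:
                with Image.open(img_path) as img:
                    images.append(np.asarray(img.convert("RGB").resize(IMG_SIZE), dtype=np.float32) / 255.0)
                labels.append(label)
            except OSError:
                # Corrupted images end up here; report the problematic file
                print(f"Problematic file: {img_path}")
    X, y = np.stack(images), np.array(labels)
    np.savez_compressed(CACHE_FILE, X=X, y=y)  # store the lighter version
else:
    # Fast path: load the serialized version (takes about a second)
    cached = np.load(CACHE_FILE)
    X, y = cached["X"], cached["y"]
```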
In order to run the full project, some less common libraries are required, mainly to get nice plots and animations:
- `plotly`: an installation procedure can be found at https://plotly.com/python/getting-started/. It is used only for 3D animations and contour plots, so if those are not needed, it is enough to comment out the code that uses it.
- `PIL`: an installation procedure can be found at https://pillow.readthedocs.io/en/stable/installation.html. It is used to process images from the filesystem and convert them to NumPy arrays.
- `json`: used only with `plotly`, so if you don't want 3D plots, you can remove it.
- `Dash`: used only with `plotly` to create a server on which others can view the 3D animations; an installation procedure can be found at https://dash.plotly.com/installation. The code that uses it is already commented out, so you may simply ignore it.
Other required libraries are:
- `tensorflow`, to process images
- `matplotlib`, for all the plots
- `numpy`
- `time`, to time the algorithms
- `sklearn`, to process images
The dataset URLs can be found in the report PDF file.