Fashion_MNIST

Dimensionality-reduction + ConvNet example

I consider the imbalanced binary classification problem of labelling images from the Fashion-MNIST dataset (https://github.com/zalandoresearch/fashion-mnist) as bags (class label 8) or not-bags (any of the other 9 classes).
As an additional constraint, each image may be represented by only 20 features, down from its original 28x28 = 784 pixel dimensions.
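
To make the imbalance concrete, here is a minimal sketch of how the binary target could be derived and why the F1 score of the 'bag' class is more informative than accuracy. It assumes the standard Fashion-MNIST label encoding (0 to 9, with 8 = Bag) and 6000 training images per class; the label array is a synthetic stand-in, and `make_binary_target` and `f1_for_positive` are hypothetical helper names, not functions from the notebooks.

```python
import numpy as np

BAG_LABEL = 8  # Fashion-MNIST label index for "Bag"


def make_binary_target(labels: np.ndarray) -> np.ndarray:
    """Return 1 for bags and 0 for every other class."""
    return (labels == BAG_LABEL).astype(np.int64)


def f1_for_positive(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """F1 score of the positive ('bag') class, computed from raw counts."""
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Stand-in for the real training labels: 6000 examples of each of the 10 classes.
labels = np.repeat(np.arange(10), 6000)
y = make_binary_target(labels)
positive_fraction = y.mean()  # share of bags among all images

# A trivial classifier that always answers "not a bag".
always_not_bag = np.zeros_like(y)
accuracy = (always_not_bag == y).mean()
trivial_f1 = f1_for_positive(y, always_not_bag)
```

The trivial classifier reaches high accuracy while never finding a single bag, which is why the final score is reported as F1 on the positive class.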

The solution in this repo consists of four Jupyter notebooks:

| Step | Notebook | Note |
| --- | --- | --- |
| Encode the images to 20 dims. Train the appropriate decoders at the same time. | https://github.com/olszewskip/Fashion_MNIST/blob/master/dimensionality_reduction.ipynb | An autoencoder does better than PCA. |
| Save the decoded images to files. | https://github.com/olszewskip/Fashion_MNIST/blob/master/compression.ipynb | The classifier will be fit without modifying the decoder, hence we can decode once before further training. |
| Train a convolutional network. Compare results of various encoders. | https://github.com/olszewskip/Fashion_MNIST/blob/master/classification.ipynb | PCA can be as good as an overcomplicated autoencoder, but it does not win. |
| Score the best model on test-data. | https://github.com/olszewskip/Fashion_MNIST/blob/master/final_test.ipynb | 97% F1 in 'bag'-class. |
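
The encode-then-decode idea behind the first two steps can be illustrated with plain PCA. The sketch below is not the code from the notebooks: it uses NumPy's SVD on random data standing in for flattened images, and `fit_pca`, `encode` and `decode` are hypothetical helper names. It shows how each image is squeezed to 20 numbers and then mapped back to a 28x28 array that a convolutional classifier could consume.

```python
import numpy as np


def fit_pca(X: np.ndarray, n_components: int = 20):
    """Return the data mean and the top principal directions (one per row)."""
    mean = X.mean(axis=0)
    # Rows of vt are principal directions, sorted by decreasing singular value.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def encode(X: np.ndarray, mean: np.ndarray, components: np.ndarray) -> np.ndarray:
    """Project centered images onto the principal directions."""
    return (X - mean) @ components.T


def decode(Z: np.ndarray, mean: np.ndarray, components: np.ndarray) -> np.ndarray:
    """Map codes back to pixel space (a lossy reconstruction)."""
    return Z @ components + mean


rng = np.random.default_rng(0)
X = rng.random((200, 784))  # synthetic stand-in for flattened 28x28 images

mean, components = fit_pca(X, n_components=20)
codes = encode(X, mean, components)      # 20 features per image
decoded = decode(codes, mean, components)
decoded_images = decoded.reshape(-1, 28, 28)  # classifier input shape
```

Because the decoder is fixed once fitted, the decoded images can be computed a single time and stored, which is what the second notebook does for its encoders.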

The notebooks were run on Colab.

The project lacks a simple baseline. One should try PCA(20) with a random forest on top, or something similar, to get a better sense of how crucial deep learning is in this case.
Make a t-SNE plot.
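
The missing baseline mentioned above could look roughly like the following scikit-learn sketch. It is an illustration under stated assumptions, not a result from this project: the data is synthetic (with an artificially easy positive class), and the choices of `n_estimators=100` and `class_weight="balanced"` are assumptions rather than tuned settings.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)

# Synthetic stand-in for flattened 28x28 images; swap in the real data to use this.
n_samples = 1000
X = rng.random((n_samples, 784))
y = np.zeros(n_samples, dtype=int)
y[:100] = 1               # 10% positives, mimicking the bag / not-bag imbalance
X[y == 1, :100] += 0.5    # make the positive class artificially easy to separate

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# PCA(20) followed by a random forest, fitted as a single pipeline so that
# the projection is learned from the training split only.
baseline = make_pipeline(
    PCA(n_components=20, random_state=0),
    RandomForestClassifier(
        n_estimators=100, class_weight="balanced", random_state=0
    ),
)
baseline.fit(X_train, y_train)

# F1 of the positive class, the same metric reported for the ConvNet.
bag_f1 = f1_score(y_test, baseline.predict(X_test), pos_label=1)
```

On the real data, comparing `bag_f1` with the ConvNet's score on the same split would indicate how much the deep model actually adds.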
