In this project, we work on an image quality improvement method. Given a set of high-quality images, we apply transformations that mimic common quality issues such as noise and reduced resolution. We then train a network (an autoencoder) to reconstruct the original, high-quality image.
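The degradation step described above could look roughly like the sketch below. It uses NumPy on float images in the range [0, 1] with shape (height, width, channels). The function names, the choice of noise types, and the default strengths are illustrative assumptions, not the transformations this project necessarily uses.

```python
import numpy as np


def add_gaussian_noise(img, sigma=0.1, rng=None):
    """Add zero-mean Gaussian noise and clip back to [0, 1]."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = img + rng.normal(0.0, sigma, size=img.shape)
    return np.clip(noisy, 0.0, 1.0)


def add_salt_and_pepper(img, amount=0.05, rng=None):
    """Set roughly `amount` of the pixels to pure black or pure white."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = img.copy()
    # One random draw per pixel, shared by all channels.
    mask = rng.random(img.shape[:2])
    noisy[mask < amount / 2] = 0.0        # pepper
    noisy[mask > 1.0 - amount / 2] = 1.0  # salt
    return noisy


def downscale(img, factor=2):
    """Lower the resolution by averaging factor x factor pixel blocks."""
    h, w, c = img.shape
    h, w = h - h % factor, w - w % factor
    blocks = img[:h, :w].reshape(h // factor, factor, w // factor, factor, c)
    return blocks.mean(axis=(1, 3))
```

Each function returns a degraded copy, so a training pair is simply `(degraded, original)`; the network sees the first and is scored against the second.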
Link to wiki (notion)
TODO
- Introduce different types of noise to CIFAR-10 photos
- Visualize the dataset with different types of noise
- Deliver synopsis
- Build autoencoder
- Denoise CIFAR-10 photos
- Cats and dogs dataset
- Enhance the resolution of Cats and dogs dataset photos
- Poster
Dev docker image: mactat/dl-iqiwa:latest
git clone <repo>
cd <project-dir>
docker compose pull
docker compose up
Go to http://127.0.0.1:42065/?token=pass
You can also set the Jupyter server in VS Code to http://127.0.0.1:42065/?token=pass
Enjoy :)
git clone <repo>
cd <project-dir>
Copy your Kaggle key to /scripts
cd scripts
chmod +x data_extraction.sh
./data_extraction.sh <name of kaggle key>
pip3 install -r requirements.txt
python3 train_model.py
Parameters of train_model.py:
python3 train_model.py --help
usage: train_model.py [-h] [--model MODEL] [--epochs EPOCHS] [--verbose VERBOSE]
Parameters for training
optional arguments:
-h, --help show this help message and exit
--model MODEL Specify the model file(without .py extension)
--epochs EPOCHS Specify the number of epochs
--verbose VERBOSE Print output or not
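For orientation, a parser that produces help text of this shape could be written as in the sketch below. Only the flag names and help strings come from the output above; the defaults, the integer type for `--epochs`, and the `str2bool` helper are assumptions made for illustration, and the real `train_model.py` may differ.

```python
import argparse


def str2bool(value):
    """Hypothetical helper: parse common spellings of true/false."""
    text = value.lower()
    if text in {"1", "true", "yes", "y"}:
        return True
    if text in {"0", "false", "no", "n"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    parser = argparse.ArgumentParser(
        prog="train_model.py", description="Parameters for training"
    )
    # Default values below are placeholders, not the project's real defaults.
    parser.add_argument("--model", default="model",
                        help="Specify the model file(without .py extension)")
    parser.add_argument("--epochs", type=int, default=10,
                        help="Specify the number of epochs")
    parser.add_argument("--verbose", type=str2bool, default=True,
                        help="Print output or not")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.model, args.epochs, args.verbose)
```

Because `--verbose` takes a value in the usage line, the sketch parses it as an explicit boolean rather than using a `store_true` switch.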
Models are stored in Artifactory: https://dlmodels.jfrog.io
To upload a model:
curl -u<USERNAME>:<PASSWORD> -T <PATH_TO_FILE> "https://dlmodels.jfrog.io/artifactory/iqiwa-generic-local/<MODEL_NAME>.pth"
To download a model:
curl https://dlmodels.jfrog.io/artifactory/iqiwa-generic-local/<MODEL_NAME>.pth > model.pth
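The same download can be done from Python, which is handy inside a notebook. The sketch below uses only the standard library and mirrors the URL pattern of the curl command; the helper names `model_url` and `download_model` are hypothetical, and it assumes the repository allows anonymous reads, as the curl example suggests.

```python
import shutil
import urllib.parse
import urllib.request

BASE_URL = "https://dlmodels.jfrog.io/artifactory/iqiwa-generic-local"


def model_url(model_name):
    """Build the artifact URL for a model name (without the .pth suffix)."""
    return f"{BASE_URL}/{urllib.parse.quote(model_name)}.pth"


def download_model(model_name, destination="model.pth"):
    """Stream the model file to disk and return the local path."""
    with urllib.request.urlopen(model_url(model_name)) as response:
        with open(destination, "wb") as out:
            shutil.copyfileobj(response, out)
    return destination
```

Calling `download_model("<MODEL_NAME>")` writes `model.pth` into the current directory, matching the curl command.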
----------------------------------------------------------------
Layer (type) Output Shape Param #
================================================================
Conv2d-1 [128, 64, 16, 16] 3,136
Conv2d-2 [128, 128, 8, 8] 131,200
Linear-3 [128, 10] 81,930
Linear-4 [128, 10] 81,930
Encoder-5 [[-1, 10], [-1, 10]] 0
Linear-6 [128, 8192] 90,112
ConvTranspose2d-7 [128, 64, 16, 16] 131,136
ConvTranspose2d-8 [128, 3, 32, 32] 3,075
Decoder-9 [128, 3, 32, 32] 0
================================================================
Total params: 522,519
Trainable params: 522,519
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 1.50
Forward/backward pass size (MB): 54.02
Params size (MB): 1.99
Estimated Total Size (MB): 57.51
----------------------------------------------------------------
Number of parameters: 522519
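The two parallel `Linear` layers with 10 outputs suggest a variational autoencoder whose encoder predicts a mean and a log-variance. The PyTorch sketch below is one architecture consistent with the summary above: kernel size 4, stride 2, and padding 1 are inferred from the parameter counts and output shapes, while the activations and the reparameterisation step are assumptions.

```python
import torch
from torch import nn


class Encoder(nn.Module):
    def __init__(self, latent_dim=10):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 64, kernel_size=4, stride=2, padding=1)    # 32x32 -> 16x16
        self.conv2 = nn.Conv2d(64, 128, kernel_size=4, stride=2, padding=1)  # 16x16 -> 8x8
        self.fc_mu = nn.Linear(128 * 8 * 8, latent_dim)
        self.fc_logvar = nn.Linear(128 * 8 * 8, latent_dim)

    def forward(self, x):
        x = torch.relu(self.conv1(x))
        x = torch.relu(self.conv2(x))
        x = x.flatten(start_dim=1)
        return self.fc_mu(x), self.fc_logvar(x)


class Decoder(nn.Module):
    def __init__(self, latent_dim=10):
        super().__init__()
        self.fc = nn.Linear(latent_dim, 128 * 8 * 8)
        self.deconv1 = nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1)  # 8x8 -> 16x16
        self.deconv2 = nn.ConvTranspose2d(64, 3, kernel_size=4, stride=2, padding=1)    # 16x16 -> 32x32

    def forward(self, z):
        x = torch.relu(self.fc(z)).view(-1, 128, 8, 8)
        x = torch.relu(self.deconv1(x))
        return torch.sigmoid(self.deconv2(x))  # assumed: pixel values in [0, 1]


class VAE(nn.Module):
    def __init__(self, latent_dim=10):
        super().__init__()
        self.encoder = Encoder(latent_dim)
        self.decoder = Decoder(latent_dim)

    def forward(self, x):
        mu, logvar = self.encoder(x)
        # Reparameterisation: sample z = mu + std * eps with eps ~ N(0, I).
        std = torch.exp(0.5 * logvar)
        z = mu + std * torch.randn_like(std)
        return self.decoder(z), mu, logvar
```

With these layer sizes, `sum(p.numel() for p in VAE().parameters())` gives the same total as the summary.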
Original image | Reconstruction |
---|---|
![]() | ![]() |
----------------------------------------------------------------
Layer (type) Output Shape Param #
================================================================
Conv2d-1 [1, 32, 16, 16] 1,568
MaxPool2d-2 [1, 32, 8, 8] 0
Conv2d-3 [1, 16, 4, 4] 8,208
Conv2d-4 [1, 8, 2, 2] 2,056
Encoder-5 [1, 8, 2, 2] 0
ConvTranspose2d-6 [1, 16, 4, 4] 2,064
Upsample-7 [1, 16, 8, 8] 0
ConvTranspose2d-8 [1, 32, 16, 16] 8,224
ConvTranspose2d-9 [1, 3, 32, 32] 1,539
Decoder-10 [1, 3, 32, 32] 0
================================================================
Total params: 23,659
Trainable params: 23,659
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.01
Forward/backward pass size (MB): 0.20
Params size (MB): 0.09
Estimated Total Size (MB): 0.30
----------------------------------------------------------------
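A convolutional autoencoder consistent with the summary above can be sketched in PyTorch as follows. The channel counts and layer order come from the summary; kernel size 4, stride 2, and padding 1 are inferred from the parameter counts and shapes, and the activations, the nearest-neighbour `Upsample`, and the `denoising_step` helper are assumptions for illustration.

```python
import torch
import torch.nn.functional as F
from torch import nn


class ConvEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 32, kernel_size=4, stride=2, padding=1)   # 32 -> 16
        self.pool = nn.MaxPool2d(2)                                         # 16 -> 8
        self.conv2 = nn.Conv2d(32, 16, kernel_size=4, stride=2, padding=1)  # 8 -> 4
        self.conv3 = nn.Conv2d(16, 8, kernel_size=4, stride=2, padding=1)   # 4 -> 2

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = F.relu(self.conv2(x))
        return F.relu(self.conv3(x))


class ConvDecoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.deconv1 = nn.ConvTranspose2d(8, 16, kernel_size=4, stride=2, padding=1)   # 2 -> 4
        self.up = nn.Upsample(scale_factor=2)                                          # 4 -> 8
        self.deconv2 = nn.ConvTranspose2d(16, 32, kernel_size=4, stride=2, padding=1)  # 8 -> 16
        self.deconv3 = nn.ConvTranspose2d(32, 3, kernel_size=4, stride=2, padding=1)   # 16 -> 32

    def forward(self, z):
        x = self.up(F.relu(self.deconv1(z)))
        x = F.relu(self.deconv2(x))
        return torch.sigmoid(self.deconv3(x))  # assumed: pixel values in [0, 1]


class ConvAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = ConvEncoder()
        self.decoder = ConvDecoder()

    def forward(self, x):
        return self.decoder(self.encoder(x))


def denoising_step(model, optimizer, clean, sigma=0.1):
    """One training step: corrupt the batch, reconstruct, compare to the clean batch."""
    noisy = (clean + sigma * torch.randn_like(clean)).clamp(0.0, 1.0)
    loss = F.mse_loss(model(noisy), clean)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The key point for denoising is that the loss compares the output with the clean image, not with the noisy input the network actually received.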
Original image | Reconstruction | Image with noise | Reconstruction |
---|---|---|---|
![]() | ![]() | ![]() | ![]() |
----------------------------------------------------------------
Layer (type) Output Shape Param #
================================================================
Conv2d-1 [1, 32, 16, 16] 1,568
MaxPool2d-2 [1, 32, 8, 8] 0
Conv2d-3 [1, 16, 4, 4] 8,208
Conv2d-4 [1, 8, 2, 2] 2,056
Encoder-5 [1, 8, 2, 2] 0
ConvTranspose2d-6 [1, 16, 4, 4] 2,064
Upsample-7 [1, 16, 8, 8] 0
ConvTranspose2d-8 [1, 32, 16, 16] 8,224
ConvTranspose2d-9 [1, 3, 32, 32] 1,539
Decoder-10 [1, 3, 32, 32] 0
================================================================
Total params: 23,659
Trainable params: 23,659
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.01
Forward/backward pass size (MB): 0.20
Params size (MB): 0.09
Estimated Total Size (MB): 0.30
----------------------------------------------------------------
Original image | Reconstruction | Image with noise | Reconstruction |
---|---|---|---|
![]() | ![]() | ![]() | ![]() |
Model definition:
----------------------------------------------------------------
Layer (type) Output Shape Param #
================================================================
ConvTranspose2d-1 [1, 10, 180, 180] 280
ConvTranspose2d-2 [1, 20, 359, 359] 1,820
ConvTranspose2d-3 [1, 3, 360, 360] 963
================================================================
Total params: 3,063
Trainable params: 3,063
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.37
Forward/backward pass size (MB): 25.10
Params size (MB): 0.01
Estimated Total Size (MB): 25.49
----------------------------------------------------------------
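The shapes and parameter counts in this summary can be checked by hand. The sketch below applies the standard transposed-convolution output-size formula (dilation 1) and the usual weight-plus-bias count. The 180x180 input size is inferred from the reported input size in MB, and the kernel, stride, and padding values are one combination consistent with the table, not values confirmed by the model code.

```python
def conv_transpose_out(size, kernel, stride=1, padding=0, output_padding=0):
    """Output length of a transposed convolution along one axis (dilation 1)."""
    return (size - 1) * stride - 2 * padding + (kernel - 1) + output_padding + 1


def conv_params(in_ch, out_ch, kernel, bias=True):
    """Weights of a square-kernel (transposed) convolution, plus optional bias."""
    return in_ch * out_ch * kernel * kernel + (out_ch if bias else 0)


def trace(input_size, layers):
    """Return the spatial size after each layer and the total parameter count."""
    sizes, total, size = [], 0, input_size
    for in_ch, out_ch, kernel, stride, padding in layers:
        size = conv_transpose_out(size, kernel, stride, padding)
        sizes.append(size)
        total += conv_params(in_ch, out_ch, kernel)
    return sizes, total


# (in_channels, out_channels, kernel, stride, padding), inferred from the summary
LAYERS = [
    (3, 10, 3, 1, 1),   # 180 -> 180, 280 params
    (10, 20, 3, 2, 1),  # 180 -> 359, 1,820 params
    (20, 3, 4, 1, 1),   # 359 -> 360, 963 params
]

if __name__ == "__main__":
    print(trace(180, LAYERS))
```

The stride-2 layer alone lands on 359 rather than 360, which would explain why the last layer uses a 4x4 kernel: it adds the one missing pixel.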
Model definition:
==========================================================================================
Layer (type:depth-idx) Output Shape Param #
==========================================================================================
Model -- --
├─UpsamplingBilinear2d: 1-1 [5, 3, 360, 360] --
├─Conv2d: 1-2 [5, 64, 360, 360] 1,792
├─ResidualBlock: 1-3 [5, 64, 360, 360] --
│ └─ConvolutionalBlock: 2-1 [5, 64, 360, 360] --
│ │ └─Sequential: 3-1 [5, 64, 360, 360] 73,986
│ └─ConvolutionalBlock: 2-2 [5, 64, 360, 360] --
│ │ └─Sequential: 3-2 [5, 64, 360, 360] 73,984
├─ResidualBlock: 1-4 [5, 64, 360, 360] --
│ └─ConvolutionalBlock: 2-3 [5, 64, 360, 360] --
│ │ └─Sequential: 3-3 [5, 64, 360, 360] 73,986
│ └─ConvolutionalBlock: 2-4 [5, 64, 360, 360] --
│ │ └─Sequential: 3-4 [5, 64, 360, 360] 73,984
├─Conv2d: 1-5 [5, 3, 360, 360] 1,731
==========================================================================================
Total params: 299,463
Trainable params: 299,463
Non-trainable params: 0
Total mult-adds (G): 193.72
==========================================================================================
Input size (MB): 0.49
Forward/backward pass size (MB): 5655.74
Params size (MB): 1.20
Estimated Total Size (MB): 5657.43
==========================================================================================