A PyTorch implementation of "Learning to See in the Dark" [1], a model based on a U-Net architecture [2].
We obtain results comparable to the original TensorFlow model (PSNR 28.39, SSIM 0.784) on the Sony dataset.
Saved model parameters can be downloaded here.
Modifications that were tried and did not improve our results (not published in this repo):
- Using a VGG16 feature loss (a.k.a. perceptual loss) instead of the L1 loss (inspired by [3] and [4]; see the sketch after this list)
- Replacing the first layer with three 3×3 convolutional layers that progressively increase the number of channels (PSNR = 28.45, inspired by [5]; see the sketch after this list)
- Building another U-Net model that uses a pretrained ResNet34 as a backbone (PSNR = 21.7) [6]
- Adding short skip connections (no loss improvement, but potentially faster training) [5]
- Combining our fully trained baseline U-Net with a "dynamic U-Net" [6] to form a W-shaped model (very poor results)
- Replacing the transpose convolutions with other upsampling methods (see the sketch after this list)
- Using minimally processed 3-channel inputs (bilinear interpolation) instead of the original 4-channel packed raw input (see the packing sketch after this list)
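
For reference, a minimal PyTorch sketch of the VGG16 feature (perceptual) loss mentioned above. The layer cut-off (relu4_3) and the absence of ImageNet normalization are illustrative assumptions, not the exact configuration we trained with.

```python
import torch.nn as nn
from torchvision import models


class VGGFeatureLoss(nn.Module):
    """Compare VGG16 activations of the prediction and the ground truth."""

    def __init__(self, last_layer=22):  # index 22 = relu4_3 in vgg16.features (assumed cut-off)
        super().__init__()
        vgg = models.vgg16(pretrained=True).features[: last_layer + 1]
        for p in vgg.parameters():
            p.requires_grad = False  # the loss network is frozen
        self.vgg = vgg.eval()
        self.l1 = nn.L1Loss()

    def forward(self, prediction, target):
        # L1 distance between feature maps instead of between pixels.
        return self.l1(self.vgg(prediction), self.vgg(target))
```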
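The "three 3×3 convolutions" stem (the ResNet-C style trick from [5]) might look like the sketch below. The channel progression 4 → 8 → 16 → 32 and the LeakyReLU slope are placeholders; the baseline first layer maps the 4-channel packed raw input directly to 32 channels.

```python
import torch.nn as nn

# Replaces the single 4 -> 32 first convolution of the baseline U-Net.
stem = nn.Sequential(
    nn.Conv2d(4, 8, kernel_size=3, padding=1),
    nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(8, 16, kernel_size=3, padding=1),
    nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.LeakyReLU(0.2, inplace=True),
)
```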
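One of the upsampling variants referred to above is fixed interpolation followed by a convolution, sketched here against the transpose-convolution baseline. The channel counts (256 → 128) are placeholders.

```python
import torch.nn as nn

# Baseline decoder block: learned upsampling with a transpose convolution.
up_transpose = nn.ConvTranspose2d(256, 128, kernel_size=2, stride=2)

# Alternative: bilinear upsampling followed by a 3x3 convolution.
up_interp = nn.Sequential(
    nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
    nn.Conv2d(256, 128, kernel_size=3, padding=1),
)
```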
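For context on the last item, the baseline 4-channel input follows the packing scheme of the original paper [1]: the Bayer mosaic is split into four half-resolution planes after black-level subtraction. A rough sketch, assuming the Sony sensor's black level of 512 and 14-bit white level of 16383:

```python
import numpy as np


def pack_raw(bayer):
    """Pack a (H, W) Bayer array into a (H/2, W/2, 4) float array in [0, 1]."""
    # Subtract the black level (512) and normalize by the usable range (assumed sensor values).
    im = np.maximum(bayer.astype(np.float32) - 512, 0) / (16383 - 512)
    # Split the RGGB mosaic into four half-resolution colour planes.
    return np.stack(
        (im[0::2, 0::2],   # R
         im[0::2, 1::2],   # G
         im[1::2, 1::2],   # B
         im[1::2, 0::2]),  # G
        axis=-1,
    )
```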
[1] Learning to See in the Dark, CVPR 2018, by Chen Chen, Qifeng Chen, Jia Xu, and Vladlen Koltun. arXiv, github, Project Website
[2] U-Net: Convolutional Networks for Biomedical Image Segmentation, by Olaf Ronneberger, Philipp Fischer, and Thomas Brox
[3] Perceptual Losses for Real-Time Style Transfer and Super-Resolution, by Justin Johnson, Alexandre Alahi, and Li Fei-Fei
[4] fastai implementation of feature loss
[5] Bag of Tricks for Image Classification with Convolutional Neural Networks, by T. He, Z. Zhang, H. Zhang, Z. Zhang, J. Xie, and M. Li.