Switch from standard Convolution to Deformable Convolution #171
Merged
Experimenting with using Deformable Convolutional layers in our DeepBedMap input block and final output layers! This has huge potential for improving the realism of our predicted grid, as the kernel filters now have more flexibility to sample different spatial locations rather than being fixed to a regular grid. Hyperparameters seem rather out of tune, with this setup's best performance (RMSE_test) currently at 295.07, see https://www.comet.ml/weiji14/deepbedmap/8ff4fea7fb2e48268b8c5ff1de9068dc. Not so great, but the plots (especially if you look at them in 3D) seem to capture the streaklines better and show fewer pixelated artifacts. The next tuning stage should probably increase the learning rate (to >2e-4) and/or the number of epochs (>100??).
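For anyone following along, the change is conceptually a drop-in layer swap. Here is a minimal Chainer sketch (not the actual DeepBedMap code; the channel and kernel sizes are illustrative assumptions) of replacing a standard `Convolution2D` with a `DeformableConvolution2D`:

```python
# Minimal sketch, assuming a 1-channel DEM input and 64 output channels.
import chainer
import chainer.functions as F
import chainer.links as L


class InputBlock(chainer.Chain):
    def __init__(self, in_channels: int = 1, out_channels: int = 64):
        super().__init__()
        with self.init_scope():
            # Before: fixed sampling grid
            # self.conv = L.Convolution2D(in_channels, out_channels, ksize=3, pad=1)
            # After: a learned offset field lets the kernel sample off-grid locations
            self.conv = L.DeformableConvolution2D(
                in_channels, out_channels, 3, stride=1, pad=1
            )

    def forward(self, x):
        # x is an NCHW float32 array/Variable; output shape matches the plain conv
        return F.leaky_relu(self.conv(x))
```

The deformable layer learns an offset field alongside the kernel weights, which is what gives it the extra freedom to sample away from the regular grid.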
Continuing on from 6f4a6e5, we're also changing the pre_residual, post_residual and post_upsampling Convolutional layers to Deformable ones. Adjusted the learning rate range to be higher: previously tuned from 1e-4 to 2e-4, now from 2e-4 to 4e-4 (note that ESRGAN uses 2e-4 and EDVR uses 4e-4); num_residual_blocks is fixed at 12. Also dropping the intermediate_values column when reporting the top ten best values in the last cell of srgan_train.ipynb. Achieved an RMSE_test of 216.67 at https://www.comet.ml/weiji14/deepbedmap/b5a3f17d2c1a4fb18c73893bb80986ff with this setup, somewhat better than before. Will consider changing all our Residual-in-Residual Dense Block layers to use Deformable Conv2D next, and look into using a Structural Similarity (SSIM) loss (perhaps swapping out the topographic loss for it).
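A rough sketch of the tuning setup described above, assuming Optuna; `run_training()` is a stand-in for the real GAN training loop, and the parameter names are illustrative rather than the exact ones in srgan_train.ipynb:

```python
import optuna


def run_training(learning_rate: float, num_residual_blocks: int) -> float:
    """Placeholder for the actual training loop; should return RMSE_test."""
    return 300.0  # dummy value so the sketch runs end to end


def objective(trial: optuna.trial.Trial) -> float:
    # Learning rate now tuned in the 2e-4 to 4e-4 range
    # (ESRGAN uses 2e-4, EDVR uses 4e-4); num_residual_blocks fixed at 12.
    learning_rate = trial.suggest_uniform("learning_rate", 2e-4, 4e-4)
    num_residual_blocks = 12
    return run_training(learning_rate, num_residual_blocks)


study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=30)
# The notebook's last cell then reports the top ten trials from
# study.trials_dataframe(), with the verbose intermediate_values column dropped.
```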
Force-pushed from 2249667 to 88a364c
Differentiable structural similarity index that works on Chainer! Repository at https://github.com/higumachan/ssim-chainer.
Incorporating a Structural Similarity (SSIM) Index based loss function into the Generator Network loss function of our adapted ESRGAN. Currently set with a weighting of 1e-2, matching the L1 content loss weighting. Properly creating a unit-tested function that wraps around the differentiable [SSIM](https://github.com/higumachan/ssim-chainer) chainer module, in case things change down the line. Also flipped the y_true/y_pred kwarg positioning in psnr() to ease my OCD, and correctly renamed d_train_loss to d_dev_loss (not major).
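For context, a differentiable SSIM can also be built directly from chainer.functions. The sketch below is only illustrative (it uses a uniform window via average pooling rather than the usual Gaussian one, and the `dynamic_range` constant would need choosing sensibly for elevation values in metres); it is not the ssim-chainer API nor the exact wrapper used here:

```python
import chainer.functions as F


def ssim_loss(y_pred, y_true, window: int = 11, dynamic_range: float = 1.0):
    """Return 1 - mean(SSIM), so minimising the loss maximises structural similarity."""
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2

    # Local means via average pooling (uniform window approximation)
    mu_p = F.average_pooling_2d(y_pred, ksize=window, stride=1)
    mu_t = F.average_pooling_2d(y_true, ksize=window, stride=1)

    # Local variances and covariance
    var_p = F.average_pooling_2d(y_pred * y_pred, ksize=window, stride=1) - mu_p * mu_p
    var_t = F.average_pooling_2d(y_true * y_true, ksize=window, stride=1) - mu_t * mu_t
    cov = F.average_pooling_2d(y_pred * y_true, ksize=window, stride=1) - mu_p * mu_t

    ssim_map = ((2 * mu_p * mu_t + c1) * (2 * cov + c2)) / (
        (mu_p * mu_p + mu_t * mu_t + c1) * (var_p + var_t + c2)
    )
    return 1.0 - F.mean(ssim_map)
```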
For better structural reconstruction of our DEM, we revise our SSIM Loss weighting from 1e-2 up to 5.25e-2. This is based on [Zhao et al. 2017](https://doi.org/10.1109/TCI.2016.2644865), which empirically weighted the MS-SSIM loss at 0.84 and the L1 loss at 0.16 (1 - 0.84), a ratio of 5.25. Yes, it's only an empirical setting, but I'm too lazy to tune those weightings (though someone probably should in the future). The current best SSIM score we get is ~0.25, which is a ways off from a perfect 1.00, so setting a higher structural weighting should encourage our model to produce images that are more structurally similar to the groundtruth. Even though an RMSE_test of 1655.87 isn't so great, nor is the actual SSIM score of 0.1885, a qualitative 3D evaluation of the result at https://www.comet.ml/weiji14/deepbedmap/88b073324a644fd695aecf47109dd2bc does show pretty nice terrain. Tempted to use SSIM as 'the' tuning metric instead of RMSE_test now, but we'll see.
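Writing out the arithmetic behind that 5.25 factor:

```latex
% Mixed loss weighting from Zhao et al. 2017, as referenced above
\mathcal{L}_{\mathrm{mix}} = \alpha\,\mathcal{L}_{\text{MS-SSIM}} + (1-\alpha)\,\mathcal{L}_{1},
\qquad \alpha = 0.84
% ratio: \alpha / (1-\alpha) = 0.84 / 0.16 = 5.25
% hence 5.25 \times 10^{-2} = 5.25e-2, relative to our L1 content weighting of 1e-2
```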
Closes #172 Add Structural Similarity Loss/Metric.
Adjust our Content, Adversarial, Topographic and Structural Loss weightings to be on a more equal footing, with priority given to better SSIM scores (patching #172). The Content and Topographic Losses (~35) were overpowering the Adversarial and Structural Losses (~0.1-10) by an order of magnitude (i.e. there wasn't really any adversarial impact or structural improvement)! The loss weighting changes are as follows:

| Loss | Content | Adversarial | Topographic | Structural |
|------|---------|-------------|-------------|------------|
| Old  | 1e-2    | 5e-3        | 5e-3        | 5.25e-3    |
| New  | 1e-2    | 2e-2        | 2e-3        | 5.25e-0    |

This is all down to our domain-specific (DEM generation) task. Ideally we would scale our images to lie in the 0-1 range like those out in the computer vision world, which could easily be done by converting metres to kilometres (dividing by 1000). The workaround instead is to scale down the content and topographic losses relative to the adversarial and structural losses. Also, because we were recording too much metric information, resulting in I/O errors when writing to the sqlite database, I've made code changes so that our 2 Tesla V100 GPUs and 2 Tesla P100 GPUs write their training results to separate databases named after the hostname. The best tuned score had an RMSE_test of 215.60 and an SSIM of 0.6195 at https://www.comet.ml/weiji14/deepbedmap/699ecfa6f14448c09cf0e450edf64f30. The results aren't so good when run on the whole Pine Island Glacier area (2007tx, 2010tr, istarxx), but they're not as bad as the others, and you can really see the realistic DEM textures now!
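In code terms, the new weightings amount to something like the sketch below (the variable names are mine and the four per-batch loss terms are assumed to be already computed as Chainer variables; this is not the exact DeepBedMap implementation):

```python
def generator_total_loss(content_loss, adversarial_loss, topographic_loss, structural_loss):
    """Weighted sum of the Generator's loss terms, using the new weightings above."""
    return (
        1e-2 * content_loss          # L1 content loss      (unchanged at 1e-2)
        + 2e-2 * adversarial_loss    # adversarial loss     (was 5e-3)
        + 2e-3 * topographic_loss    # topographic loss     (was 5e-3)
        + 5.25 * structural_loss     # structural/SSIM loss, i.e. 5.25e-0 (was 5.25e-3)
    )
```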
Tone down our use of Deformable Convolutional layers, applying them only at our final two convolutional layers rather than in the other places as was done in 6f4a6e5 and 88a364c. After re-reading [Dai et al. 2017](http://arxiv.org/abs/1703.06211)'s paper, it turns out that they only applied deformable convolution at the final three ResNet blocks (a confusing use of the word 'top' layers), rather than at the input blocks, for reasons I don't quite fully understand. Anyway, our network does seem to perform better now, generating bed elevation models closer to BEDMAP2, though of course with some bumpy terrain! Best RMSE_test result of 66.70, with an SSIM score of 0.7625, at https://www.comet.ml/weiji14/deepbedmap/72e783d7b96d4ef5ac39cc00b808198f! The RMSE for the full Pine Island Glacier area is 214.03, which is not ideal, but the (deformable conv) model is starting to capture the topography a lot better, phew! Tweaking our default residual scaling to 0.1, as that seems to be where things are at, and reinstating our MedianPruner to use n_warmup_steps=15 as in f52be2d. Training is still a bit fiddly, with occasional shingle-like artifacts, but when we get lucky the results can turn out well!
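For completeness, the pruner setting mentioned above looks roughly like this in Optuna (other create_study arguments such as the sampler and per-hostname sqlite storage are left out here):

```python
import optuna

study = optuna.create_study(
    direction="minimize",  # tuning on RMSE_test, so lower is better
    pruner=optuna.pruners.MedianPruner(n_warmup_steps=15),  # no pruning before step 15
)
```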
Force-pushed from 9472d7a to 12b8eef
weiji14 added a commit that referenced this pull request on Sep 14, 2019
Closes #171 Switch from standard Convolution to Deformable Convolution.
weiji14 added a commit that referenced this pull request on Apr 15, 2020
Lots of edits to bring the paper up to speed on what the DeepBedMap v0.9.4 model's training details, architecture and perceptual loss function look like. Updated the training details to reference quad-GPU training and the new hyperparameters. Mentioned how our Generator model architecture now uses Deformable Convolution in the last 2 layers (#171), and removed the mention that we used a residual scaling factor of 0.1 and PixelShuffle upsampling layers (#160). Added proper equations for the topographic loss (added in d599ee8) and structural loss (#172) functions we've introduced, with some tweaks to the Loss Functions subsection as well. Also made sure to cite Chainer and Optuna properly using their KDD19 papers, plus some others of course.
Towards improving the realism of our predicted DeepBedMap grids by using Deformable Convolutional layers! Hopefully these will better capture the streaklines and show fewer pixelated artifacts.
Inspired by the Winning Solution in NTIRE19 Challenges on Video Restoration and Enhancement (CVPR19 Workshops) - Video Restoration with Enhanced Deformable Convolutional Networks https://xinntao.github.io/projects/EDVR
References: