Exploding gradient problem #47

Open
9-Time opened this issue Apr 21, 2019 · 5 comments

Comments

@9-Time

9-Time commented Apr 21, 2019

I tried training on the Open Images dataset, but I kept getting exploding gradients with the bounding box after 1 epoch.
[Image: photo_2019-04-22_01-54-36]

Any ideas on how to fix this?

@qfgaohao
Owner

@scarmaten have you tried smaller learning rates?

@9-Time
Author

9-Time commented Apr 23, 2019

Yup, I have tried learning rates all the way down to 1e-20. Still no luck.
The problem was partially solved when I removed Expand(mean), RandomSampleCrop(), and RandomMirror() from trainAugmentation in transformation.py, but that means the model's accuracy is much worse.

@qfgaohao
Owner

Hi @scarmaten, one possible way to debug is to use a very small, verified dataset as both the train and val data and train an overfitted model. If you can get an overfitted model with 100% (or almost 100%) accuracy, the architecture and system are fine, and the problem might be the data. Otherwise, there is something wrong in the system design.
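
The overfitting check described above could look roughly like the sketch below. It is a minimal illustration only: the tiny random dataset, the two-layer classifier, and the hyperparameters are all hypothetical stand-ins, not the SSD model or data loader from this repository. The idea is simply to train and evaluate on the same handful of samples and confirm that the loss stays finite and the training accuracy approaches 100%.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for a "very small, verified dataset":
# 8 samples with 16 features each and 2 classes.
inputs = torch.randn(8, 16)
labels = torch.tensor([0, 1, 0, 1, 0, 1, 0, 1])

# Hypothetical stand-in model; in practice this would be the real network.
model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 2))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
criterion = nn.CrossEntropyLoss()

for step in range(300):
    optimizer.zero_grad()
    loss = criterion(model(inputs), labels)
    # Stop early if the loss is already NaN or infinite.
    if not torch.isfinite(loss):
        raise RuntimeError(f"non-finite loss at step {step}")
    loss.backward()
    optimizer.step()

# Evaluate on the same samples that were used for training.
with torch.no_grad():
    predictions = model(inputs).argmax(dim=1)
    train_accuracy = (predictions == labels).float().mean().item()

print(f"train accuracy on the tiny set: {train_accuracy:.2f}")
```

If the real model cannot reach near-perfect accuracy on a few verified samples, that points at the architecture or training code rather than the dataset.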

@tamyiuchau

I also ran into the exploding gradient problem when trying to use MobileNet V2 SSD on WIDER FACE. I was able to bisect the problem to the smooth L1 loss at

smooth_l1_loss = F.smooth_l1_loss(predicted_locations, gt_locations, size_average=False)

I suspect it is caused by no default boxes being matched. After changing it to torch.nn.functional.smooth_l1_loss(reduction='sum'), the problem was solved.
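
For context, `size_average=False` is the deprecated spelling of `reduction='sum'` in PyTorch, so the two calls should compute the same value. The sketch below therefore looks at the suspected cause, a batch with no matched default boxes. It assumes the summed loss is later divided by the number of matched boxes; that normalization step is not shown in this thread, and the function name `regression_loss` and the `pos_mask` argument are hypothetical. Clamping the divisor to at least 1 keeps the result finite when nothing matches.

```python
import torch
import torch.nn.functional as F


def regression_loss(predicted_locations, gt_locations, pos_mask):
    """Smooth L1 loss over matched priors, normalized by the match count.

    predicted_locations, gt_locations: (batch, num_priors, 4)
    pos_mask: boolean (batch, num_priors), True where a prior is matched.
    """
    pred = predicted_locations[pos_mask]  # (num_pos, 4)
    gt = gt_locations[pos_mask]           # (num_pos, 4)
    loss = F.smooth_l1_loss(pred, gt, reduction="sum")
    # Assumption: the loss is normalized by the number of matched priors.
    # Clamp to 1 so an unmatched batch yields 0 instead of dividing by zero.
    num_pos = max(gt.size(0), 1)
    return loss / num_pos


batch, num_priors = 2, 6
predicted = torch.randn(batch, num_priors, 4)
targets = torch.randn(batch, num_priors, 4)

# A batch in which no prior is matched to any ground-truth box.
no_matches = torch.zeros(batch, num_priors, dtype=torch.bool)
empty_loss = regression_loss(predicted, targets, no_matches)
print(torch.isfinite(empty_loss).item())
```

Logging how often `pos_mask` is entirely False during training would show whether this case actually occurs on a given dataset.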

@tamyiuchau

The question then arises: should I adjust the prior boxes for my specific purpose, or just let it go? (It is kind of working now.)
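
One way to inform that decision is to measure how well the current prior boxes cover the ground-truth boxes. The sketch below is a rough diagnostic under stated assumptions: boxes are in corner form (x1, y1, x2, y2), the 0.5 IoU threshold is an arbitrary choice, and the helper names `iou_matrix` and `unmatched_fraction` are hypothetical rather than functions from this repository. It reports the fraction of ground-truth boxes whose best-overlapping prior falls below the threshold.

```python
import torch


def iou_matrix(boxes_a, boxes_b):
    """Pairwise IoU for corner-form boxes; shapes (N, 4) and (M, 4) -> (N, M)."""
    top_left = torch.max(boxes_a[:, None, :2], boxes_b[None, :, :2])
    bottom_right = torch.min(boxes_a[:, None, 2:], boxes_b[None, :, 2:])
    wh = (bottom_right - top_left).clamp(min=0)
    inter = wh[..., 0] * wh[..., 1]
    area_a = (boxes_a[:, 2] - boxes_a[:, 0]) * (boxes_a[:, 3] - boxes_a[:, 1])
    area_b = (boxes_b[:, 2] - boxes_b[:, 0]) * (boxes_b[:, 3] - boxes_b[:, 1])
    union = area_a[:, None] + area_b[None, :] - inter
    return inter / union.clamp(min=1e-9)


def unmatched_fraction(gt_boxes, priors, iou_threshold=0.5):
    """Fraction of ground-truth boxes whose best prior IoU is below the threshold."""
    best_iou = iou_matrix(gt_boxes, priors).max(dim=1).values
    return (best_iou < iou_threshold).float().mean().item()


# Hypothetical priors: two large boxes covering opposite image quadrants.
priors = torch.tensor([[0.0, 0.0, 0.5, 0.5],
                       [0.5, 0.5, 1.0, 1.0]])
# Hypothetical ground truth: one large box and one very small box.
gt_boxes = torch.tensor([[0.0, 0.0, 0.5, 0.5],
                         [0.40, 0.40, 0.45, 0.45]])

print(unmatched_fraction(gt_boxes, priors))
```

A high fraction on a dataset with many small objects would suggest the priors are a poor fit for it; a low fraction would suggest leaving them alone is reasonable.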
