Training Issue #50 (Open)

JoeHEZHAO opened this issue Oct 30, 2017 · 5 comments

Comments
JoeHEZHAO commented Oct 30, 2017

Hello Lin,

I have a couple of questions about training the network with the data generated by gen_pnet_data.py.

I noticed that data/mtcnn/imglists/train_12.txt mixes positive and negative images together with their ground truth, and I am wondering how you deal with the bounding-box ground truth of negative examples, which is just 0.

For example, if a regression result is [0.1, 0.2, 0.3, 0.4] and the negative bbox ground truth is [0], should I make it [0, 0, 0, 0]? Or should I train the bbox regression only on positive and part data? Since we are training classification and bbox regression at the same time, though, I am guessing we should train on all the data at once, right?

Best,
HZ

Seanlinx (Owner) commented:

@JoeHEZHAO
For the bbox regression task, the gradients of negative examples will be set to 0 in the backward pass,
https://github.com/Seanlinx/mtcnn/blob/master/core/negativemining.py#L53
so their ground truth can be any value you like. It is set to [0, 0, 0, 0] in this code:
https://github.com/Seanlinx/mtcnn/blob/master/core/imdb.py#L127
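
For illustration, a minimal NumPy sketch of that masking idea (not the actual negativemining.py code; names are mine):

```python
import numpy as np

def mask_bbox_grad(bbox_grad, label):
    """bbox_grad: (batch, 4) raw bbox-regression gradients.
    label: (batch,) ints, 1 = positive, 0 = negative, -1 = part face."""
    # Negatives (label 0) contribute nothing to bbox regression, so their
    # gradient rows are zeroed; their ground-truth values never matter.
    valid = (label != 0).astype(bbox_grad.dtype)   # (batch,)
    return bbox_grad * valid[:, None]              # broadcast over the 4 coords
```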

JoeHEZHAO (Author) commented:

@Seanlinx Thanks so much for replying. Please allow me to rephrase your words, just to make sure I understand correctly.

For classification, we use labels 0 and 1 to find the valid indices and calculate the loss, and for bbox regression, we use labels -1 and 1 to get the valid indices and the loss. Is that correct?

According to the paper, the total loss would be cls_loss + 0.5 * bbox_loss?
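
For concreteness, here is the index selection as I understand it, in a small NumPy sketch (illustrative only, not code from this repo):

```python
import numpy as np

label = np.array([1, 0, -1, 1, 0])         # 1 = pos, 0 = neg, -1 = part

cls_idx  = np.where(label != -1)[0]        # classification: labels 0 and 1
bbox_idx = np.where(label != 0)[0]         # bbox regression: labels -1 and 1

print(cls_idx)    # [0 1 3 4]
print(bbox_idx)   # [0 2 3]
```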

Seanlinx (Owner) commented Nov 1, 2017

@JoeHEZHAO Yes, you're right.
I didn't follow the loss ratio given in the paper, since my implementation doesn't include the landmark task. I simply assign equal importance to the cls and bbox tasks, so the total is cls_loss + bbox_loss.
You can alter the value of grad_scale in the 'bbox_pred' layer to change the weight of the bbox task.
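
In other words, something like this sketch (illustrative only; bbox_weight here stands in for grad_scale on 'bbox_pred'):

```python
def total_loss(cls_loss, bbox_loss, bbox_weight=1.0):
    # Equal weighting here (bbox_weight = 1.0); the paper's P-Net setting
    # would correspond to bbox_weight = 0.5.
    return cls_loss + bbox_weight * bbox_loss
```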

JoeHEZHAO (Author) commented:

@Seanlinx Thanks so much for your help, but another problem just came up.
When I train the P-Net, the loss for bounding-box regression explodes (up to 100000 or so) unless I add BatchNormalization after the PReLU layers. May I ask how you calculate the loss for bounding-box regression? I am using the MSE between target_bbx (shape [batch, 4]) and pred_bbx (shape [batch, 4]).

JoeHEZHAO (Author) commented:

Sorry for the bother. The previous question has been solved by normalizing the input images with ImageNet's parameters (mean = (0.485, 0.456, 0.406), std = (0.229, 0.224, 0.225)). I get a reasonable loss now. It seems ConvNets work better on data in the (0, 1) range.
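
For reference, a minimal sketch of that normalization (assuming an RGB uint8 image; the function name is mine):

```python
import numpy as np

IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD  = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def normalize(img):
    """img: RGB uint8 array of shape (H, W, 3)."""
    x = img.astype(np.float32) / 255.0           # scale to [0, 1]
    return (x - IMAGENET_MEAN) / IMAGENET_STD    # per-channel standardization
```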

However, when I train classification and regression together, the losses never go down to the 0.22 and 0.015 that your trained code reaches, although when I train them separately I can get losses as low as yours.

May I ask again: are you just adding the two losses together and then doing backpropagation? Is there anything else that should be done when calculating the loss?
