mAP #1

Open
GOATmessi8 opened this issue Mar 4, 2017 · 32 comments

Comments

@GOATmessi8

Have you ever evaluated the converted pretrained model on VOC2007? I tried your code and got 71.9 mAP, while the original reports 76.8. I then found a small error in the test code; fixing it raised the result to 72.8 mAP, which is still not enough...

@longcw
Owner

longcw commented Mar 4, 2017

Yes, I got the same result. Feel free to open a pull request to fix the bug.
I don't know why the mAP of my implementation is low. Did you try the original darknet implementation by the author?

@GOATmessi8
Author

Not yet, but I found an issue in darkflow; it seems the conversion to TensorFlow also causes some difference: https://github.com/thtrieu/darkflow/issues/25
I will make a pull request if I figure out the training part. Maybe that could solve the problem...

@longcw
Owner

longcw commented Mar 7, 2017

I implemented the loss function following darknet, and training works now.
I trained it on the VOC2007 trainval set and got ~50 mAP on the test set (corrected from the 71.86 mAP first reported).
Maybe the darknet source code can help you find other causes of the low mAP.

@terrychenism

@longcw Thank you for sharing the code. I tested the converted darknet model, which got ~72 mAP. Then I trained on the VOC07 trainval set for 160 epochs (using only the code from your GitHub repo), which got only ~50 mAP. Did you successfully train the yolo2 detector?

@longcw
Owner

longcw commented Mar 8, 2017

Thank you for your comment.
I tested the trained model and got the same result, ~50 mAP. There are still some bugs in the training code. Sorry about this.

@crazylyf
Contributor

crazylyf commented Mar 14, 2017

For the test phase, two parameters are inconsistent with the original darknet:

  • The thresh parameter for bbox filtering is 0.001 in darknet, while it is 0.01 in test.py;
  • The iou_thresh for NMS is 0.5 in darknet, while it is 0.3 in this project.

For the training phase, the thresh in cfgs/config.py should be 0.24 instead of 0.3.

As @ruinmessi reported, before correcting those parameters the mAP on VOC2007-test is 71.9. Correcting the first parameter improves it slightly to 72.2, and correcting iou_thresh boosts it further to 73.6.
The TensorFlow version of YOLO (darkflow) seems to suffer from the same problem, and an issue in that project pointed out some possible reasons. Maybe those reasons also apply to this project?
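For anyone who wants to check their own checkout against these numbers, here is a minimal sketch that lists the test-time settings side by side. Only the numeric values come from the comment above; the dictionary names, the keys, and the `mismatched` helper are hypothetical and are not part of this project or of darknet.

```python
# Test-time settings as reported in this thread. The names below are
# illustrative assumptions; only the numbers are taken from the comment.
DARKNET_EVAL = {"thresh": 0.001, "iou_thresh": 0.5}
PROJECT_EVAL = {"thresh": 0.01, "iou_thresh": 0.3}


def mismatched(reference, candidate):
    """Return {name: (reference_value, candidate_value)} for settings that differ."""
    return {
        name: (value, candidate.get(name))
        for name, value in reference.items()
        if candidate.get(name) != value
    }


diff = mismatched(DARKNET_EVAL, PROJECT_EVAL)
for name, (ref, cur) in sorted(diff.items()):
    print(f"{name}: darknet={ref}, project={cur}")
```

Keeping the reference values in one place makes it easier to spot a setting that silently drifted from the original.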

@crazylyf
Contributor

crazylyf commented Mar 14, 2017

@ruinmessi Which error in the test code did you fix?

@GOATmessi8
Author

GOATmessi8 commented Mar 15, 2017

@longcw @crazylyf Sorry for the long absence. I boosted the mAP to 74.3 by changing the NMS order like this, whereas this project does the NMS in a function called postprocess. I used exactly the parameters you mentioned.

@crazylyf
Contributor

Why is your mAP 0.7 higher if we are using the same parameters? Am I missing something?

@GOATmessi8
Author

NMS should be applied before thresholding.

@longcw
Owner

longcw commented Mar 15, 2017

@ruinmessi Thank you for pointing out this problem.

@GOATmessi8
Author

GOATmessi8 commented Mar 16, 2017

@longcw I am curious how you converted the original weights to an h5 file. Could you please share some details or scripts?

@longcw
Owner

longcw commented Mar 16, 2017

@ruinmessi I use darkflow to load original weights from the binary weights file.

@rdfong

rdfong commented Apr 14, 2017

Is there any update on the training issue?

@jxgu1016

@ruinmessi Does the order of NMS and thresholding affect the results? I don't think so. Can anyone prove me wrong?
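The question can be tested on a toy case. The sketch below assumes plain greedy NMS on a single class with one score per box; the helper names and the sample detections are made up for illustration, and this is not the project's postprocess function. In that simplified setting a low-scoring box can never suppress a higher-scoring one, so both orders keep the same boxes.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(dets, iou_thresh):
    """Greedy NMS over (box, score) pairs; returns kept detections, best first."""
    kept = []
    for box, score in sorted(dets, key=lambda d: d[1], reverse=True):
        # Keep a box only if it does not overlap any already-kept box too much.
        if all(iou(box, kept_box) <= iou_thresh for kept_box, _ in kept):
            kept.append((box, score))
    return kept


def threshold(dets, thresh):
    """Drop detections whose score is below thresh, preserving order."""
    return [d for d in dets if d[1] >= thresh]


# Hypothetical detections for one class: (box, score).
dets = [
    ((0, 0, 10, 10), 0.90),
    ((1, 1, 11, 11), 0.60),     # overlaps the first box heavily
    ((20, 20, 30, 30), 0.005),  # below the score threshold
    ((21, 21, 31, 31), 0.30),
    ((50, 50, 60, 60), 0.02),
]
thresh, iou_thresh = 0.01, 0.5

nms_first = threshold(nms(dets, iou_thresh), thresh)
thresh_first = nms(threshold(dets, thresh), iou_thresh)
print(nms_first == thresh_first)
```

If the two orders give different mAP in practice, the cause would have to lie in details this sketch does not model, for example how per-class scores are assigned to each box before NMS.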

@rdfong

rdfong commented Apr 28, 2017

Perhaps the weights of the convolutional layers need to be held fixed while training on the VOC datasets?

@rdfong

rdfong commented Apr 28, 2017

In darknet19_448.cfg from the darknet project, batch size is 128, not 16 as it is in the config files here. Unfortunately I do not have the resources to test with a full batch size of 128. With 16 though I can confirm that I only get ~50 mAP. Can someone else try to confirm whether or not changing the batch size makes a difference? It's the only parameter I can find that differs between the two projects.

@cory8249
Contributor

cory8249 commented May 1, 2017

I slightly changed this code (following the original YOLO training procedure), trained for 160 epochs on VOC07+12, and tested on VOC07-test. mAP evaluated at 416 x 416 resolution:
0.6334, batch size 16 (trained by me)
0.6446, batch size 32 (trained by me)

0.7221, batch size 64 (tested directly with the weights provided by @longcw, yolo-voc.weights.h5)
0.768, batch size 64 (claimed by the paper, not trained by me)

Revising this code seems necessary if you want to train with such a large batch size (64).
It needs to work on multiple GPUs (splitting a large batch into smaller ones that fit into a single GPU's memory).

I think something is still mismatched, so the mAP drops considerably.
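One way to split a large batch into pieces that fit in memory on a single device is gradient accumulation. The framework-free sketch below uses a toy one-parameter linear model with made-up data, not this project's training loop. It shows that summing size-weighted micro-batch gradients reproduces the full-batch gradient of a mean loss.

```python
def grad_mse(w, batch):
    """Gradient of mean squared error for y ~ w * x over a batch of (x, y) pairs."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w, batch, micro_batch_size):
    """Full-batch gradient assembled from micro-batches processed one at a time."""
    total = 0.0
    for start in range(0, len(batch), micro_batch_size):
        chunk = batch[start:start + micro_batch_size]
        # Weight each chunk by its size so the sum equals the full-batch mean.
        total += grad_mse(w, chunk) * len(chunk)
    return total / len(batch)


# Toy data: 64 samples, processed 16 at a time (both sizes appear in the thread).
batch = [(float(i), 3.0 * i + 1.0) for i in range(64)]
full = grad_mse(0.5, batch)
accum = accumulated_grad(0.5, batch, micro_batch_size=16)
print(abs(full - accum) <= 1e-9 * max(1.0, abs(full)))
```

The equivalence holds for the gradient only. Layers whose behaviour depends on the batch they see, such as batch normalization, still compute their statistics on each micro-batch, so accumulation may not fully reproduce training with a true batch of 64.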

@JesseYang

I have implemented YOLOv2 in TensorFlow, but I can only achieve an mAP of about 0.60 on VOC07-test (trained on VOC07+12 train+val), with all the tricks in Table 2 of the paper implemented except "hi-res detector". @cory8249 Could you kindly share your code that achieves 0.768 mAP?
Thanks!!

@cory8249
Contributor

cory8249 commented May 6, 2017

@JesseYang Sorry for the confusion: the 0.768 mAP model was not trained by me. I only mentioned it as a reference.

@JesseYang

@cory8249 I see. Thanks!

@cory8249
Contributor

cory8249 commented May 9, 2017

I fixed the IoU bug and trained on VOC0712 trainval.
I get mAP = 0.6825 (still increasing slowly).
https://github.com/cory8249/yolo2-pytorch/blob/master/darknet.py#L120

@JesseYang

@cory8249 Have you fixed another issue when you got the 0.6825 mAP?

@cory8249
Contributor

@JesseYang I think I've fixed these exp() / sig() bugs in my experiment.
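The thread does not say what the exp() / sig() bug was. For reference, here is a minimal sketch of the box parameterization described in the YOLOv2 paper, where a sigmoid bounds the centre offsets and an exponential scales the anchor size. The function and argument names are hypothetical and are not taken from this repository.

```python
import math


def sigmoid(v):
    """Logistic function, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def decode_box(tx, ty, tw, th, cell_x, cell_y, anchor_w, anchor_h):
    """Map raw network outputs to a box (bx, by, bw, bh) in grid-cell units."""
    bx = cell_x + sigmoid(tx)     # centre stays inside the predicting cell
    by = cell_y + sigmoid(ty)
    bw = anchor_w * math.exp(tw)  # width and height rescale the anchor prior
    bh = anchor_h * math.exp(th)
    return bx, by, bw, bh


# Zero raw outputs give the cell centre and the unmodified anchor size.
box = decode_box(0.0, 0.0, 0.0, 0.0, cell_x=6, cell_y=4, anchor_w=1.5, anchor_h=2.0)
print(box)
```

Swapping the two functions, or applying either one twice, changes both the predictions and the loss targets, so it is worth checking that training and test code decode boxes the same way.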

@cory8249
Contributor

I also found something interesting:
ver.A = PyTorch Anaconda prebuilt version (cp36)
ver.B = PyTorch built from source using native Python (python35)
In the training phase, ver.A is 2x slower than ver.B (1 sec/batch vs. 0.5 sec/batch)
In the test phase, ver.A is 1.5x slower than ver.B (16 ms/img vs. 11 ms/img)

Does anyone else have the same problem?

@cory8249
Contributor

I've trained a model with mAP = 0.71 by fixing the bug in #23

@gauss-clb

gauss-clb commented Jan 1, 2018

Has anyone trained YOLOv1 on Pascal VOC (2007+2012 trainval) and surpassed 60% mAP on the 2007 test set?

@xuzijian

xuzijian commented Apr 8, 2018

After modifying the code as mentioned here, my mAP reached 72.1% with 416*416 input.

@wahrheit-git

@xuzijian what mAP do you get with VOC(2007 trainval) after the changes?

@xuzijian

@kk1153 I haven't trained models with only the VOC07 dataset.

@Liu0329

Liu0329 commented Jul 4, 2018

@cory8249 @xuzijian @JesseYang, I used the latest master code on 07+12 trainval with batchsize=32 on PyTorch 0.4 and got mAP=0.663. But when I test yolo-voc.weights.h5, I get mAP=0.677, which is much worse than the mAP=0.722 mentioned above. Did I miss something?
This topic has been discussed for a long time; can anyone provide a good result with a clear repo to follow? Thanks!

@DW1HH

DW1HH commented Sep 25, 2018

@Liu0329 me too
