mAP bug at higher --conf #1466
Comments
A third option would be to extrapolate the curves to zero based on their last known derivatives. I think np.interp has an option for this baked in, which could be used in conjunction with np.clip(0, 1).
Update on this: np.interp does not have built-in extrapolation capability; we would need to move to SciPy for that, so I think I will simply turn back the clock on the code updates introduced in PR #1206.
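For reference, a minimal sketch of the difference on toy data (hypothetical arrays, not the repository's code): np.interp holds the endpoint values constant outside the data range, while SciPy's interp1d with fill_value="extrapolate" extends the last segment's slope, which can then be clipped to [0, 1]:

```python
import numpy as np
from scipy.interpolate import interp1d

# Toy PR-curve points (hypothetical data for illustration)
recall = np.array([0.1, 0.3, 0.5])
precision = np.array([0.9, 0.8, 0.7])

x = np.linspace(0.0, 1.0, 11)

# np.interp: values are held constant (clamped) beyond the first/last data points
clamped = np.interp(x, recall, precision)

# interp1d: linear extrapolation from the last known segment, clipped to [0, 1]
f = interp1d(recall, precision, kind="linear", fill_value="extrapolate")
extrapolated = np.clip(f(x), 0.0, 1.0)

print(clamped[-1])       # 0.7  (held constant past recall = 0.5)
print(extrapolated[-1])  # 0.45 (0.7 extended at slope -0.5 out to recall = 1)
```

Extrapolating and clipping this way would implement the "third option" above without changing the sentinel values themselves.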
@glenn-jocher Hey man, the current yolov5 codebase has this problem again. Can you solve this?
python val.py --weights weights/yolov5s.pt --data data/coco.yaml --verbose --name coco --conf 0.7
val: data=data/coco.yaml, weights=['weights/yolov5s.pt'], batch_size=32, imgsz=640, conf_thres=0.7, iou_thres=0.6, task=val, device=, single_cls=False, augment=False, verbose=True, save_txt=False, save_hybrid=False, save_conf=False, save_json=True, project=runs/val, name=coco, exist_ok=False, half=False
YOLOv5 🚀 gitlab-584-g6b4eb27 torch 1.9.0 CUDA:0 (NVIDIA GeForce RTX 2060, 5934MB)
Fusing layers...
Model Summary: 224 layers, 7266973 parameters, 0 gradients
val: Scanning '11_mscoco/YOLO/val2017.cache' images and labels... 4952 found, 48 missing, 0 empty, 0 corrupted: 100%|██████████| 5000/5000 [00:00<?, ?it/s]
dataset: using NoneType
Class Images Labels P R mAP@.5 mAP@.5:.95: 100%|██████████| 157/157 [01:05<00:00, 2.39it/s]
all 5000 36335 0.902 0.238 0.572 0.461
person 5000 10777 0.962 0.379 0.671 0.541
bicycle 5000 314 0.98 0.153 0.568 0.432
car 5000 1918 0.918 0.276 0.599 0.485
motorcycle 5000 367 0.953 0.166 0.561 0.45
airplane 5000 143 0.984 0.434 0.711 0.621
bus 5000 283 0.95 0.47 0.718 0.627
train 5000 190 0.986 0.363 0.674 0.564
truck 5000 414 0.933 0.0676 0.502 0.395
boat 5000 424 0.93 0.0943 0.514 0.371
traffic light 5000 634 0.932 0.151 0.543 0.373
fire hydrant 5000 101 0.983 0.564 0.778 0.658
stop sign 5000 75 0.956 0.573 0.773 0.714
parking meter 5000 60 1 0.383 0.692 0.553
bench 5000 411 0.932 0.0998 0.519 0.419
bird 5000 427 0.929 0.215 0.576 0.446
cat 5000 202 0.925 0.243 0.582 0.461
dog 5000 218 0.911 0.33 0.629 0.548
horse 5000 272 0.942 0.474 0.706 0.579
sheep 5000 354 0.873 0.37 0.628 0.512
cow 5000 372 0.92 0.341 0.639 0.528
elephant 5000 252 0.877 0.536 0.68 0.551
bear 5000 71 0.921 0.493 0.713 0.632
zebra 5000 266 0.974 0.56 0.773 0.642
giraffe 5000 232 0.973 0.612 0.798 0.678
backpack 5000 371 1 0.0135 0.507 0.325
umbrella 5000 407 0.89 0.179 0.536 0.43
handbag 5000 540 1 0.00741 0.504 0.416
tie 5000 252 0.957 0.179 0.568 0.425
suitcase 5000 299 0.955 0.14 0.547 0.451
frisbee 5000 115 0.922 0.617 0.787 0.639
skis 5000 241 0.952 0.083 0.518 0.39
snowboard 5000 69 0.857 0.087 0.474 0.349
sports ball 5000 260 0.91 0.35 0.634 0.504
kite 5000 327 0.907 0.269 0.588 0.462
baseball bat 5000 145 1 0.159 0.579 0.372
baseball glove 5000 148 0.929 0.351 0.646 0.434
skateboard 5000 179 0.915 0.48 0.711 0.533
surfboard 5000 267 0.94 0.176 0.56 0.424
tennis racket 5000 225 0.948 0.404 0.68 0.467
bottle 5000 1013 0.935 0.155 0.546 0.444
wine glass 5000 341 0.938 0.223 0.58 0.467
cup 5000 895 0.893 0.225 0.56 0.479
fork 5000 215 0.952 0.093 0.523 0.425
knife 5000 325 0.857 0.0185 0.439 0.366
spoon 5000 253 0.75 0.0119 0.381 0.33
bowl 5000 623 0.939 0.172 0.554 0.474
banana 5000 370 0.909 0.0541 0.481 0.386
apple 5000 236 0.727 0.0339 0.377 0.272
sandwich 5000 177 0.778 0.0791 0.427 0.324
orange 5000 285 0.774 0.0842 0.435 0.407
broccoli 5000 312 0.929 0.0417 0.484 0.357
carrot 5000 365 0.833 0.0274 0.432 0.359
hot dog 5000 125 1 0.136 0.568 0.457
pizza 5000 284 0.93 0.327 0.628 0.512
donut 5000 328 0.832 0.241 0.538 0.489
cake 5000 310 0.875 0.113 0.498 0.39
chair 5000 1771 0.933 0.118 0.527 0.43
couch 5000 261 0.914 0.203 0.559 0.475
potted plant 5000 342 0.919 0.0994 0.507 0.371
bed 5000 163 1 0.092 0.546 0.413
dining table 5000 695 1 0.00576 0.503 0.328
toilet 5000 179 0.956 0.48 0.717 0.613
tv 5000 288 0.984 0.441 0.714 0.569
laptop 5000 231 0.929 0.394 0.67 0.582
mouse 5000 106 0.906 0.547 0.74 0.61
remote 5000 283 0.846 0.0777 0.463 0.382
keyboard 5000 153 0.912 0.34 0.639 0.507
cell phone 5000 262 0.847 0.191 0.525 0.431
microwave 5000 55 0.917 0.4 0.665 0.548
oven 5000 143 0.897 0.182 0.544 0.427
toaster 5000 9 0 0 0 0
sink 5000 225 0.895 0.227 0.565 0.46
refrigerator 5000 126 0.973 0.286 0.631 0.531
book 5000 1129 1 0.00266 0.501 0.468
clock 5000 267 0.967 0.547 0.762 0.57
vase 5000 274 0.849 0.226 0.529 0.414
scissors 5000 36 1 0.0833 0.542 0.488
teddy bear 5000 190 0.974 0.2 0.588 0.498
hair drier 5000 11 0 0 0 0
toothbrush 5000 57 1 0.0175 0.509 0.458
Speed: 0.2ms pre-process, 4.7ms inference, 0.5ms NMS per image at shape (32, 3, 640, 640)
Evaluating pycocotools mAP... saving runs/val/coco.03/yolov5s_predictions.json...
loading annotations into memory...
Done (t=0.52s)
creating index...
index created!
Loading and preparing results...
DONE (t=0.03s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=8.40s).
Accumulating evaluation results...
DONE (t=1.68s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.183
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.236
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.208
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.055
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.229
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.269
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.151
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.194
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.194
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.055
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.241
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.290
Results saved to runs/val/coco.03
Process finished with exit code 0
@imyhxy thanks for raising this issue again! I'll add a TODO to investigate. TODO: investigate higher mAP at higher --conf bug in val.py, possibly related to curve extrapolation towards the (0, 1) x,y point.
@imyhxy this is associated with extrapolation of the PR curve in #4563, which brought us into alignment with the Detectron2 and MMDetection mAP computations. Before this PR the curve fell to zero at the last data point (no matter where that was on the x axis), but PR #4563 updated this to connect the last point linearly to (1, 0), i.e. precision falls to zero at recall = 1. Higher confidence thresholds lack data on the right side of the curve, so the extrapolation error is greater:

```diff
- mrec = np.concatenate(([0.], recall, [recall[-1] + 0.01]))
- mpre = np.concatenate(([1.], precision, [0.]))
+ mrec = np.concatenate(([0.0], recall, [1.0]))
+ mpre = np.concatenate(([1.0], precision, [0.0]))
```

--conf 0.001:

!python val.py --weights yolov5m.pt --data coco.yaml --img 640 --iou 0.65 --half --conf 0.001
val: data=/content/yolov5/data/coco.yaml, weights=['yolov5m.pt'], batch_size=32, imgsz=640, conf_thres=0.001, iou_thres=0.65, task=val, device=, single_cls=False, augment=False, verbose=False, save_txt=False, save_hybrid=False, save_conf=False, save_json=True, project=runs/val, name=exp, exist_ok=False, half=True
YOLOv5 🚀 v6.0-3-g20a809d torch 1.9.0+cu111 CUDA:0 (Tesla P100-PCIE-16GB, 16280.875MB)
Fusing layers...
Model Summary: 290 layers, 21172173 parameters, 0 gradients
val: Scanning '../datasets/coco/val2017' images and labels...4952 found, 48 missing, 0 empty, 0 corrupted: 100% 5000/5000 [00:01<00:00, 2837.14it/s]
val: New cache created: ../datasets/coco/val2017.cache
Class Images Labels P R mAP@.5 mAP@.5:.95: 100% 157/157 [01:19<00:00, 1.99it/s]
all 5000 36335 0.71 0.582 0.633 0.439
Speed: 0.1ms pre-process, 7.8ms inference, 1.7ms NMS per image at shape (32, 3, 640, 640)
Evaluating pycocotools mAP... saving runs/val/exp/yolov5m_predictions.json...
loading annotations into memory...
Done (t=0.89s)
creating index...
index created!
Loading and preparing results...
DONE (t=5.89s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=89.06s).
Accumulating evaluation results...
DONE (t=15.01s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.452
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.639
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.492
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.280
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.506
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.576
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.354
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.586
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.641
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.467
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.703
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.784
Results saved to runs/val/exp

--conf 0.500:

!python val.py --weights yolov5m.pt --data coco.yaml --img 640 --iou 0.65 --half --conf 0.5
val: data=/content/yolov5/data/coco.yaml, weights=['yolov5m.pt'], batch_size=32, imgsz=640, conf_thres=0.5, iou_thres=0.65, task=val, device=, single_cls=False, augment=False, verbose=False, save_txt=False, save_hybrid=False, save_conf=False, save_json=True, project=runs/val, name=exp, exist_ok=False, half=True
YOLOv5 🚀 v6.0-3-g20a809d torch 1.9.0+cu111 CUDA:0 (Tesla P100-PCIE-16GB, 16280.875MB)
Fusing layers...
Model Summary: 290 layers, 21172173 parameters, 0 gradients
val: Scanning '../datasets/coco/val2017.cache' images and labels... 4952 found, 48 missing, 0 empty, 0 corrupted: 100% 5000/5000 [00:00<?, ?it/s]
Class Images Labels P R mAP@.5 mAP@.5:.95: 100% 157/157 [01:08<00:00, 2.31it/s]
all 5000 36335 0.811 0.499 0.667 0.527
Speed: 0.1ms pre-process, 7.8ms inference, 1.0ms NMS per image at shape (32, 3, 640, 640)
Evaluating pycocotools mAP... saving runs/val/exp2/yolov5m_predictions.json...
loading annotations into memory...
Done (t=0.83s)
creating index...
index created!
Loading and preparing results...
DONE (t=0.21s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=12.85s).
Accumulating evaluation results...
DONE (t=1.96s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.358
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.474
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.397
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.173
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.413
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.497
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.283
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.394
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.397
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.185
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.453
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.558
Results saved to runs/val/exp2
Partially addresses invalid mAPs at higher confidence threshold issue #1466.
Is there a plan for fixing this issue? The latest code on master still shows this warning.
@smohan-ambarella there is no bug. If you don't want to be warned, don't modify arguments.
Hello, does this issue still persist?
Answer: The team is actively investigating the mAP behavior at higher confidence thresholds. For optimal results, we recommend using the default --conf value when validating.
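For example, this mirrors the second run above with --conf simply omitted, so val.py falls back to its default conf_thres of 0.001:

```bash
python val.py --weights yolov5m.pt --data coco.yaml --img 640 --iou 0.65 --half
```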
A recent modification to the PR-curve computation in pull request #1206 introduced a bug whereby mAP increases at higher --conf thresholds. This was caused by a change to the 'sentinel values' on the P and R vectors (the mrec/mpre sentinel lines quoted in the diff earlier in the thread).
The appropriate solution would be either to reinstate the old code, which drops the curves to zero after their last data point, or to interpolate them to zero at recall = 1. I'll experiment with both and implement a fix soon.
This does not affect any operations using the default test.py --conf 0.001, so I would imagine almost no users would be impacted by this, but it needs fixing in any case.
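To make the trade-off concrete, here is a minimal self-contained sketch on toy data; toy_ap, its inputs, and the trapezoidal integration are illustrative simplifications, not the repository's exact compute_ap:

```python
import numpy as np

def toy_ap(recall, precision, mode="drop"):
    """AP from a PR curve under the two sentinel schemes discussed above."""
    if mode == "drop":
        # Old behavior: curve falls to zero just past the last data point
        mrec = np.concatenate(([0.0], recall, [recall[-1] + 0.01]))
        mpre = np.concatenate(([1.0], precision, [0.0]))
    else:
        # PR #4563 behavior: extend the curve linearly to (recall=1, precision=0)
        mrec = np.concatenate(([0.0], recall, [1.0]))
        mpre = np.concatenate(([1.0], precision, [0.0]))
    # Monotonically decreasing precision envelope, as in standard AP computation
    mpre = np.flip(np.maximum.accumulate(np.flip(mpre)))
    return np.trapz(mpre, mrec)  # area under the envelope

# A high --conf truncates the curve at low recall, so the straight line
# out to (1, 0) adds a large, optimistic area:
r = np.array([0.05, 0.10, 0.15])
p = np.array([0.95, 0.92, 0.90])
print(toy_ap(r, p, "drop"))         # ~0.15
print(toy_ap(r, p, "extrapolate"))  # ~0.52
```

The extrapolated variant gains most of its area from the straight segment out to (1, 0), which is the inflation visible in the high --conf runs above.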