Difference between official MOTChallenge code and this (with examples) #126
Thanks for the report. Indeed, we currently have two applications that behave differently in this respect. The first one is the most generic one:
and the other one was prepared via a PR to include preprocessing.
Ideally those two should be merged, as probably 80% of the code overlaps.
I had a look and it seems like it is indeed due to the preprocessing. Specifically, the matlab eval kit does an initial pass with independent per-frame matching and removes all predicted boxes that are matched to a ground-truth box belonging to a set of distractor classes. For the "Lif_T" data provided, I found 14 matches to the "distractor" class (ID 8) and 8 matches to the "occluder_on_grnd" class (ID 10). This seems about the right size to explain the difference in FP and FN, but I will need to dig a bit deeper to be certain.
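As a rough illustration, the preprocessing pass described above could be sketched as follows. This is a minimal sketch, not the actual devkit code: the helper names are hypothetical, boxes are assumed to be `(x, y, w, h)`, and matching uses scipy's Hungarian solver.

```python
# Sketch of the matlab devkit's preprocessing step as described above.
# Hypothetical helpers; not the actual py-motmetrics or devkit API.
import numpy as np
from scipy.optimize import linear_sum_assignment

DISTRACTOR_IDS = {8, 10}  # e.g. "distractor" and "occluder_on_grnd"

def iou(a, b):
    """IoU of two boxes in (x, y, w, h) format."""
    ix = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def drop_distractor_matches(gt_boxes, gt_classes, pred_boxes, thresh=0.5):
    """Match one frame independently, then drop every prediction whose
    match is a ground-truth box of a distractor class.
    Returns the indices of the predictions that survive."""
    if not gt_boxes or not pred_boxes:
        return list(range(len(pred_boxes)))
    # Cost 1.0 marks pairs that are not allowed to match (IoU < thresh).
    cost = np.ones((len(gt_boxes), len(pred_boxes)))
    for i, g in enumerate(gt_boxes):
        for j, p in enumerate(pred_boxes):
            ov = iou(g, p)
            if ov >= thresh:
                cost[i, j] = 1.0 - ov
    rows, cols = linear_sum_assignment(cost)
    dropped = {j for i, j in zip(rows, cols)
               if cost[i, j] < 1.0 and gt_classes[i] in DISTRACTOR_IDS}
    return [j for j in range(len(pred_boxes)) if j not in dropped]
```

The key point is that this matching is per-frame and independent of the main evaluation pass: it only decides which predictions are deleted before metrics are computed.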
In general, it would be good if we could add support for "ignore" regions in the annotations.
Hey Jack, thanks for the investigation. By "ignore" you mean:
@cheind We would need to discuss the design of ignore regions. It could be a region that is matched to one prediction, or it could be a region that excludes multiple predictions.
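For the second semantics (one region excluding any number of predictions), a minimal hypothetical sketch could look like this; the function name, the coverage criterion, and the 0.5 threshold are all assumptions for illustration, not an agreed design:

```python
# Hypothetical "region excludes multiple predictions" semantics.
def excluded_by_region(pred_box, ignore_regions, cover=0.5):
    """Drop a prediction if any ignore region covers more than `cover`
    of the prediction's own area. One region may exclude many boxes.
    Boxes and regions are (x, y, w, h)."""
    x, y, w, h = pred_box
    for rx, ry, rw, rh in ignore_regions:
        ix = max(0.0, min(x + w, rx + rw) - max(x, rx))
        iy = max(0.0, min(y + h, ry + rh) - max(y, ry))
        if w * h > 0 and (ix * iy) / (w * h) > cover:
            return True
    return False
```

The one-to-one variant would instead run the ignore boxes through the normal matcher, consuming at most one prediction each; the two designs give different FP counts for crowded scenes.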
Update on the difference in results. I performed an initial, independent, per-frame matching and excluded any predictions that matched ground-truth boxes in classes {2, 7, 8, 12}. This brought the FN and FP numbers closer but still not exact:
Another update: It seems that the matlab code only preserves identities from the previous frame when calculating the correspondence for MOTA. This results in an inflated number of identity switches and a sub-optimal MOTA score. I modified the py-motmetrics toolkit to do the same (although I believe it is not the intended behaviour) and obtained the following:
(To see this in the matlab code, check the usage of the …) Now only the MT/PT/ML measures remain.
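The matching behaviour described above can be sketched as follows. This is a simplified illustration, not the actual toolkit internals; the function name and signature are mine. The crucial detail is what the caller passes as `prev_matches`: only the immediately previous frame's pairs, rather than the last known correspondence per ground-truth identity, so a track lost for even one frame forgets its match and can incur an extra identity switch.

```python
# Simplified sketch: carry over the previous frame's pairs first,
# then Hungarian-match whatever is left. Hypothetical signature.
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_frame(prev_matches, dist, max_dist=0.5):
    """dist: (num_gt, num_pred) distance matrix.
    prev_matches: dict gt_index -> pred_index from the PREVIOUS frame
    only (this is the behaviour in question)."""
    matches = {}
    # 1) Carry over last frame's pairs if they are still close enough.
    for g, p in prev_matches.items():
        if g < dist.shape[0] and p < dist.shape[1] and dist[g, p] <= max_dist:
            matches[g] = p
    # 2) Optimal assignment among the remaining boxes.
    free_g = [g for g in range(dist.shape[0]) if g not in matches]
    free_p = [p for p in range(dist.shape[1]) if p not in matches.values()]
    if free_g and free_p:
        sub = dist[np.ix_(free_g, free_p)]
        rows, cols = linear_sum_assignment(sub)
        for r, c in zip(rows, cols):
            if sub[r, c] <= max_dist:
                matches[free_g[r]] = free_p[c]
    return matches
```

In the example below, carrying over the pair 0→1 forces a total distance of 0.5 where a fresh optimal assignment would cost 0.15, illustrating how the preserved correspondence can be sub-optimal for the current frame.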
And finally, it seems that the matlab toolkit uses ≤ 0.8 for "partially tracked" whereas py-motmetrics uses < 0.8; here is the matlab code responsible. If I make this modification to py-motmetrics, it closes the gap completely. I'm not sure where these metrics are officially defined, but I would argue that if a track is exactly 80% tracked then it should go in the "mostly tracked" category, not "partially tracked"; that's just my gut instinct ;)
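The boundary case in question fits in a few lines. Illustrative only: the function names are mine, and the 0.2 lower bound for "mostly lost" follows the commonly used convention, not something verified in either codebase here.

```python
# `ratio` is the fraction of a ground-truth track's frames that were
# tracked. The two toolkits differ only at ratio == 0.8.
def track_status_matlab(ratio):
    if ratio < 0.2:
        return "ML"          # mostly lost
    return "PT" if ratio <= 0.8 else "MT"   # matlab: 0.8 is "partially tracked"

def track_status_pymot(ratio):
    if ratio < 0.2:
        return "ML"
    return "PT" if ratio < 0.8 else "MT"    # py-motmetrics: 0.8 is "mostly tracked"
```

A track tracked for exactly 80% of its frames lands in PT under the matlab rule and MT under the py-motmetrics rule, which shifts the MT/PT counts between the two implementations.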
Hey hey. I also found the 'only preserves identities from the previous frame' behaviour earlier today, and with it my script gets the same result on Lif_T as the official script. Not 100% sure if this is a feature or a bug, but it is what it is. I definitely didn't imagine it being this way, hence coding it differently, but in some ways it makes sense.

However, my script is still getting different results on this MOT20 tracker (attached). Now I think it is definitely the preprocessing: when using the MATLAB script's preprocessing but my eval code I get exactly the same results, whereas when using my preprocessing code I get 26 more TPs (out of around 1.7 million total). Let me know if your current version gets the same result as the MATLAB script?

MATLAB SCRIPT RESULTS: | TP | FP | FN | IDSW | MOTP |
When I run the following file on this repo and the official MOTChallenge repo (https://github.com/dendorferpatrick/MOTChallengeEvalKit) I receive differing results.
Lif_T.zip
|        | MOTA  | MOTP  | IDF1  | IDP   | IDR   | Rcll | Prcn   | FP   | FN     | MT  | PT  | ML  | FM   | IDSW |
|--------|-------|-------|-------|-------|-------|------|--------|------|--------|-----|-----|-----|------|------|
| MOTCha | 66.98 | 89.09 | 72.35 | 88.77 | 61.06 | 68   | 98.85  | 2655 | 107803 | 679 | 595 | 364 | 1153 | 791  |
| PYMOT  | 67.0  | 0.109 | 72.4  | 88.8  | 61.1  | 68.0 | 98.90% | 2663 | 107797 | 693 | 581 | 364 | 1237 | 781  |
The float numbers don't say much because the PYMOT numbers are rounded too heavily. However, the integer counts such as FP / FN / IDSW do differ.
Initial experiments suggest that the 'preprocessing' step may be a cause of the difference (https://github.com/dendorferpatrick/MOTChallengeEvalKit/blob/master/matlab_devkit/utils/preprocessResult.m).