Difference between official MOTChallenge code and this (with examples) #126

JonathonLuiten · 2020-09-17T13:59:53Z

When I run the following file on this repo and the official MOTChallenge repo (https://github.com/dendorferpatrick/MOTChallengeEvalKit) I receive differing results.

Lif_T.zip

| MOTA | MOTP | IDF1 | IDP | IDR | Rcll | Prcn | FP | FN | MT | PT | ML | FM | IDSW
MOTCha | 66.98 | 89.09 | 72.35 | 88.77 | 61.06 | 68 | 98.85 | 2655 | 107803 | 679 | 595 | 364 | 1153 | 791
PYMOT | 67.0 | 0.109 | 72.4 | 88.8 | 61.1 | 68.0 | 98.90% | 2663 | 107797 | 693 | 581 | 364 | 1237 | 781

The float numbers don't say much because they the PYMOT numbers are rounded too much.
However the integer numbers such at FP / FN / IDSW are different.

Initial experiments suggest that one possible cause may be the 'preprocessing' may be a cause of difference (https://github.com/dendorferpatrick/MOTChallengeEvalKit/blob/master/matlab_devkit/utils/preprocessResult.m)

cheind · 2020-09-17T15:01:04Z

Thanks for the report. Indeed, we have currently have two applications that behave differently in this respect. The first one is the most generic one:

py-motmetrics/motmetrics/apps/eval_motchallenge.py

Line 81 in d261d16

def main():

and the other one was prepared via a PR to include preprocessing

py-motmetrics/motmetrics/apps/evaluateTracking.py

Line 132 in d261d16

def main():

Ideally those two should be merged as probably 80% of the code overlaps.

jvlmdr · 2020-09-18T14:13:10Z

I had a look and it seems like it is indeed due to the preprocessing. Specifically, the matlab eval kit does an initial pass with independent per-frame matching and removes all predicted boxes that are matched to a ground-truth box belong to a set of classes. For the "Lif_T" data provided, I found 14 matches to the "distractor" class (ID 8) and 8 matches to the "occluder_on_grnd" class (ID 10). This seems about the right size to explain the difference in FP and FN, but I will need to dig a bit deeper to be certain.

jvlmdr · 2020-09-18T14:14:25Z

In general, it would be good if we could add support for "ignore" regions in the annotations.

cheind · 2020-09-18T17:34:31Z

Hey Jack, thanks for the investigation. With ignore you mean: ignore if ann is matched in accumulator.update?

jvlmdr · 2020-09-20T13:53:14Z

@cheind We would need to discuss the design of ignore regions. It could be a region that is matched to 1 prediction, or it could be a region that excludes multiple predictions.

jvlmdr · 2020-09-20T14:04:16Z

Update on the difference in results. I performed an initial, independent, per-frame matching and excluded any predictions that matched ground-truth boxes in classes {2, 7, 8, 12}. This brought the FN and FP numbers closer but still not exact:

	FP	FN	MT	PT	ML	FM	IDSW
matlab	2655	107803	679	595	364	1153	791
py-mot	2652	107800	693	581	364	1239	783

jvlmdr · 2020-09-20T16:55:55Z

Another update: It seems that the matlab code only preserves identities from the previous frame when calculating the correspondence for MOTA. This results in an inflated number of identity switches and a sub-optimal MOTA score. I modified the py-mot toolkit to do the same (although I believe it is not the intended behaviour) and obtained the following:

	FP	FN	MT	PT	ML	FM	IDSW
matlab	2655	107803	679	595	364	1153	791
py-mot	2655	107803	692	582	364	1153	791

(To see this in the matlab code, check the usage of the M[t] variable in clearMOTMex.cpp.)

Now only the MT/PT/ML measures remain.

jvlmdr · 2020-09-20T17:15:28Z

And finally, it seems that the matlab toolkit uses ≤ 0.8 for "partially tracked" whereas py-motmetrics uses < 0.8. Here is the matlab code responsible. If I make this modification to py-motmetrics, it closes the gap completely.

I'm not sure where these metrics are officially defined. I would argue that if a track is exactly 80% tracked then we should put it in the "mostly tracked" category not the "partially tracked" category, but it's just my gut instinct ;)

JonathonLuiten · 2020-09-20T18:54:45Z

Hey hey. I also found the 'only preserves from previous frame' error earlier today, and this also resulted in my script getting the same result of Lif_T as the official script. Not 100% sure if this is a feature or a bug, but it is what it is. I definitely didn't imagine it being this way, hence coding it differently, but in some ways it makes sense.

However, my script is still getting different results on this MOT20 tracker (attached).

tracker20.zip

Now I think it is definitely the preprocessing, as when using the MATLAB script pre-processing but my eval code I get exactly the same results, otherwise when using my preprocessesing code I get 26 TPs more (out of around 1.7million total).

Let me know if your current version gets the same result at MATLAB script?

MATLAB SCRIPT RESULTS:

| TP | FP | FN | IDSW | MOTP
MOT20-01 | 12843 | 25 | 7027 | 44 | 90.343
MOT20-02 | 94320 | 289 | 60422 | 304 | 90.584
MOT20-03 | 256590 | 756 | 57068 | 259 | 85.615
MOT20-05 | 487082 | 2059 | 159262 | 980 | 85.293
OVERALL | 850835 | 3129 | 283779 | 1587 | 86.053

jvlmdr mentioned this issue Dec 23, 2020

Official devkit counts ID switches differently #132

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Difference between official MOTChallenge code and this (with examples) #126

Difference between official MOTChallenge code and this (with examples) #126

JonathonLuiten commented Sep 17, 2020

cheind commented Sep 17, 2020 •

edited

Loading

jvlmdr commented Sep 18, 2020 •

edited

Loading

jvlmdr commented Sep 18, 2020

cheind commented Sep 18, 2020

jvlmdr commented Sep 20, 2020

jvlmdr commented Sep 20, 2020

jvlmdr commented Sep 20, 2020 •

edited

Loading

jvlmdr commented Sep 20, 2020

JonathonLuiten commented Sep 20, 2020

Difference between official MOTChallenge code and this (with examples) #126

Difference between official MOTChallenge code and this (with examples) #126

Comments

JonathonLuiten commented Sep 17, 2020

cheind commented Sep 17, 2020 • edited Loading

jvlmdr commented Sep 18, 2020 • edited Loading

jvlmdr commented Sep 18, 2020

cheind commented Sep 18, 2020

jvlmdr commented Sep 20, 2020

jvlmdr commented Sep 20, 2020

jvlmdr commented Sep 20, 2020 • edited Loading

jvlmdr commented Sep 20, 2020

JonathonLuiten commented Sep 20, 2020

cheind commented Sep 17, 2020 •

edited

Loading

jvlmdr commented Sep 18, 2020 •

edited

Loading

jvlmdr commented Sep 20, 2020 •

edited

Loading