Improve mAP performance #742
for more information, see https://pre-commit.ci
Codecov Report
@@           Coverage Diff           @@
##           master   #742    +/-   ##
=======================================
- Coverage      91%    83%    -8%
=======================================
  Files         166    166
  Lines        6817   6832    +15
=======================================
- Hits         6181   5651   -530
- Misses        636   1181   +545
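For anyone double-checking the report, here is a minimal sketch of the arithmetic behind the percentages. It assumes coverage is simply hits divided by hits plus misses (no partial lines appear in this report); `coverage_percent` is a hypothetical helper, not part of Codecov.

```python
def coverage_percent(hits: int, misses: int) -> float:
    """Line coverage in percent, assuming lines = hits + misses (no partials)."""
    lines = hits + misses
    return 100.0 * hits / lines


# Numbers copied from the report above.
master = coverage_percent(hits=6181, misses=636)   # 6817 lines on master
pr = coverage_percent(hits=5651, misses=1181)      # 6832 lines on #742

# Rounded the same way the report shows them.
print(round(master), round(pr), round(pr) - round(master))  # prints "91 83 -8"
```

The hit and miss counts add up to the reported line totals (6817 and 6832), so the table is internally consistent.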
The performance improvements are great and really necessary. However, CUDA calculations are still 9-10 times slower than on CPU. The computations are not optimized for CUDA, and I don't know whether that is possible (or how much effort it would take).
Until then, I suggest making the mAP metric CPU-only, so that nobody ends up using the slow CUDA version.
sounds reasonable to me 🐰
* Remove deprecated functions, and warnings
* Update links for docstring
* chlog

Co-authored-by: Daniel Stancl <[email protected]>
Co-authored-by: Jirka Borovec <[email protected]>
I ran a benchmark on a real-world use case with 1088 samples and ~10 bounding boxes per sample.
Pycocotools (previous implementation): 74.91s
IMHO this should be merged and released ASAP to make the metric usable again on GPUs.
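A benchmark like this could be reproduced with a small timing harness along the following lines. This is only a sketch: `time_call` is a hypothetical helper, and the workload is a stand-in with the same shape as the data described above (1088 samples, 10 boxes each), not the actual metric. When timing CUDA code, the device would also need to be synchronized (for example with `torch.cuda.synchronize()`) before reading the clock, which this stdlib-only sketch leaves out.

```python
import time
from typing import Any, Callable, Tuple


def time_call(fn: Callable[[], Any], repeats: int = 3) -> Tuple[float, Any]:
    """Run `fn` `repeats` times; return (best wall-clock seconds, last result)."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = fn()
        best = min(best, time.perf_counter() - start)
    return best, result


# Stand-in workload: 1088 samples with 10 dummy boxes each (illustrative only).
samples = [[(0.0, 0.0, 1.0, 1.0)] * 10 for _ in range(1088)]
seconds, n_boxes = time_call(lambda: sum(len(boxes) for boxes in samples))
print(f"{n_boxes} boxes counted, best of 3 runs: {seconds:.6f}s")
```

Taking the best of several runs reduces noise from warm-up and other processes; swapping the lambda for the metric's update and compute calls would give numbers comparable to the one quoted above.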
@tkupek @OlofHarrysson @twsl thanks for really trying to improve performance for this metric.
@Borda please approve and merge.
* Simplify id generation
* rework and speed up _find_best_gt_match
* add: Refactor to avoid duplicate calculations
* precision,recall,scores on correct device (-20%)
* arguments to python lists
* enumerate instead of range
* compute on device
* Remove exception
* Replace prec score loop
* Fix auc flattening
* draft to run metric on cpu only
* move tensors to cpu on compute, need to be on GPU for multi GPU syncing

Co-authored-by: tobias-kupek-swarm <[email protected]>
Co-authored-by: Olof Harrysson <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Tobias Kupek <[email protected]>
Co-authored-by: SkafteNicki <[email protected]>
(cherry picked from commit 408cabe)
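To illustrate what a step like `_find_best_gt_match` has to do, here is a rough pure-Python sketch of COCO-style matching: for one detection, pick the not-yet-matched ground-truth box with the highest IoU at or above a threshold. This is not the code from this PR; `box_iou`, `find_best_gt_match`, and the `(x_min, y_min, x_max, y_max)` box layout are assumptions made for the example.

```python
from typing import List, Optional, Sequence

Box = Sequence[float]  # assumed layout: (x_min, y_min, x_max, y_max)


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def find_best_gt_match(
    det: Box, gts: List[Box], matched: List[bool], iou_thr: float
) -> Optional[int]:
    """Index of the best unmatched ground truth with IoU >= iou_thr, else None."""
    best_idx, best_iou = None, iou_thr
    for idx, gt in enumerate(gts):  # enumerate instead of range, as in the list above
        if matched[idx]:
            continue  # each ground truth can be claimed by only one detection
        iou = box_iou(det, gt)
        if iou >= best_iou:
            best_idx, best_iou = idx, iou
    return best_idx


gts = [(0, 0, 10, 10), (20, 20, 30, 30)]
print(find_best_gt_match((1, 1, 10, 10), gts, [False, False], iou_thr=0.5))  # prints 0
```

A loop like this runs once per detection, class, and IoU threshold, which is one plausible reason the per-element work is cheap on CPU but slow when every small operation becomes a separate CUDA launch.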
Unfortunately, I don't have any concrete ideas in mind, nor the time to look at it right now.
I might give it another shot after I hand in my thesis next week, but I can't promise any results or a timeline.
What does this PR do?
Fixes #677
Work of @tkupek @OlofHarrysson @twsl
First steps to get performance on par with pycocotools/numpy
Before submitting
PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.
Did you have fun?
Make sure you had fun coding 🙃