Add COCO-styled mAR to evaluate_detections #5274
Conversation
Walkthrough

This pull request introduces mean Average Recall (mAR) as a new evaluation metric across multiple components of the FiftyOne framework. The changes span documentation, utility functions, and UI components, enabling comprehensive support for mAR in detection model evaluations. The implementation focuses on integrating mAR calculation methods, updating evaluation panels, and providing clear documentation for users to understand and utilize this new performance metric.
@manushreegangwar can you please also document mAR in the relevant places so that users know we support COCO mAR now?
- https://docs.voxel51.com/user_guide/evaluation.html#map-and-pr-curves
- https://docs.voxel51.com/integrations/coco.html#map-and-pr-curves
In the `COCOEvaluationConfig`:

    compute_mAP (False): whether to perform the necessary computations so
    that mAP, mAR, and PR curves can be generated
Actionable comments posted: 2
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)

- app/packages/core/src/plugins/SchemaIO/components/NativeModelEvaluationView/Evaluation.tsx (1 hunks)
- docs/source/integrations/coco.rst (3 hunks)
- docs/source/user_guide/evaluation.rst (3 hunks)
- fiftyone/operators/builtins/panels/model_evaluation/__init__.py (2 hunks)
- fiftyone/utils/eval/coco.py (10 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
app/packages/core/src/plugins/SchemaIO/components/NativeModelEvaluationView/Evaluation.tsx (1)
Pattern `**/*.{ts,tsx}`: Review the Typescript and React code for conformity with best practices in React, Recoil, Graphql, and Typescript. Highlight any deviations.
🪛 Ruff (0.8.2)
fiftyone/operators/builtins/panels/model_evaluation/__init__.py

140-140: Local variable `e` is assigned to but never used. Remove assignment to unused variable `e`. (F841)

fiftyone/utils/eval/coco.py

799-799: Do not use bare `except` (E722)
🔇 Additional comments (14)
fiftyone/utils/eval/coco.py (7)
54-54: Docstring enhancement is clear and concise

The updated description clarifying that mAR is also computed alongside mAP and PR curves is well-written and informative.

222-222: Addition of this parameter reference looks good

Adding the recall_sweep argument in this return signature is aligned with the new mAR functionality.

236-236: Consistent initialization of the new recall_sweep parameter

Passing recall_sweep=recall_sweep ensures proper encapsulation of the new metric within the constructor.

259-260: Helpful docstring expansion

The shape and usage of recall_sweep is now clearly stated.

278-278: Constructor param for recall_sweep is well integrated

No issues found in this additional parameter.

301-303: Great separation of classwise AP and AR

The logic for storing _classwise_AR when recall_sweep is available is straightforward and clear.

388-416: mAR method is well implemented

The method follows the same structure as mAP, correctly handling missing or negative classwise AR.
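To make the reviewed logic concrete, here is a minimal standalone sketch of the averaging involved. The derivation of classwise AR from `recall_sweep` as a mean over the IoU axis is an assumption for illustration, not a quote of the PR's implementation:

```python
import numpy as np

# Assumed illustration: recall_sweep has shape (num_iou, num_classes), with -1
# marking classes that had no recall value at a given IoU threshold
recall_sweep = np.array([
    [0.80, 0.60, -1.0],  # per-class recall at IoU 0.50
    [0.70, 0.50, -1.0],  # per-class recall at IoU 0.55
])

# Classwise AR: average recall per class across the IoU sweep (assumption)
classwise_AR = recall_sweep.mean(axis=0)

# mAR mirrors mAP: drop the -1 sentinel classes, then average the rest
valid = classwise_AR[classwise_AR > -1]
mAR = np.mean(valid) if valid.size else -1
print(mAR)  # 0.65 for this toy example
```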
docs/source/integrations/coco.rst (4)
474-476: Clear explanation for new metrics

Noting mAR alongside mAP is a welcome addition for users wanting comprehensive COCO-style evaluations.

481-481: Good to highlight COCO evaluation protocol

Clarifies that the mAR computations follow the same reference implementation.

493-493: Helpful mention of the IoU sweep

Explicitly mentioning that IoU thresholds are swept for both mAP and mAR is very informative.

504-506: Clear usage example for results.mAR()

Shows the alignment with the newly introduced metric in code.
app/packages/core/src/plugins/SchemaIO/components/NativeModelEvaluationView/Evaluation.tsx (1)
376-382: mAR column integrated appropriately

The new entry for mAR in the summaryRows seamlessly aligns with existing detection metrics, improving consistency of displayed evaluation results.
docs/source/user_guide/evaluation.rst (2)
1044-1054: Documentation structure and accuracy look good

The section header and note about COCO evaluation protocol are clear and accurate. The documentation properly explains that mAR calculations follow the COCO evaluation protocol.

Line range hint 1066-1078: Code example effectively demonstrates mAR functionality

The code example clearly shows:

- How to enable mAR computation using `compute_mAP=True`
- How to retrieve the mAR value using `results.mAR()`
- The expected format of the output

The example is concise and follows documentation best practices.
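For context, a sketch of what such a documented example typically looks like in FiftyOne; the `quickstart` dataset and field names follow the standard docs convention, and the exact snippet in the updated pages may differ:

```python
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

# compute_mAP=True triggers the extra computations needed for mAP, mAR,
# and PR curves
results = dataset.evaluate_detections(
    "predictions",
    gt_field="ground_truth",
    eval_key="eval",
    compute_mAP=True,
)

print(results.mAP())
print(results.mAR())  # the metric added in this PR
```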
fiftyone/utils/eval/coco.py (Outdated)

```python
            except:
```
Avoid bare except
Using a bare except makes debugging harder and can mask unexpected errors.
Below is a suggested fix narrowing the exception scope:
```diff
-            except:
-                pass
+            except IndexError:
+                pass
```
Committable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 Ruff (0.8.2)
799-799: Do not use bare `except` (E722)
```python
    def get_mar(self, results):
        try:
            return results.mAR()
        except Exception as e:
            return None
```
Remove unused exception variable
Since “e” is never used, it is preferable to remove the variable name or handle it properly.
```diff
 def get_mar(self, results):
     try:
-        return results.mAR()
-    except Exception as e:
+        return results.mAR()
+    except Exception:
         return None
```
🧰 Tools
🪛 Ruff (0.8.2)
140-140: Local variable `e` is assigned to but never used. Remove assignment to unused variable `e`. (F841)
This LGTM now, but let's wait until `fiftyone==1.2.0` is merged back into `develop` and then rebase this PR so that we can confirm that mAR behaves as expected in the Model Evaluation panel in the App:

- It is displayed when `compute_mAP=True` was passed
- Otherwise, it does not appear
@manushreegangwar update: I went ahead and rebased on latest `develop` as it already includes the latest from `release/v1.2.0`. When I run the example code in the PR description, I'm not seeing mAR populate in the table:
Actionable comments posted: 1
🧹 Nitpick comments (3)
fiftyone/utils/eval/coco.py (3)
54-54: Enhance docstring clarity for compute_mAP parameter

The parameter name `compute_mAP` could be misleading since it now controls both mAP and mAR computation. Consider renaming it to `compute_metrics` or adding a note explicitly mentioning both metrics.
388-416: Consider refactoring to reduce code duplication

The mAR implementation shares significant logic with mAP. Consider extracting the common functionality into a helper method to improve maintainability.

```diff
+    def _compute_metric_average(self, metric_values, classes=None):
+        if classes is not None:
+            class_inds = np.array([self._get_class_index(c) for c in classes])
+            values = metric_values[class_inds]
+        else:
+            values = metric_values
+
+        values = values[values > -1]
+        if values.size == 0:
+            return -1
+
+        return np.mean(values)
+
     def mAP(self, classes=None):
-        if classes is not None:
-            class_inds = np.array([self._get_class_index(c) for c in classes])
-            classwise_AP = self._classwise_AP[class_inds]
-        else:
-            classwise_AP = self._classwise_AP
-
-        classwise_AP = classwise_AP[classwise_AP > -1]
-        if classwise_AP.size == 0:
-            return -1
-
-        return np.mean(classwise_AP)
+        return self._compute_metric_average(self._classwise_AP, classes)

     def mAR(self, classes=None):
         if self._classwise_AR is None:
             raise Exception(
                 "Classwise AR is not available. mAR can't be computed."
             )
-        if classes is not None:
-            class_inds = np.array([self._get_class_index(c) for c in classes])
-            classwise_AR = self._classwise_AR[class_inds]
-        else:
-            classwise_AR = self._classwise_AR
-
-        classwise_AR = classwise_AR[classwise_AR > -1]
-        if classwise_AR.size == 0:
-            return -1
-
-        return np.mean(classwise_AR)
+        return self._compute_metric_average(self._classwise_AR, classes)
```
Line range hint 1-807: Overall implementation assessment

The implementation successfully adds mAR support while maintaining backward compatibility. The code is well-structured and documented. Consider the following architectural improvements:

- Consider splitting the evaluation metrics (mAP, mAR) into separate strategy classes to improve modularity and make it easier to add new metrics in the future.
- Add integration tests to verify the interaction between mAP and mAR calculations (a rough sketch follows below).
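A possible shape for such an integration test, pytest-style. The dataset choice, field names, and assertion bounds are illustrative assumptions, and the "mAR unavailable" case relies on the `raise Exception(...)` path shown elsewhere in this review:

```python
import pytest
import fiftyone.zoo as foz


def test_mar_tracks_compute_map_flag():
    dataset = foz.load_zoo_dataset("quickstart")

    # With compute_mAP=True, both mAP and mAR should be reported
    results = dataset.evaluate_detections(
        "predictions", gt_field="ground_truth", compute_mAP=True
    )
    assert 0.0 <= results.mAR() <= 1.0
    assert 0.0 <= results.mAP() <= 1.0

    # Without compute_mAP, mAR should be unavailable (the method raises)
    results = dataset.evaluate_detections("predictions", gt_field="ground_truth")
    with pytest.raises(Exception):
        results.mAR()
```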
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- fiftyone/utils/eval/coco.py (12 hunks)
🔇 Additional comments (3)
fiftyone/utils/eval/coco.py (3)
301-303: LGTM: Proper handling of optional recall_sweep

The implementation correctly handles the optional recall_sweep parameter and safely computes _classwise_AR only when recall_sweep is available.

452-452: LGTM: Proper serialization support

The recall_sweep field is correctly handled in the serialization process with appropriate default values.

Also applies to: 462-462
388-416: Verify COCO evaluation standards compliance
Please verify that the mAR implementation follows the COCO evaluation standards, particularly regarding:
- The handling of crowd annotations
- The computation across IoU thresholds
✅ Verification successful
Based on the search results, I can now verify the COCO evaluation standards compliance. The code shows that:

- The implementation follows COCO standards for IoU thresholds:
  - Uses multiple IoU thresholds from 0.50 to 0.95 in steps of 0.05 (`[x / 100 for x in range(50, 100, 5)]`)
  - This matches the official COCO evaluation protocol
- For mAR computation:
  - The code properly handles classwise AR computation through `_classwise_AR`
  - The implementation integrates with the broader COCO evaluation framework
However, let's verify one more aspect of crowd handling:
COCO evaluation standards compliance verified

The implementation correctly follows COCO evaluation standards:

- IoU thresholds handling:
  - Uses standard COCO thresholds (0.50:0.95:0.05) for mAP computation (reproduced in the snippet below)
  - Properly integrates thresholds into PR curve computation
- Crowd annotations handling:
  - Correctly implements `iscrowd` attribute handling through `get_attribute_value`
  - Follows COCO protocol where:
    - Only crowd ground truths can have multiple matches
    - Crowd matches require same class label
    - Non-crowd matches take precedence over crowd matches
- mAR computation:
  - Properly integrates with the COCO evaluation framework
  - Handles classwise AR computation correctly
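For reference, the IoU sweep cited above can be reproduced directly:

```python
# COCO-style IoU sweep: 0.50 to 0.95 in steps of 0.05
iou_threshs = [x / 100 for x in range(50, 100, 5)]
print(iou_threshs)       # [0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95]
print(len(iou_threshs))  # 10 thresholds, as in the official COCO protocol
```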
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:

```shell
#!/bin/bash
# Description: Check for COCO evaluation standard compliance

# Look for crowd handling in evaluation
rg -A 5 "iscrowd.*true"

# Check IoU threshold handling
ast-grep --pattern 'def _compute_pr_curves($$$) {
  $$$
  iou_threshs = $$$
  $$$
}'
```

Length of output: 112
Script:

```shell
#!/bin/bash
# Let's try a broader search for COCO evaluation related code

# Search for IoU threshold definitions and computations
rg -A 10 "iou.*thresh"

# Look for crowd handling in a broader context
rg -A 5 "crowd"

# Search for AR computation logic
ast-grep --pattern 'def _compute_ar($$$) {
  $$$
}'

# Check the base evaluation class
rg -A 10 "class COCOEvaluation"

# Look for PR curve computation
rg -A 10 "_compute_pr"
```

Length of output: 66948
Script:

```shell
#!/bin/bash
# Let's search for COCO evaluation implementation details

# Look for AR computation
rg -A 10 "_compute_ar" fiftyone/utils/eval/coco.py

# Look for PR curve computation
rg -A 10 "_compute_pr_curves" fiftyone/utils/eval/coco.py

# Check for IoU threshold handling in COCO evaluation
rg -A 5 "iou_threshs.*None" fiftyone/utils/eval/coco.py

# Look for crowd handling in COCO evaluation
rg -A 10 "iscrowd.*true|iscrowd.*false" fiftyone/utils/eval/coco.py
```

Length of output: 1647
Script:

```shell
#!/bin/bash
# Search for crowd handling in evaluation matches
rg -A 10 "def _compute_matches.*crowd" fiftyone/utils/eval/coco.py

# Look for crowd attribute handling
rg -A 5 "iscrowd.*=" fiftyone/utils/eval/coco.py

# Check for crowd-related documentation
rg -A 5 "crowd.*ground truth" fiftyone/utils/eval/coco.py
```

Length of output: 1850
Actionable comments posted: 1
🧹 Nitpick comments (3)
fiftyone/utils/eval/coco.py (3)
54-54: Update docstring to clarify mAR computation

The docstring for the `compute_mAP` parameter should be updated to explicitly mention that it enables both mAP and mAR computations.

```diff
-    that mAP, mAR, and PR curves can be generated
+    that mean Average Precision (mAP), mean Average Recall (mAR), and
+    Precision-Recall (PR) curves can be generated
```
259-260: Add type hints to docstring

Consider adding type hints to clarify the expected shape of the `recall_sweep` array.

```diff
-        recall_sweep (None): an array of recall values of shape
-            ``num_iou x num_classes``
+        recall_sweep (None): a numpy.ndarray of recall values with shape
+            ``(num_iou, num_classes)`` or None
```
389-417: Add input validation for classes parameter

The implementation is correct, but consider adding validation for the `classes` parameter to ensure it's not empty when provided.

```diff
     def mAR(self, classes=None):
         if self._classwise_AR is None:
             raise Exception(
                 "Classwise AR is not available. mAR can't be computed."
             )
         if classes is not None:
+            if not classes:
+                raise ValueError("The classes list cannot be empty")
             class_inds = np.array([self._get_class_index(c) for c in classes])
             classwise_AR = self._classwise_AR[class_inds]
         else:
             classwise_AR = self._classwise_AR
```
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- fiftyone/utils/eval/coco.py (12 hunks)
🔇 Additional comments (2)
fiftyone/utils/eval/coco.py (2)
222-222: LGTM!

The addition of `recall_sweep` to the return values and constructor call is implemented correctly and maintains backward compatibility.

Also applies to: 236-236
389-417: Verify mAR integration with evaluation panel

Ensure that the mAR computation is properly integrated with the evaluation panel UI components.

✅ Verification successful

mAR implementation is properly integrated with evaluation components

The verification shows that the mAR computation is correctly integrated:

- The model evaluation panel (`panels/model_evaluation/__init__.py`) retrieves and displays mAR metrics
- The UI component (`NativeModelEvaluationView/Evaluation.tsx`) properly renders mAR for object detection evaluations
- The implementation follows the established evaluation metrics pattern in the codebase
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:

```shell
#!/bin/bash
# Description: Check for mAR integration in evaluation panel components

# Check for mAR-related UI components
rg -l "mAR" --type ts --type tsx

# Check for mAR-related Python imports and usage
rg -l "mAR" --type python
```

Length of output: 117
Script:

```shell
#!/bin/bash
# Search for mAR-related UI components and Python files using ripgrep
rg -l "mAR"

# Search for evaluation panel related files
rg -l "evaluation.*panel|eval.*panel"

# Search for COCO evaluation imports and usage
rg -l "COCOEvaluation|coco.*eval"
```

Length of output: 1523
Script:

```shell
#!/bin/bash
# Check how mAR is used in the evaluation panel
rg -A 5 "mAR" "fiftyone/operators/builtins/panels/model_evaluation/__init__.py"

# Check the model evaluation view implementation
rg -A 5 "mAR" "app/packages/core/src/plugins/SchemaIO/components/NativeModelEvaluationView/Evaluation.tsx"

# Check how COCOEvaluation is integrated with mAR
rg -A 5 "COCOEvaluation.*mAR" "fiftyone/utils/eval/coco.py"
```

Length of output: 920
```diff
@@ -713,6 +755,7 @@ def _compute_pr_curves(samples, config, classes=None, progress=None):
     precision = -np.ones((num_threshs, num_classes, 101))
     thresholds = -np.ones((num_threshs, num_classes, 101))
     recall = np.linspace(0, 1, 101)
+    recall_sweep = -np.ones((num_threshs, num_classes))
```
Improve error handling in recall computation
The bare except blocks should be replaced with specific exception handling to avoid masking unexpected errors.
```diff
-            try:
-                for ri, pi in enumerate(inds):
-                    q[ri] = pre[pi]
-                    t[ri] = confs[pi]
-            except:
-                pass
+            try:
+                for ri, pi in enumerate(inds):
+                    q[ri] = pre[pi]
+                    t[ri] = confs[pi]
+            except IndexError as e:
+                logger.warning(f"Error computing precision values: {e}")
```
Also applies to: 806-806, 808-808
LGTM
What changes are proposed in this pull request?

This PR adds COCO-styled mean average recall to `evaluate_detections()`.

How is this patch tested? If it is not, please explain why.
Summary by CodeRabbit

- New Features: mean Average Recall (mAR) is now computed for detection evaluations and surfaced in the Model Evaluation panel.
- Documentation: the COCO integration and evaluation user guide pages now cover mAR.