-
Notifications
You must be signed in to change notification settings - Fork 584
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Model Evaluation panel callbacks for segmentation tasks (Option 2) #5332
base: develop
Are you sure you want to change the base?
Conversation
WalkthroughThis pull request introduces new functions for color and mask conversion in the FiftyOne library, enhancing segmentation evaluation capabilities. Changes are made across three files: Changes
Sequence DiagramsequenceDiagram
participant Core as Core Fields
participant Eval as Segmentation Evaluation
participant Panel as Evaluation Panel
Core->>Eval: Provide color conversion utilities
Eval->>Eval: Track pixel-wise matches
Eval->>Panel: Pass segmentation results
Panel->>Panel: Process and visualize results
Possibly related PRs
Suggested labels
Suggested reviewers
Poem
📜 Recent review detailsConfiguration used: .coderabbit.yaml 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (1)
🧰 Additional context used🪛 Ruff (0.8.2)plugins/panels/model_evaluation.py17-1717: Remove unused import: (F401) 🔇 Additional comments (5)
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
33a2dac
to
33f06a8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (5)
plugins/panels/model_evaluation.py (4)
17-17
: Unused import
import fiftyone.core.fields as fof
is never referenced. Consider removing it to satisfy linters and reduce clutter.🧰 Tools
🪛 Ruff (0.8.2)
17-17:
fiftyone.core.fields
imported but unusedRemove unused import:
fiftyone.core.fields
(F401)
485-485
:mask_targets2
assigned.
The variable is set here but appears to be overwritten later, leading to redundant assignments.
491-491
:mask_targets2
is never effectively used.
Kindly remove or integrate it if necessary; currently it generates lint warnings.🧰 Tools
🪛 Ruff (0.8.2)
491-491: Local variable
mask_targets2
is assigned to but never usedRemove assignment to unused variable
mask_targets2
(F841)
685-734
:_init_segmentation_results
: assembling ID dictionaries.
This function modifies the passed-inresults
object to map(i, j)
pairs to lists of IDs. Be cautious about naming collisions if this is run multiple times; consider clearing any stale data.fiftyone/core/fields.py (1)
1627-1636
: Implementation ofhex_to_int
.
Simple bit-shift logic is correct. Provide error handling for malformed hex strings if user input is allowed.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
fiftyone/core/fields.py
(1 hunks)fiftyone/utils/eval/segmentation.py
(11 hunks)plugins/panels/model_evaluation.py
(5 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
plugins/panels/model_evaluation.py
17-17: fiftyone.core.fields
imported but unused
Remove unused import: fiftyone.core.fields
(F401)
491-491: Local variable mask_targets2
is assigned to but never used
Remove assignment to unused variable mask_targets2
(F841)
🔇 Additional comments (25)
fiftyone/utils/eval/segmentation.py (13)
11-11
: The added import is properly used for creating repeat iterators.
No issues identified; it is used in the _from_dict
method when handling “no ID” scenarios.
337-337
: Validate the hex string keys for consistent usage.
This dictionary comprehension properly converts hex string keys to integers. Consider verifying that all user-supplied hex strings follow the #RRGGBB
format before conversion to avoid potential ValueError
.
353-353
: Initialization of the matches
list.
No issues here; it neatly collects pixel-wise mapping data for subsequent analysis.
396-406
: Appending match details for segmentation.
This loop accurately records ground truth/prediction associations and pixel counts. However, this can grow large for massive images or datasets. Be mindful of memory usage if used repeatedly in large-scale evaluations.
440-440
: Passing the newly built matches
to SegmentationResults
.
Clean approach to provide the collected matches in the results constructor.
455-457
: Docstring accurately reflects the new matches
field.
The description matches the tuple structure from the evaluation loop.
469-469
: New matches
parameter defaults to None
.
This is a good backward-compatible signature update.
Line range hint 475-492
: Conditional handling of matches
.
The fallback to parse pixel_confusion_matrix
when matches
is None
ensures compatibility with legacy workflows. Watch for potential ValueError
if ytrue, ypred, weights
do not align in length.
510-529
: Consistent _from_dict
logic for matches
.
Correctly handles both new and legacy (no IDs) formats, merging them into a uniform list of tuples.
534-534
: Passing the reconstructed matches
in _from_dict
.
Good consistency with the constructor.
594-597
: RGB to integer masking for dimensional consistency.
Properly uses fof.rgb_array_to_int
to handle multi-channel arrays.
670-670
: Use of new utility for RGB array conversion.
Reusing fof.rgb_array_to_int
avoids code duplication.
677-677
: Generating hex class labels from integer-based values.
Efficient approach for color-coded segmentation classes.
plugins/panels/model_evaluation.py (8)
13-13
: Necessary import for ObjectId usage.
This is used in _to_object_ids
.
350-358
: Loading segmentation results and initializing them.
Assigning mask_targets
and calling _init_segmentation_results
is a clear approach to unify the data before proceeding with metrics. Make sure to handle any potential logging or warnings if results are partially missing.
596-611
: Segmentations with legacy format.
Early returns handle older data where IDs don’t exist. Ensure end users receive a clear message if early-exiting leads to partial data in the UI.
612-664
: Match expressions for segmentation subviews.
This logic effectively filters segmentation results by class/matrix cell. It might be beneficial to confirm performance on large datasets, as multiple .is_in()
calls could be costly.
736-752
: _get_segmentation_class_ids
: retrieving matching IDs by class.
Check for key existence in results._classes_map[x]
to avoid KeyError
if x
is not recognized.
755-760
: _get_segmentation_conf_mat_ids
: confusion matrix IDs.
Straightforward approach to isolate matches. This is well-structured.
762-780
: _get_segmentation_tp_fp_fn_ids
: basic classification logic for pixel-level segmentation.
The approach is consistent with typical definitions of TP, FP, and FN. If large sets are expected, consider memory usage.
782-783
: _to_object_ids
: converting string IDs to ObjectId
.
Simple utility that is helpful for consistent typed usage. Ensure _id
is always a valid string to avoid conversion errors.
fiftyone/core/fields.py (4)
1624-1625
: hex_to_int
function declaration and docstring.
Docstring is clear; confirm that input always starts with '#'
and contains exactly 6 hex characters.
1639-1652
: int_to_hex
: Reverse conversion from int to hex.
Logic is standard. No issues observed.
1654-1668
: rgb_array_to_int
: Transforming 3D RGB arrays to 2D integer arrays.
The use of NumPy bit-shifts is efficient and readable. Ensure mask
is always [..., 3]
shape or raise warnings.
1670-1684
: int_array_to_rgb
: Restoring 3D RGB arrays from integer-based masks.
Works in parallel with rgb_array_to_int
. Usage of np.stack
is correct.
matches (None): a list of | ||
``(gt_label, pred_label, pixel_count, gt_id, pred_id)`` | ||
matches |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this data model better - reads cleaner and like you said avoids calling _parse_confusion_matrix when present.
if ytrue_ids is None: | ||
ytrue_ids = itertools.repeat(None) | ||
|
||
if ypred_ids is None: | ||
ypred_ids = itertools.repeat(None) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if ytrue_ids is None: | |
ytrue_ids = itertools.repeat(None) | |
if ypred_ids is None: | |
ypred_ids = itertools.repeat(None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Won't need the None
check here because of the previous if-statement
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually they are required:
In previous versions of this code, ytrue
, ypred
, and weights
were not persisted as properties of SegmentationResults
. If a user loads such a pre-existing segmentation evaluation and then calls results.save()
, this will create a new version of the results that does persist ytrue
, ypred
and weights
(as parsed from _parse_confusion_matrix()
). However, there still won't be ytrue_ids
and ypred_ids
for these results, so these if
statements are needed to ensure that the next time these results are loaded, we'll be able to construct the matches
object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense!
expr = F(gt_id).is_in(ytrue_ids) | ||
expr &= F(pred_id).is_in(ypred_ids) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mmmm, I see this expression should evaluate much faster than select_labels
since you are specifying which labels to look for in which fields. Nice
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly!
if field == "tp": | ||
# True positives | ||
inds = results.ytrue == results.ypred | ||
ytrue_ids = _to_object_ids(results.ytrue_ids[inds]) | ||
ypred_ids = _to_object_ids(results.ypred_ids[inds]) | ||
return ytrue_ids, ypred_ids | ||
elif field == "fn": | ||
# False negatives | ||
inds = results.ypred == results.missing | ||
ytrue_ids = _to_object_ids(results.ytrue_ids[inds]) | ||
return ytrue_ids, None | ||
else: | ||
# False positives | ||
inds = results.ytrue == results.missing | ||
ypred_ids = _to_object_ids(results.ypred_ids[inds]) | ||
return None, ypred_ids |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would we want to move this tp/fp/fn calculation to utils/eval/segmentation.py and make it a sample level field so we can filter on it - similar to detections?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what we'd store as a sample-level field. The TP/FP/FN designation has to do with each region in the segmentation mask, so there would usually be multiple per sample (like object detection tasks for example), but the the mask is stored in a single Segmentation
field (unlike object detection).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm.. I see what you are saying- but I guess my confusion is arising from the fact that if we are marking labels as TP/FP/FN, we should be able to filter by it at the sample level too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that it would be nice to support more sample-level filtering, but I don't know what to do!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep understood. Makes sense to leave as is for now then. Maybe something to discuss with the ML team
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a question about high level TP/FP/FN design
elif view_type == "field": | ||
if field == "tp": | ||
# All true positives | ||
ytrue_ids, ypred_ids = _get_segmentation_tp_fp_fn_ids( | ||
results, field | ||
) | ||
expr = F(gt_id).is_in(ytrue_ids) | ||
expr &= F(pred_id).is_in(ypred_ids) | ||
view = eval_view.match(expr) | ||
elif field == "fn": | ||
# All false negatives | ||
ytrue_ids, _ = _get_segmentation_tp_fp_fn_ids( | ||
results, field | ||
) | ||
expr = F(gt_id).is_in(ytrue_ids) | ||
view = eval_view.match(expr) | ||
else: | ||
# All false positives | ||
_, ypred_ids = _get_segmentation_tp_fp_fn_ids( | ||
results, field | ||
) | ||
expr = F(pred_id).is_in(ypred_ids) | ||
view = eval_view.match(expr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We currently don't display FP/FN/TP in the summary table- get_tp_fp_fn
function will have to be updated for segmentations if we ever want to reach this section of the code is my understanding
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I went ahead and implemented the callback so it would be available if we found a reasonable way to show TP/FP/FN icons in the panel. Not quite sure what the best way to show this info would be though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I understand now! Realized you had also left comment on the other PR explaining this. Apologies
@@ -424,6 +437,7 @@ def evaluate_samples( | |||
eval_key, | |||
confusion_matrix, | |||
classes, | |||
matches=matches, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
matches=matches, | |
matches=matches if matches!=[] else None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line is causing tests to fail
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! Just fixed this in a slightly different way, for consistency with how object detection handles this:
fiftyone/fiftyone/utils/eval/detection.py
Lines 543 to 553 in 46bd6bd
if matches: | |
ytrue, ypred, ious, confs, ytrue_ids, ypred_ids = zip(*matches) | |
else: | |
ytrue, ypred, ious, confs, ytrue_ids, ypred_ids = ( | |
[], | |
[], | |
[], | |
[], | |
[], | |
[], | |
) |
33f06a8
to
5e45561
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! 😄
Alternate data model for #5331.
TODO before merging
load_evaluation_results()
; cf this commentload_view()