
Add Model Evaluation panel callbacks for segmentation tasks (Option 2) #5332

Open · wants to merge 4 commits into develop from segmentation-callbacks2
Conversation


@brimoor brimoor commented Jan 1, 2025

Alternate data model for #5331.

TODO before merging

  • Investigate alternatives that don't require calling load_evaluation_results(); cf this comment
  • Benchmark performance of using label IDs to implement load_view()
  • Figure out a way to show TP/FP/FN data in the ME panel?
  • Figure out a way to support TP/FP/FN filtering in the App sidebar?

@brimoor brimoor requested review from imanjra and prernadh January 1, 2025 00:32

coderabbitai bot commented Jan 1, 2025

Walkthrough

This pull request introduces new functions for color and mask conversion in the FiftyOne library, enhancing segmentation evaluation capabilities. Changes are made across three files: fiftyone/core/fields.py, which adds functions for converting between hex and integer color formats; fiftyone/utils/eval/segmentation.py, which updates evaluation methods to track pixel-wise matches; and plugins/panels/model_evaluation.py, which improves the handling of segmentation results in the evaluation panel.

Changes

File Change Summary
fiftyone/core/fields.py Added 4 new color conversion functions: hex_to_int(), int_to_hex(), rgb_array_to_int(), int_array_to_rgb()
fiftyone/utils/eval/segmentation.py Updated SimpleEvaluation and SegmentationResults classes to support pixel-wise match tracking and utilize new color conversion functions
plugins/panels/model_evaluation.py Added methods for processing segmentation results, updated load_evaluation and load_view methods to handle segmentation evaluations

Sequence Diagram

sequenceDiagram
    participant Core as Core Fields
    participant Eval as Segmentation Evaluation
    participant Panel as Evaluation Panel
    
    Core->>Eval: Provide color conversion utilities
    Eval->>Eval: Track pixel-wise matches
    Eval->>Panel: Pass segmentation results
    Panel->>Panel: Process and visualize results

Possibly related PRs

  • model evaluation bug fixes #5166: The changes in this PR enhance the evaluation component, particularly in handling confusion matrices, which relates to the new functions for color and mask conversions in the main PR.
  • fix confusion matrix in model evaluation panel #5186: This PR focuses on fixing the confusion matrix in the model evaluation panel, which is directly related to the new functions introduced in the main PR that handle color data, as confusion matrices often utilize color representations.
  • Show "missing" counts in confusion matrices #5187: This PR adds functionality to show "missing" counts in confusion matrices, which connects to the main PR's enhancements in color and mask handling, as accurate representation of evaluation metrics often relies on proper color coding.
  • Optimize view callbacks for model evaluation panel #5268: This PR optimizes view callbacks for the model evaluation panel, which includes interactions with confusion matrices, linking it to the main PR's new functions that improve color handling in visualizations.

Suggested labels

enhancement

Suggested reviewers

  • imanjra
  • Br2850

Poem

🐰 Colors dance, pixels align,
In FiftyOne's magical design,
Hex to int, a rabbit's delight,
Segmentation metrics shine so bright,
Code transforms with playful grace! 🎨


📜 Recent review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 33f06a8 and 5e45561.

📒 Files selected for processing (2)
  • fiftyone/utils/eval/segmentation.py (11 hunks)
  • plugins/panels/model_evaluation.py (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • fiftyone/utils/eval/segmentation.py
🧰 Additional context used
🪛 Ruff (0.8.2)
plugins/panels/model_evaluation.py

17-17: fiftyone.core.fields imported but unused

Remove unused import: fiftyone.core.fields

(F401)

🔇 Additional comments (5)
plugins/panels/model_evaluation.py (5)

13-13: LGTM: Required imports for segmentation support

The new imports are necessary:

  • ObjectId for handling MongoDB object IDs in segmentation results
  • fiftyone.core.fields for color conversion utilities used in segmentation handling

Also applies to: 17-17


350-358: LGTM: Proper initialization of segmentation results

The changes correctly handle segmentation evaluation by:

  1. Retrieving mask targets from the dataset
  2. Initializing segmentation results with the mask targets

592-661: LGTM: Comprehensive segmentation view handling

The implementation properly handles:

  • Legacy format detection
  • Multiple evaluation comparisons
  • Different view types (class, matrix, field)
  • Ground truth and prediction ID mapping

684-731: LGTM: Robust segmentation results initialization

The function handles multiple input formats elegantly:

  1. Stringified pixel values
  2. RGB hex strings
  3. Label strings

The implementation creates efficient lookup dictionaries for quick access during callbacks.
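The three key formats described above can be normalized before building the lookup dictionaries. Here is a hedged sketch of such a parser — the helper name and precedence order are illustrative assumptions, not FiftyOne's actual implementation:

```python
def parse_mask_target_key(key):
    """Normalize a mask-target key to an integer pixel value or a label.

    Hypothetical sketch: keys may be stringified pixel values ("12"),
    RGB hex strings ("#ff0000"), or plain label strings ("road").
    """
    if isinstance(key, int):
        return key

    if isinstance(key, str):
        if key.startswith("#"):
            # hex color -> packed integer, e.g. "#ff0000" -> 16711680
            return int(key[1:], 16)

        if key.isdigit():
            # stringified pixel value
            return int(key)

    # assume it is already a label string
    return key
```

Normalizing keys up front is what makes constant-time dictionary lookups possible during callbacks, regardless of which mask-target convention the dataset uses.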


733-780: LGTM: Well-organized helper functions

The helper functions are:

  • Well-organized with clear responsibilities
  • Efficiently implemented using dictionary lookups
  • Properly handle ID conversions


@brimoor brimoor changed the title Add Model Evaluation panel callbacks for segmentation tasks (Option 1) Add Model Evaluation panel callbacks for segmentation tasks (Option 2) Jan 1, 2025
@brimoor brimoor force-pushed the segmentation-callbacks2 branch from 33a2dac to 33f06a8 Compare January 1, 2025 00:35

@coderabbitai bot left a comment:
Actionable comments posted: 0

🧹 Nitpick comments (5)
plugins/panels/model_evaluation.py (4)

17-17: Unused import
import fiftyone.core.fields as fof is never referenced. Consider removing it to satisfy linters and reduce clutter.

🧰 Tools
🪛 Ruff (0.8.2)

17-17: fiftyone.core.fields imported but unused

Remove unused import: fiftyone.core.fields

(F401)


485-485: mask_targets2 assigned.
The variable is set here but appears to be overwritten later, leading to redundant assignments.


491-491: mask_targets2 is never effectively used.
Remove it, or integrate it if it is needed; currently it generates lint warnings.

🧰 Tools
🪛 Ruff (0.8.2)

491-491: Local variable mask_targets2 is assigned to but never used

Remove assignment to unused variable mask_targets2

(F841)


685-734: _init_segmentation_results: assembling ID dictionaries.
This function modifies the passed-in results object to map (i, j) pairs to lists of IDs. Be cautious about naming collisions if this is run multiple times; consider clearing any stale data.

fiftyone/core/fields.py (1)

1627-1636: Implementation of hex_to_int.
Simple bit-shift logic is correct. Provide error handling for malformed hex strings if user input is allowed.
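The suggested validation could look like the following. This is a sketch of the reviewer's suggestion, not the PR's actual implementation (which, per the comment, omits error handling):

```python
import re

def hex_to_int(hex_str):
    # Pack "#RRGGBB" into a single integer via bit shifts.
    # Validation added per the review suggestion; the actual
    # implementation may not include it.
    if not re.fullmatch(r"#[0-9a-fA-F]{6}", hex_str):
        raise ValueError(
            "Expected a color in #RRGGBB format, got %r" % hex_str
        )

    r = int(hex_str[1:3], 16)
    g = int(hex_str[3:5], 16)
    b = int(hex_str[5:7], 16)
    return (r << 16) + (g << 8) + b
```

The `fullmatch` guard turns a cryptic `ValueError` from `int(..., 16)` (or silently wrong output for strings of the wrong length) into a clear error message at the API boundary.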

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fddb6a4 and 33a2dac.

📒 Files selected for processing (3)
  • fiftyone/core/fields.py (1 hunks)
  • fiftyone/utils/eval/segmentation.py (11 hunks)
  • plugins/panels/model_evaluation.py (5 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
plugins/panels/model_evaluation.py

17-17: fiftyone.core.fields imported but unused

Remove unused import: fiftyone.core.fields

(F401)


491-491: Local variable mask_targets2 is assigned to but never used

Remove assignment to unused variable mask_targets2

(F841)

🔇 Additional comments (25)
fiftyone/utils/eval/segmentation.py (13)

11-11: The added import is properly used for creating repeat iterators.
No issues identified; it is used in the _from_dict method when handling “no ID” scenarios.


337-337: Validate the hex string keys for consistent usage.
This dictionary comprehension properly converts hex string keys to integers. Consider verifying that all user-supplied hex strings follow the #RRGGBB format before conversion to avoid potential ValueError.


353-353: Initialization of the matches list.
No issues here; it neatly collects pixel-wise mapping data for subsequent analysis.


396-406: Appending match details for segmentation.
This loop accurately records ground truth/prediction associations and pixel counts. However, this can grow large for massive images or datasets. Be mindful of memory usage if used repeatedly in large-scale evaluations.
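The per-pair pixel counting described here can also be computed in a vectorized fashion rather than per pixel; a rough NumPy sketch (the function name is illustrative, and the PR's actual loop additionally tracks label IDs):

```python
import numpy as np

def count_pixel_matches(gt_mask, pred_mask):
    # For each (gt_value, pred_value) combination, count how many
    # pixels carry it -- the raw material for a confusion matrix
    pairs = np.stack([gt_mask.ravel(), pred_mask.ravel()], axis=1)
    uniq, counts = np.unique(pairs, axis=0, return_counts=True)
    return [(int(g), int(p), int(n)) for (g, p), n in zip(uniq, counts)]

gt = np.array([[1, 1], [2, 0]])
pred = np.array([[1, 2], [2, 0]])
```

Note that the output grows with the number of distinct (gt, pred) class pairs, not with image size, which keeps the memory concern raised above bounded by the label space rather than pixel count.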


440-440: Passing the newly built matches to SegmentationResults.
Clean approach to provide the collected matches in the results constructor.


455-457: Docstring accurately reflects the new matches field.
The description matches the tuple structure from the evaluation loop.


469-469: New matches parameter defaults to None.
This is a good backward-compatible signature update.


Line range hint 475-492: Conditional handling of matches.
The fallback to parse pixel_confusion_matrix when matches is None ensures compatibility with legacy workflows. Watch for potential ValueError if ytrue, ypred, weights do not align in length.


510-529: Consistent _from_dict logic for matches.
Correctly handles both new and legacy (no IDs) formats, merging them into a uniform list of tuples.


534-534: Passing the reconstructed matches in _from_dict.
Good consistency with the constructor.


594-597: RGB to integer masking for dimensional consistency.
Properly uses fof.rgb_array_to_int to handle multi-channel arrays.


670-670: Use of new utility for RGB array conversion.
Reusing fof.rgb_array_to_int avoids code duplication.


677-677: Generating hex class labels from integer-based values.
Efficient approach for color-coded segmentation classes.

plugins/panels/model_evaluation.py (8)

13-13: Necessary import for ObjectId usage.
This is used in _to_object_ids.


350-358: Loading segmentation results and initializing them.
Assigning mask_targets and calling _init_segmentation_results is a clear approach to unify the data before proceeding with metrics. Make sure to handle any potential logging or warnings if results are partially missing.


596-611: Segmentations with legacy format.
Early returns handle older data where IDs don’t exist. Ensure end users receive a clear message if early-exiting leads to partial data in the UI.


612-664: Match expressions for segmentation subviews.
This logic effectively filters segmentation results by class/matrix cell. It might be beneficial to confirm performance on large datasets, as multiple .is_in() calls could be costly.


736-752: _get_segmentation_class_ids: retrieving matching IDs by class.
Check for key existence in results._classes_map[x] to avoid KeyError if x is not recognized.


755-760: _get_segmentation_conf_mat_ids: confusion matrix IDs.
Straightforward approach to isolate matches. This is well-structured.


762-780: _get_segmentation_tp_fp_fn_ids: basic classification logic for pixel-level segmentation.
The approach is consistent with typical definitions of TP, FP, and FN. If large sets are expected, consider memory usage.
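The classification rule referenced here reduces to three boolean index masks over the flattened match arrays; a hedged NumPy illustration (the array names mirror the PR's results object, but this is not the actual code, and `missing = 0` is an assumed background value):

```python
import numpy as np

missing = 0  # assumed "missing"/background value
ytrue = np.array([1, 2, 0, 3])
ypred = np.array([1, 0, 2, 3])

tp = ytrue == ypred    # prediction matches ground truth
fn = ypred == missing  # ground truth region with no prediction
fp = ytrue == missing  # predicted region with no ground truth
```

Because these are plain boolean arrays, the memory cost scales linearly with the number of matches, which is the usage concern the comment raises for very large evaluations.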


782-783: _to_object_ids: converting string IDs to ObjectId.
Simple utility that is helpful for consistent typed usage. Ensure _id is always a valid string to avoid conversion errors.

fiftyone/core/fields.py (4)

1624-1625: hex_to_int function declaration and docstring.
Docstring is clear; confirm that input always starts with '#' and contains exactly 6 hex characters.


1639-1652: int_to_hex: Reverse conversion from int to hex.
Logic is standard. No issues observed.


1654-1668: rgb_array_to_int: Transforming 3D RGB arrays to 2D integer arrays.
The use of NumPy bit-shifts is efficient and readable. Ensure mask is always [..., 3] shape or raise warnings.


1670-1684: int_array_to_rgb: Restoring 3D RGB arrays from integer-based masks.
Works in parallel with rgb_array_to_int. Usage of np.stack is correct.
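The pair of array conversions reviewed above can be sketched as follows — a minimal stand-in, assuming the documented bit-shift/`np.stack` approach; the actual FiftyOne implementations may differ in dtype handling and validation:

```python
import numpy as np

def rgb_array_to_int(mask):
    # (..., 3) uint8 RGB array -> (...) packed-integer array
    mask = mask.astype(np.uint32)
    return (mask[..., 0] << 16) | (mask[..., 1] << 8) | mask[..., 2]

def int_array_to_rgb(mask):
    # packed-integer array -> (..., 3) uint8 RGB array
    return np.stack(
        [(mask >> 16) & 255, (mask >> 8) & 255, mask & 255], axis=-1
    ).astype(np.uint8)

rgb = np.array([[[255, 0, 0], [0, 128, 255]]], dtype=np.uint8)
packed = rgb_array_to_int(rgb)
```

The round trip is lossless for 8-bit channels, which is what lets the evaluation code work on 2D integer masks internally and convert back to RGB only for display.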

Comment on lines +455 to +457
matches (None): a list of
``(gt_label, pred_label, pixel_count, gt_id, pred_id)``
matches
Contributor:
I like this data model better - reads cleaner and like you said avoids calling _parse_confusion_matrix when present.

Comment on lines +517 to +524
if ytrue_ids is None:
ytrue_ids = itertools.repeat(None)

if ypred_ids is None:
ypred_ids = itertools.repeat(None)

Contributor:
Suggested change
if ytrue_ids is None:
    ytrue_ids = itertools.repeat(None)
if ypred_ids is None:
    ypred_ids = itertools.repeat(None)

Contributor:
Nit: Won't need the None check here because of the previous if-statement

Contributor Author:
Actually they are required:

In previous versions of this code, ytrue, ypred, and weights were not persisted as properties of SegmentationResults. If a user loads such a pre-existing segmentation evaluation and then calls results.save(), this will create a new version of the results that does persist ytrue, ypred and weights (as parsed from _parse_confusion_matrix()). However, there still won't be ytrue_ids and ypred_ids for these results, so these if statements are needed to ensure that the next time these results are loaded, we'll be able to construct the matches object.
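The backward-compatibility backfill described above can be sketched as follows (`build_matches` is an illustrative name, not the actual method):

```python
import itertools

def build_matches(ytrue, ypred, weights, ytrue_ids=None, ypred_ids=None):
    # Legacy results persist ytrue/ypred/weights but not label IDs;
    # repeat(None) lets zip() pair every match with (None, None)
    # without materializing a placeholder list
    if ytrue_ids is None:
        ytrue_ids = itertools.repeat(None)
    if ypred_ids is None:
        ypred_ids = itertools.repeat(None)

    return list(zip(ytrue, ypred, weights, ytrue_ids, ypred_ids))
```

`itertools.repeat(None)` is an infinite iterator, so `zip()` terminates with the shortest input (the real match arrays) while still producing uniformly shaped tuples.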

Contributor:
Makes sense!

Comment on lines +636 to +637
expr = F(gt_id).is_in(ytrue_ids)
expr &= F(pred_id).is_in(ypred_ids)
Contributor:
Mmmm, I see this expression should evaluate much faster than select_labels since you are specifying which labels to look for in which fields. Nice

Contributor Author:
Exactly!

Comment on lines +761 to +776
if field == "tp":
    # True positives
    inds = results.ytrue == results.ypred
    ytrue_ids = _to_object_ids(results.ytrue_ids[inds])
    ypred_ids = _to_object_ids(results.ypred_ids[inds])
    return ytrue_ids, ypred_ids
elif field == "fn":
    # False negatives
    inds = results.ypred == results.missing
    ytrue_ids = _to_object_ids(results.ytrue_ids[inds])
    return ytrue_ids, None
else:
    # False positives
    inds = results.ytrue == results.missing
    ypred_ids = _to_object_ids(results.ypred_ids[inds])
    return None, ypred_ids
Contributor:
Would we want to move this tp/fp/fn calculation to utils/eval/segmentation.py and make it a sample level field so we can filter on it - similar to detections?

Contributor Author:
I'm not sure what we'd store as a sample-level field. The TP/FP/FN designation has to do with each region in the segmentation mask, so there would usually be multiple per sample (like object detection tasks for example), but the mask is stored in a single Segmentation field (unlike object detection).

Contributor:
Hmm.. I see what you are saying- but I guess my confusion is arising from the fact that if we are marking labels as TP/FP/FN, we should be able to filter by it at the sample level too.

Contributor Author:
I agree that it would be nice to support more sample-level filtering, but I don't know what to do!

Contributor:
Yep understood. Makes sense to leave as is for now then. Maybe something to discuss with the ML team

@prernadh prernadh self-requested a review January 2, 2025 17:41
@prernadh (Contributor) left a comment:
Just a question about high level TP/FP/FN design

Comment on lines +639 to +661
elif view_type == "field":
if field == "tp":
# All true positives
ytrue_ids, ypred_ids = _get_segmentation_tp_fp_fn_ids(
results, field
)
expr = F(gt_id).is_in(ytrue_ids)
expr &= F(pred_id).is_in(ypred_ids)
view = eval_view.match(expr)
elif field == "fn":
# All false negatives
ytrue_ids, _ = _get_segmentation_tp_fp_fn_ids(
results, field
)
expr = F(gt_id).is_in(ytrue_ids)
view = eval_view.match(expr)
else:
# All false positives
_, ypred_ids = _get_segmentation_tp_fp_fn_ids(
results, field
)
expr = F(pred_id).is_in(ypred_ids)
view = eval_view.match(expr)
Contributor:
We currently don't display FP/FN/TP in the summary table- get_tp_fp_fn function will have to be updated for segmentations if we ever want to reach this section of the code is my understanding

@brimoor (Contributor Author) commented Jan 2, 2025:
Yeah I went ahead and implemented the callback so it would be available if we found a reasonable way to show TP/FP/FN icons in the panel. Not quite sure what the best way to show this info would be though.

Contributor:
Oh I understand now! Realized you had also left comment on the other PR explaining this. Apologies

@@ -424,6 +437,7 @@ def evaluate_samples(
eval_key,
confusion_matrix,
classes,
matches=matches,
Contributor:
Suggested change
matches=matches,
matches=matches if matches!=[] else None,

Contributor:
This line is causing tests to fail

Contributor Author:
Good catch! Just fixed this in a slightly different way, for consistency with how object detection handles this:

if matches:
    ytrue, ypred, ious, confs, ytrue_ids, ypred_ids = zip(*matches)
else:
    ytrue, ypred, ious, confs, ytrue_ids, ypred_ids = (
        [],
        [],
        [],
        [],
        [],
        [],
    )

@brimoor brimoor force-pushed the segmentation-callbacks2 branch from 33f06a8 to 5e45561 Compare January 2, 2025 23:44
@prernadh prernadh self-requested a review January 3, 2025 02:14
@prernadh (Contributor) left a comment:
LGTM! 😄
