
Add Model Evaluation panel callbacks for segmentation tasks (Option 1) #5331

Open: brimoor wants to merge 3 commits into develop from the segmentation-callbacks branch
Conversation

@brimoor (Contributor) commented Dec 31, 2024

Example usage

import random

import fiftyone as fo
import fiftyone.zoo as foz
import fiftyone.core.fields as fof
import fiftyone.utils.labels as foul

dataset = foz.load_zoo_dataset("quickstart", max_samples=10, shuffle=True)

model = foz.load_zoo_model("deeplabv3-resnet101-coco-torch")
dataset.apply_model(model, "resnet101")

model = foz.load_zoo_model("deeplabv3-resnet50-coco-torch")
dataset.apply_model(model, "resnet50")

CLASSES = (
    "background,aeroplane,bicycle,bird,boat,bottle,bus,car,cat,chair,cow," +
    "diningtable,dog,horse,motorbike,person,pottedplant,sheep,sofa,train," +
    "tvmonitor"
)
classes = CLASSES.split(",")
mask_targets = {idx: label for idx, label in enumerate(classes)}

rgb_mask_targets = {
    fof.int_to_hex(random.getrandbits(24)): label  # random 24-bit RGB colors
    for label in classes
}

_mask_targets = {v: k for k, v in mask_targets.items()}
_rgb_mask_targets = {v: k for k, v in rgb_mask_targets.items()}
targets_map = {_mask_targets[c]: _rgb_mask_targets[c] for c in classes}

dataset.clone_sample_field("resnet101", "resnet101_rgb")
foul.transform_segmentations(
    dataset,
    "resnet101_rgb",
    targets_map,
)

dataset.clone_sample_field("resnet50", "resnet50_rgb")
foul.transform_segmentations(
    dataset,
    "resnet50_rgb",
    targets_map,
)

dataset.mask_targets["resnet101"] = mask_targets
dataset.mask_targets["resnet50"] = mask_targets
dataset.mask_targets["resnet101_rgb"] = rgb_mask_targets
dataset.mask_targets["resnet50_rgb"] = rgb_mask_targets
dataset.save()

# Evaluation with int mask targets
dataset.evaluate_segmentations(
    "resnet50",
    gt_field="resnet101",
    eval_key="eval",
)

# Evaluation with RGB mask targets
dataset.evaluate_segmentations(
    "resnet50_rgb",
    gt_field="resnet101_rgb",
    eval_key="eval_rgb",
    mask_targets=dataset.mask_targets["resnet50_rgb"],
)

session = fo.launch_app(dataset)

Summary by CodeRabbit

  • New Features

    • Added utility functions for color and mask conversions.
    • Enhanced segmentation evaluation capabilities with detailed metrics tracking.
    • Improved model evaluation panel with advanced class mapping and ID handling.
  • Improvements

    • Streamlined color conversion processes across different modules.
    • Added more granular tracking of segmentation evaluation results.
    • Expanded support for complex evaluation scenarios in the model evaluation panel.

@brimoor brimoor requested review from imanjra and prernadh December 31, 2024 19:32
coderabbitai bot commented Dec 31, 2024

Walkthrough

This pull request introduces new utility functions for color and mask conversions in the FiftyOne library. The changes primarily focus on adding methods to convert between hex color strings and integer representations, and to transform RGB mask arrays. The modifications span multiple files, including fields.py, segmentation.py, and model_evaluation.py, enhancing the library's capabilities for handling color and segmentation-related operations.

Changes

  • fiftyone/core/fields.py: Added utility functions hex_to_int(), int_to_hex(), rgb_array_to_int(), and int_array_to_rgb() for color and mask conversions.
  • fiftyone/utils/eval/segmentation.py: Updated evaluate_samples() and _parse_confusion_matrix() to handle new dictionaries for weights and IDs; removed local conversion functions.
  • plugins/panels/model_evaluation.py: Added get_classes_map(), _to_object_ids(), and _get_ids_slice() methods to improve segmentation evaluation functionality.

Possibly related PRs

  • model evaluation bug fixes #5166: This PR enhances the Evaluation component, which is related to the main PR's focus on color and mask conversions, as it also deals with evaluation metrics and may utilize the new color conversion functions for visual representation.
  • fix confusion matrix in model evaluation panel #5186: This PR modifies the Evaluation.tsx component to improve confusion matrix handling, which could directly benefit from the new color conversion functions introduced in the main PR for visualizing confusion matrices.
  • Show "missing" counts in confusion matrices #5187: This PR focuses on enhancing confusion matrices by including "missing" counts, which may require color conversions for better visualization, linking it to the new functions in the main PR.
  • Optimize view callbacks for model evaluation panel #5268: This PR optimizes view callbacks for the model evaluation panel, which may involve visual elements that could utilize the new color conversion functions from the main PR for improved user interaction and data representation.

Suggested labels

enhancement

Suggested reviewers

  • imanjra



def _parse_index_dict(d):
    import ast

    return {ast.literal_eval(k): v for k, v in d.items()}
@brimoor (Contributor, Author) commented Dec 31, 2024
This is probably slow; may want to consider other data models so we don't need to do this
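For context, here is a small illustration of what this helper round-trips: BSON documents cannot store tuple keys, so the (i, j) confusion matrix indices are serialized as strings and parsed back with ast.literal_eval on load (toy data below; the real dicts hold per-cell pixel counts and label IDs):

```python
import ast

def parse_index_dict(d):
    # Parse stringified tuple keys like "(0, 1)" back into real tuples
    return {ast.literal_eval(k): v for k, v in d.items()}

stored = {"(0, 1)": [101, 102], "(2, 3)": [103]}
parsed = parse_index_dict(stored)
# parsed == {(0, 1): [101, 102], (2, 3): [103]}
```

Because every key must be evaluated individually, the cost scales with the number of populated confusion matrix cells, which is the performance concern noted above.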

@@ -585,6 +610,89 @@ def load_view(self, ctx):
        view = eval_view.filter_labels(
            pred_field, F(eval_key) == field, only_matches=True
        )
    elif info.config.type == "segmentation":
        results = ctx.dataset.load_evaluation_results(eval_key)
@brimoor (Contributor, Author) commented Dec 31, 2024
Dataset objects do cache evaluation results, so if load_evaluation_results() is called multiple times on the same dataset singleton instance, the results only have to be loaded from the database once. However, I don't think we can count on multiple API requests re-using the same dataset object...

An alternative would be to store the necessary data as panel state. However that is also not desirable because the necessary data here is potentially huge lists of IDs, and panel state is serialized with every panel callback, not just load_view().

So, the best pattern for this case needs some thought.

@coderabbitai bot (Contributor) left a comment
Actionable comments posted: 0

🧹 Nitpick comments (4)
fiftyone/utils/eval/segmentation.py (2)

397-401: **Consider memory implications when storing pixel counts **

Appending large numbers of pixel counts into dictionaries can lead to high memory usage for large images. If memory usage becomes an issue, consider a more aggregated representation, or store only necessary data.


402-409: **Storing IDs for each confusion matrix cell may incur high overhead **

Storing ground truth and predicted IDs per pixel index is powerful but can become costly in memory for large evaluations. Evaluate whether storing all IDs is essential or if a more aggregated approach may suffice.

plugins/panels/model_evaluation.py (2)

17-17: **Remove unused import to avoid clutter **

Static analysis indicates that fiftyone.core.fields as fof is not referenced in this file. Consider removing it.

- import fiftyone.core.fields as fof
🧰 Tools
🪛 Ruff (0.8.2)

17-17: fiftyone.core.fields imported but unused

Remove unused import: fiftyone.core.fields

(F401)


509-509: **mask_targets2 assigned but never used **

You're assigning mask_targets2 but never referencing it later. Remove or repurpose this variable if needed.

- mask_targets2 = self.get_mask_targets(ctx.dataset, gt_field2)
🧰 Tools
🪛 Ruff (0.8.2)

509-509: Local variable mask_targets2 is assigned to but never used

Remove assignment to unused variable mask_targets2

(F841)

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fddb6a4 and 3e9a9cc.

📒 Files selected for processing (3)
  • fiftyone/core/fields.py (1 hunks)
  • fiftyone/utils/eval/segmentation.py (10 hunks)
  • plugins/panels/model_evaluation.py (5 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
plugins/panels/model_evaluation.py

17-17: fiftyone.core.fields imported but unused

Remove unused import: fiftyone.core.fields

(F401)


509-509: Local variable mask_targets2 is assigned to but never used

Remove assignment to unused variable mask_targets2

(F841)

🔇 Additional comments (26)
fiftyone/utils/eval/segmentation.py (17)

336-336: **Ensure mask_targets keys are valid hex strings **

While converting hex string keys to integers is sensible, ensure that all mask values are valid 6-digit RGB hex strings. Hex strings with alpha channels or incorrect length will fail this conversion.


352-354: **Good initialization of dictionaries for enhanced evaluation details **

These dictionaries are properly initialized to capture additional segmentation details. This approach looks solid.


443-445: **Properly passing dictionaries to SegmentationResults **

The new arguments clearly convey the additional pixel-level evaluation data.


460-464: **Updated docstrings accurately document new parameters **

These docstrings help clarify the dict mappings of pixel counts and label IDs.


476-478: **Optional dictionary parameters conform to default usage **

Allowing None for each dict field is a clean approach for backward compatibility.


483-495: **_parse_confusion_matrix usage is consistent **

This invocation with the new dict parameters ensures the extended data is processed.


505-506: **Retaining ytrue_ids / ypred_ids in constructor **

Storing ID arrays at the results level is appropriate for post-processing.


513-515: **Instance-level attributes promote refined analysis **

Storing the dictionaries in the result object makes them readily accessible anywhere in the evaluation report.


518-526: **Attributes list updated for new evaluation fields **

Ensuring these new fields are included in the serialization pipeline is good practice.


544-546: **Parsing index dictionaries for reloading results **

Deserializing into the correct data structure preserves ID associations after loading.


552-568: **_parse_confusion_matrix method gracefully handles extended params **

Optional IDs and weights are integrated neatly, boosting segmentation insights.


576-595: **Populating arrays with IDs and weights **

This loop effectively populates ytrue/ypred/weights arrays while maintaining ID traces for advanced analysis.


598-600: **Utility function for safe dictionary parsing **

Neatly evaluates string keys into (i, j) tuples.


642-642: **3D to 2D mask check for pred_mask **

Ensures that RGB masks are converted when performing confusion matrix calculations.


645-645: **3D to 2D mask check for gt_mask **

Same practice for ground truth masks, guaranteeing consistent shapes.


718-718: **Delegating RGB array conversion to core utility **

Replacing local conversions with fof.rgb_array_to_int promotes code reuse and consistency across the library.


725-725: **Verifying conversions to hex strings **

Generating class labels as hex strings is convenient for color-based displays. Ensure all consumers expect hex strings consistently in the pipeline.

plugins/panels/model_evaluation.py (5)

13-13: **ObjectId import is valid for ID handling **

Used for label ID manipulation.


330-349: **New method get_classes_map ensures consistent class indexing **

This method gracefully converts results.classes into label-to-index mappings, honoring mask targets if available.


614-695: **Enhanced segmentation evaluation loading **

This block elegantly handles legacy vs. new segmentation evaluations. Using _get_ids_slice and _to_object_ids for retrieving pixel-level IDs is a robust approach.


718-720: **_to_object_ids utility is concise **

This function properly converts string IDs to ObjectIds for consistent lookups.


722-737: **_get_ids_slice retrieves segmentation IDs for class-based filtering **

This helper merges row- and column-based IDs, enabling slice-based analyses of confusion matrix cells.

fiftyone/core/fields.py (4)

1624-1637: **hex_to_int method for color conversion **

Straightforwardly shifts each RGB component and merges them, enabling integer-based color representation.


1639-1652: **int_to_hex correctly recreates hex strings **

Effectively masks out each channel and formats them as a #RRGGBB string.


1654-1668: **rgb_array_to_int for 3D to 2D integer mask **

Leveraging bit shifts is a clean approach for combining RGB channels.


1670-1684: **int_array_to_rgb for reversing integer masks **

Complements rgb_array_to_int by splitting stored values back into RGB. This maintains color consistency across the pipeline.
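Based on the review descriptions above (per-channel bit shifts, & 255 masking, np.stack), the four utilities can be sketched as follows. This is an illustrative reimplementation under those assumptions, not the exact FiftyOne source:

```python
import numpy as np

def hex_to_int(hex_str):
    # "#RRGGBB" -> 24-bit integer
    r, g, b = hex_str[1:3], hex_str[3:5], hex_str[5:7]
    return (int(r, 16) << 16) + (int(g, 16) << 8) + int(b, 16)

def int_to_hex(value):
    # 24-bit integer -> "#RRGGBB"
    r = (value >> 16) & 255
    g = (value >> 8) & 255
    b = value & 255
    return "#%02x%02x%02x" % (r, g, b)

def rgb_array_to_int(mask):
    # (H, W, 3) uint8 RGB mask -> (H, W) integer mask
    return (
        np.left_shift(mask[:, :, 0].astype(int), 16)
        + np.left_shift(mask[:, :, 1].astype(int), 8)
        + mask[:, :, 2].astype(int)
    )

def int_array_to_rgb(mask):
    # (H, W) integer mask -> (H, W, 3) uint8 RGB mask
    return np.stack(
        [(mask >> 16) & 255, (mask >> 8) & 255, mask & 255], axis=2
    ).astype(np.uint8)
```

For example, hex_to_int("#ff6d04") gives 16739588, and int_array_to_rgb(rgb_array_to_int(mask)) round-trips an RGB mask unchanged.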

if view_type == "class":
    # All GT/predictions that contain class `x`
    k = classes_map[x]
    ytrue_ids, ypred_ids = _get_ids_slice(results, k)
@brimoor (Contributor, Author) commented Dec 31, 2024
These are the callbacks that fire when a user clicks on a cell in the Class Performance section. The current data model requires building up a list of IDs for entire row/column slices of the ID dicts. We could potentially optimize the data model for this case. Needs some more thought.

i = classes_map[x]
j = classes_map[y]
ytrue_ids = _to_object_ids(
    results.ytrue_ids_dict.get((i, j), [])
@brimoor (Contributor, Author) commented
Confusion matrix callbacks require the IDs for a particular (i, j) entry; so the current data model is roughly optimal for this case.

elif view_type == "field":
    if field == "tp":
        # All true positives
        inds = results.ytrue == results.ypred
@brimoor (Contributor, Author) commented Dec 31, 2024
Segmentation tasks do not currently show TP/FP/FN counts in the Summary section. But if we were to add them, then this implements the callback for the "grid" icons that would appear.
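If such counts were added, the boolean-mask selection in the snippet above would work as follows (toy arrays standing in for results.ytrue / results.ypred):

```python
import numpy as np

# Toy arrays standing in for results.ytrue / results.ypred
ytrue = np.array(["cat", "dog", "dog", "bird"])
ypred = np.array(["cat", "dog", "bird", "bird"])

inds = ytrue == ypred    # elementwise match -> true positive positions
tp_labels = ytrue[inds]  # labels at matching positions
# list(tp_labels) == ["cat", "dog", "bird"]
```

The same boolean index could then be applied to the parallel ID arrays to recover the label IDs backing each true positive.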

    ytrue_ids_dict=None,
    ypred_ids_dict=None,
):
    have_ids = ytrue_ids_dict is not None and ypred_ids_dict is not None
@brimoor (Contributor, Author) commented
This is a non-trivial amount of data transformation, which, as currently implemented, must be performed every time a SegmentationResults object is loaded. May want to consider optimizing the data model here for speed.

@brimoor brimoor force-pushed the segmentation-callbacks branch from 3e9a9cc to 663adf8 Compare December 31, 2024 21:09
@coderabbitai bot (Contributor) left a comment
Actionable comments posted: 0

🧹 Nitpick comments (3)
fiftyone/utils/eval/segmentation.py (1)

397-409: Populate per-pixel tracking dictionaries carefully

Storing pixel-level details in weights_dict, ytrue_ids_dict, and ypred_ids_dict is a powerful approach, but keep an eye on memory usage for large images. Consider streaming or chunking if images get large. Aside from potential memory overhead, the logic is correct—particularly the check if index not in dict, then append.

plugins/panels/model_evaluation.py (2)

17-17: Unused import “fiftyone.core.fields as fof”

According to static analysis, fof is imported here but never referenced in this file. Consider removing it to clarify what is actually needed.

Here's a suggested diff:

-import fiftyone.core.fields as fof
🧰 Tools
🪛 Ruff (0.8.2)

17-17: fiftyone.core.fields imported but unused

Remove unused import: fiftyone.core.fields

(F401)


503-503: Unused local variable “mask_targets2”

mask_targets2 is assigned but never used. Consider removing this assignment or incorporating it if it’s needed for future logic.

Here's a suggested diff:

-        mask_targets2 = mask_targets
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3e9a9cc and 663adf8.

📒 Files selected for processing (3)
  • fiftyone/core/fields.py (1 hunks)
  • fiftyone/utils/eval/segmentation.py (10 hunks)
  • plugins/panels/model_evaluation.py (5 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
plugins/panels/model_evaluation.py

17-17: fiftyone.core.fields imported but unused

Remove unused import: fiftyone.core.fields

(F401)


509-509: Local variable mask_targets2 is assigned to but never used

Remove assignment to unused variable mask_targets2

(F841)

🔇 Additional comments (29)
fiftyone/core/fields.py (2)

1624-1651: Add color-to-integer conversion utility: looks solid

These functions correctly convert between a hex string (e.g. "#ff6d04") and an integer representation. The bit-shift operations and string-slicing logic appear correct for typical 24-bit hex colors. Implementation is straightforward and performs well for single color conversions.


1654-1684: RGB mask array conversions are well-implemented

The rgb_array_to_int and int_array_to_rgb functions correctly perform per-pixel conversions via left/right bit shifts. The usage of np.stack is clean, and the choice of 8-bit masking (& 255) is standard for RGB. The Dtype considerations look fine.

fiftyone/utils/eval/segmentation.py (15)

336-336: Add RGB mask support for integer keys

Conditionally converting RGB hex mask targets to integer keys here is a neat approach, ensuring internal consistency of mask values. Matches nicely with the newly introduced color conversion utilities.


352-354: Use dictionaries for advanced segmentation tracking

The addition of weights_dict, ytrue_ids_dict, and ypred_ids_dict paves the way for storing pixel counts and IDs mapped to confusion matrix indices. This is a good design for deeper analyses (e.g., retrieving the specific segmentation instances).


443-445: Key-value argument passing of dictionaries to SegmentationResults

Passing your new dictionaries in the constructor is a clean approach for reusability. This chunk is straightforward.


460-464: Documentation for new segmentation dictionaries

The docstring nicely clarifies their purpose. Ensuring the user is aware these store per-(i, j) data for confusion matrix usage is helpful.


476-478: Constructor arguments initialization for new dictionaries

They are appropriately defaulted to None. Implementation is consistent with the docstring.


483-490: Orderly parse of confusion-matrix dictionaries

This lines up with _parse_confusion_matrix, extracting the relevant arrays from the confusion matrix before super-constructor is invoked. Straightforward approach.


505-506: Passing per-instance ID arrays up the chain

The addition of ytrue_ids and ypred_ids to the super constructor is consistent. Looks good.


513-515: Store new dictionaries for advanced analysis

Saving weights_dict, ytrue_ids_dict, ypred_ids_dict in SegmentationResults is beneficial. Doing so promotes introspection usage for advanced callbacks.


518-526: Extending attributes() to include new fields

Listing weights_dict, ytrue_ids_dict, and ypred_ids_dict ensures they're persisted in the results. This is consistent with the prior pattern.


544-546: Deserialization of new dictionaries

The _parse_index_dict() usage for weights_dict, ytrue_ids_dict, ypred_ids_dict ensures these dictionaries are re-hydrated appropriately from stored data.


552-560: Refactored parse of confusion matrix with per-pixel data

This function elegantly handles the presence or absence of per-instance IDs. Good conditional logic to handle legacy vs. new style.


598-601: Use of ast.literal_eval to parse back dictionary keys

This is a safe, built-in approach for re-constructing tuple keys. It's also quicker than alternatives for small data. Implementation is straightforward.


642-642: Applying rgb_array_to_int to predicted masks

This aligns with your approach of standardizing mask arrays as integer masks. Implementation is minimal and reduces special-case logic downstream.


645-645: Applying rgb_array_to_int to ground truth masks

Same standardization logic as for predicted masks. Good consistency.


718-718: Converting segmentation masks to int arrays in _get_mask_values

Reusing your rgb_array_to_int utility is consistent with the rest of the pipeline. This helps unify how mask arrays are processed.

plugins/panels/model_evaluation.py (12)

13-13: Import note: “bson.ObjectId” is used but no issues here

The ObjectId import is correctly used for _to_object_ids(). This is valid and necessary.


330-349: New get_classes_map method for segmentation

  1. Reads mask targets from the dataset, converts keys to string, then re-maps them to label strings if present.
  2. Returns a dictionary mapping class label → index.
  3. This is succinct and handles varied class representations well.

Message: Looks good.
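As a rough sketch of the mapping this method produces (hypothetical inputs; the real method reads the mask targets from ctx.dataset and the ordered class list from results.classes):

```python
# Hypothetical raw class values and optional mask targets
classes = ["0", "1", "2"]
mask_targets = {"0": "background", "1": "cat", "2": "dog"}

# Re-map raw values to display labels where mask targets exist
labels = [mask_targets.get(c, c) for c in classes]

# Final mapping: class label -> confusion matrix index
classes_map = {label: idx for idx, label in enumerate(labels)}
# classes_map == {"background": 0, "cat": 1, "dog": 2}
```

The indices in this map are what the callbacks below use to index into the confusion matrix dictionaries.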


498-498: Retrieving mask targets for the predicted field

This is consistent with the code’s usage for segmentation tasks. Straightforward single-line usage.


614-616: Load segmentation results and check new dictionary fields

This logic ensures the loaded evaluation results contain advanced dictionaries (ytrue_ids_dict, ypred_ids_dict). If they’re missing, the code gracefully treats it as a legacy evaluation. Good approach for backward compatibility.


627-630: Similarly load second evaluation if provided

Calls get_classes_map for the second results object. Straightforward logic for side-by-side comparisons.


631-637: Handle legacy segmentation for the second evaluation

Same backward compatibility logic. This is consistent with the prior approach.


640-648: Retrieve ObjectId fields for ground truth and predictions

The _get_label_field_path usage is correct to retrieve the _id field name. This ensures correct references for the next queries.


649-662: Viewing all GT/predictions of class x

This block retrieves the object IDs from _get_ids_slice() and matches them in the dataset. Straightforward approach to filter by relevant IDs.


663-675: Confusion matrix cell view for a single (x, y) pair

Indices are mapped from the classes, and the stored dictionaries are used to retrieve relevant IDs. This approach is efficient.


676-695: Handling TP/FN/FP views for segmentation

You’re using the arrays in results to identify the relevant sample sets. Matching on _id fields works well. This is consistent with the dictionary logic from earlier.


716-720: Utility method _to_object_ids

Simple function that casts the provided list of strings to ObjectId. The design is straightforward and safe.


722-737: Slice function _get_ids_slice for row/col in pixel_confusion_matrix

  1. Loops across columns for ground truth row k, then across rows for predicted column k.
  2. Aggregates lists of IDs from your dictionaries.
  3. Returns them as ObjectIds.

This is an effective approach for partial slicing of confusion matrix data.
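The row/column aggregation described above can be sketched like so. This is an illustration of the slicing logic only; the actual helper takes the results object and additionally casts the collected IDs to ObjectIds:

```python
def get_ids_slice(ytrue_ids_dict, ypred_ids_dict, num_classes, k):
    # Gather GT IDs from row k and predicted IDs from column k
    # of the (i, j)-keyed confusion matrix dictionaries
    ytrue_ids, ypred_ids = [], []
    for j in range(num_classes):
        ytrue_ids.extend(ytrue_ids_dict.get((k, j), []))
    for i in range(num_classes):
        ypred_ids.extend(ypred_ids_dict.get((i, k), []))
    return ytrue_ids, ypred_ids

ytrue_ids_dict = {(1, 0): ["a"], (1, 2): ["b"], (0, 1): ["c"]}
ypred_ids_dict = {(0, 1): ["d"], (2, 1): ["e"]}
yt, yp = get_ids_slice(ytrue_ids_dict, ypred_ids_dict, 3, 1)
# yt == ["a", "b"]; yp == ["d", "e"]
```

This also makes the earlier performance observation concrete: a class-level callback must touch up to 2 * num_classes dictionary entries, whereas a single confusion matrix cell needs only one lookup.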

@brimoor brimoor changed the title Add Model Evaluation panel callbacks for segmentation tasks Add Model Evaluation panel callbacks for segmentation tasks (Option 1) Jan 1, 2025