[BUG] Model Evaluation panel: load_evaluation runtime error #5254

Closed
deltheil opened this issue Dec 10, 2024 · 4 comments
Labels
bug Bug fixes

Comments

@deltheil

Describe the problem

When using the new model evaluation panel from FiftyOne v1.1.0 on the vanilla tutorial, things work like a charm.

However, it fails with an unsupported operand type(s) for +: 'int' and 'NoneType' runtime error when evaluating detections via the slightly more elaborate example below.

Code to reproduce issue

The vanilla example works fine:

import fiftyone.zoo as foz
dataset = foz.load_zoo_dataset("quickstart")  # detections pre-computed
results = dataset.evaluate_detections(
    "predictions",
    gt_field="ground_truth",
    eval_key="eval",
)
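
For reference, the results can also be sanity-checked directly in Python before opening the panel (a minimal sketch; print_report() is the standard results method):

# print a per-class precision/recall/F1 report for the evaluation
results.print_report()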

But this example fails:

import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F

model = foz.load_zoo_model('faster-rcnn-resnet50-fpn-coco-torch')
dataset = foz.load_zoo_dataset(
    "coco-2017",
    split="validation",
    dataset_name="evaluate-detections-tutorial",
)
dataset.persistent = True

predictions_view = dataset.take(100, seed=51)
predictions_view.apply_model(model, label_field="faster_rcnn")

high_conf_view = predictions_view.filter_labels("faster_rcnn", F("confidence") > 0.75, only_matches=False)
dataset.save_view("high-confidence", high_conf_view)
results = high_conf_view.evaluate_detections(
    "faster_rcnn",
    gt_field="ground_truth",
    eval_key="eval",
    compute_mAP=True,
)
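
For completeness, a minimal sketch of launching the App from the same script (continuing the repro above, so dataset is already defined; assumes a local, non-notebook session):

# open the App on the dataset so the Model Evaluation panel can be reached
session = fo.launch_app(dataset)
session.wait()  # keep the script alive while using the App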

Then open the app > Model evaluation tab > Select the "eval" evaluation:

[Screenshot: Model Evaluation panel after selecting the "eval" evaluation]

With the following stack trace in the console:

File "/home/foo/bar/.venv/lib/python3.12/site-packages/fiftyone/operators/executor.py", line 297, in execute_or_delegate_operator
    result = await do_execute_operator(operator, ctx, exhaust=exhaust)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/foo/bar/.venv/lib/python3.12/site-packages/fiftyone/operators/executor.py", line 341, in do_execute_operator
    result = await (
             ^^^^^^^
  File "/home/foo/bar/.venv/lib/python3.12/site-packages/fiftyone/core/utils.py", line 2355, in run_sync_task
    return await loop.run_in_executor(_get_sync_task_executor(), func, *args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cedric/.rye/py/[email protected]/lib/python3.12/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/foo/bar/.venv/lib/python3.12/site-packages/fiftyone/operators/panel.py", line 165, in execute
    method(ctx)
  File "/home/foo/bar/.venv/lib/python3.12/site-packages/fiftyone/operators/builtins/panels/model_evaluation/__init__.py", line 300, in load_evaluation
    metrics["tp"], metrics["fp"], metrics["fn"] = self.get_tp_fp_fn(
                                                  ^^^^^^^^^^^^^^^^^^
  File "/home/foo/bar/.venv/lib/python3.12/site-packages/fiftyone/operators/builtins/panels/model_evaluation/__init__.py", line 115, in get_tp_fp_fn
    sum(ctx.dataset.values(tp_key))
TypeError: unsupported operand type(s) for +: 'int' and 'NoneType'

This happens during the call to the "@voxel51/operators/model_evaluation_panel_builtin" operator with the "load_evaluation" method.

Given the line that failed, I checked that none of the corresponding values are None:

>>> high_conf_view.values("eval_tp")
[2, 1, 2, 5, 1, 6, 3, 3, 0, 3, 0, 4, 17, 0, 0, 0, 8, 19, 5, 2, 13, 15, 8, 3, 3, 2, 7, 1, 2, 1, 2, 2, 1, 11, 5, 2, 8, 3, 1, 1, 3, 1
, 4, 6, 3, 4, 4, 1, 7, 1, 1, 11, 15, 2, 3, 5, 7, 1, 6, 2, 17, 2, 1, 9, 3, 1, 5, 2, 1, 4, 4, 5, 1, 1, 1, 2, 3, 1, 1, 4, 3, 1, 1, 20
, 3, 3, 1, 2, 2, 3, 2, 2, 1, 5, 3, 1, 2, 2, 4, 2]
>>> high_conf_view.values("eval_fp")
[0, 2, 0, 2, 0, 3, 1, 0, 1, 2, 0, 0, 3, 0, 2, 1, 3, 3, 0, 0, 0, 0, 4, 0, 0, 0, 1, 4, 1, 0, 0, 1, 1, 1, 0, 0, 0, 2, 1, 1, 4, 0, 0,
1, 1, 0, 4, 7, 6, 1, 0, 1, 3, 0, 0, 1, 2, 1, 4, 0, 2, 2, 0, 0, 0, 2, 3, 6, 1, 0, 2, 2, 0, 1, 0, 4, 0, 1, 1, 1, 0, 0, 0, 1, 1, 3, 2
, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 4, 0]
>>> high_conf_view.values("eval_fn")
[0, 10, 0, 1, 0, 15, 1, 0, 2, 4, 1, 0, 6, 2, 11, 2, 6, 15, 2, 0, 4, 23, 8, 0, 1, 0, 0, 3, 0, 0, 2, 0, 3, 9, 1, 0, 1, 1, 1, 1, 6, 0
, 2, 3, 2, 0, 2, 5, 8, 0, 0, 11, 11, 0, 0, 0, 4, 1, 1, 0, 4, 13, 0, 2, 3, 1, 2, 7, 0, 0, 0, 3, 0, 0, 0, 8, 0, 5, 0, 0, 1, 0, 1, 26
, 6, 1, 0, 1, 1, 1, 1, 0, 1, 6, 0, 1, 0, 0, 6, 1]

(since this looks like a sum([1, None, 3]) kind of problem)
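
For reference, a plain Python session raises exactly this error as soon as a None slips into the list being summed:

>>> sum([1, None, 3])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'int' and 'NoneType'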

=> Is this a bug or a misuse? If it is the latter, please let me know what I am missing here.

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 22.04): Ubuntu 24.04 LTS
  • Python version (python --version): Python 3.12.7
  • FiftyOne version (fiftyone --version): FiftyOne v1.1.0, Voxel51, Inc.
  • FiftyOne installed from (pip or source): pip (via rye)

Other info/logs

One difference from the vanilla tutorial is that the evaluation is computed on a view. (FWIW, evaluate_classifications seems to work fine with views.)

Willingness to contribute

The FiftyOne Community encourages bug fix contributions. Would you or another
member of your organization be willing to contribute a fix for this bug to the
FiftyOne codebase?

  • Yes. I can contribute a fix for this bug independently
  • Yes. I would be willing to contribute a fix for this bug with guidance
    from the FiftyOne community
  • No. I cannot contribute a bug fix at this time
deltheil added the bug label Dec 10, 2024
@deltheil
Author

Having a look at:

tp_key = f"{key}_tp"
fp_key = f"{key}_fp"
fn_key = f"{key}_fn"
tp_total = (
    sum(ctx.dataset.values(tp_key))
    if dataset.has_field(tp_key)
    else None
)
fp_total = (
    sum(ctx.dataset.values(fp_key))
    if dataset.has_field(fp_key)
    else None
)
fn_total = (
    sum(ctx.dataset.values(fn_key))
    if dataset.has_field(fn_key)
    else None
)

Isn't it an issue to load the TP, FP, and FN values (via ctx.dataset.values(tp_key), etc.) over the whole dataset rather than just the view on which the evaluation was performed? This would explain the None values and why the vanilla example works fine.

(And these fields seem specific to the detection eval, which could explain why classification works fine, i.e. dataset.has_field(tp_key) is False, etc.)
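
A quick way to check this hypothesis (a rough sketch, reusing the dataset and saved view names from the repro above):

import fiftyone as fo

dataset = fo.load_dataset("evaluate-detections-tutorial")
high_conf_view = dataset.load_saved_view("high-confidence")

full_values = dataset.values("eval_tp")         # every sample in the dataset
view_values = high_conf_view.values("eval_tp")  # only the 100 evaluated samples

print(sum(v is None for v in full_values))  # expected: many None entries
print(sum(v is None for v in view_values))  # expected: 0
print(sum(view_values))                     # summing over the view works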

@deltheil
Author

deltheil commented Dec 10, 2024

Not sure it's correct, but a patch could be:

diff --git a/fiftyone/operators/builtins/panels/model_evaluation/__init__.py b/fiftyone/operators/builtins/panels/model_evaluation/__init__.py
index cb33082f9..fa638198d 100644
--- a/fiftyone/operators/builtins/panels/model_evaluation/__init__.py
+++ b/fiftyone/operators/builtins/panels/model_evaluation/__init__.py
@@ -104,7 +104,7 @@ class EvaluationPanel(Panel):
                 total += metrics["confidence"]
         return total / count if count > 0 else None
 
-    def get_tp_fp_fn(self, ctx):
+    def get_tp_fp_fn(self, ctx, results):
         view_state = ctx.panel.get_state("view") or {}
         key = view_state.get("key")
         dataset = ctx.dataset
@@ -112,17 +112,17 @@ class EvaluationPanel(Panel):
         fp_key = f"{key}_fp"
         fn_key = f"{key}_fn"
         tp_total = (
-            sum(ctx.dataset.values(tp_key))
+            sum(results.samples.values(tp_key))
             if dataset.has_field(tp_key)
             else None
         )
         fp_total = (
-            sum(ctx.dataset.values(fp_key))
+            sum(results.samples.values(fp_key))
             if dataset.has_field(fp_key)
             else None
         )
         fn_total = (
-            sum(ctx.dataset.values(fn_key))
+            sum(results.samples.values(fn_key))
             if dataset.has_field(fn_key)
             else None
         )
@@ -298,7 +298,7 @@ class EvaluationPanel(Panel):
                 per_class_metrics
             )
             metrics["tp"], metrics["fp"], metrics["fn"] = self.get_tp_fp_fn(
-                ctx
+                ctx, results
             )
             metrics["mAP"] = self.get_map(results)
             evaluation_data = {

cc @imanjra
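
An alternative, less invasive sketch (untested, and a different trade-off than the patch above: it keeps summing over the full dataset but treats a missing per-sample count as zero instead of switching to results.samples):

tp_total = (
    sum(v or 0 for v in ctx.dataset.values(tp_key))  # skip None entries
    if dataset.has_field(tp_key)
    else None
)
# ...and similarly for fp_total and fn_total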

brimoor mentioned this issue Dec 13, 2024
@brimoor
Contributor

brimoor commented Dec 13, 2024

@deltheil thanks for the detailed bug report! This is fixed by #5267 and will be available in fiftyone==1.2.0, which we plan to publish next week 🎉

brimoor added a commit that referenced this issue Dec 13, 2024
@daniel-bogdoll

@brimoor I can't find the time for a more detailed report so close to Christmas, but I have other (maybe related?) issues with the Model Eval Panel. I saw two problems:

  • FP/TP/... values sometimes did not reflect the actual values I extracted from the fields programmatically. In particular, they repeated across fields with similar names (like ...dino_tiny... and ...dino_base...), which is what caught my attention in the first place - it just seemed unlikely.
  • The panel sometimes did not update the eval fields; even old ones I had deleted still showed. Restarting the session did not help, and I did not know how to 'reload/restart' the panel.

Generally speaking, it's a great feature though! Love the 'generate view with 1 click' feature. If 1.2.0 doesn't fix it, I might write a proper report in January. Keep up the good work! And looking forward to comparing more than 2 models :D
