Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor the max_tracking code #1896

Open
wants to merge 11 commits into
base: develop
Choose a base branch
from

Conversation

getzze
Copy link
Contributor

@getzze getzze commented Aug 2, 2024

ONLY AFTER #1894 is merged.

I refactored the max_tracking feature introduced in #1447 and simplified the implementation. It may give different results as the goal is similar but implemented differently.

One problem I had with the current implementation is that instances cannot be assigned to created but unused tracks. Say we have max_tracks=2, and the 2 tracks are created. If a track is not assigned instances for track_window timesteps, it does not contribute to instance candidates anymore, hence instances cannot be assigned to this track anymore, forever. What was happening is that instances stopped being tracked in the middle of the video. Therefore it was unusable for me and the "pre_cull" options are not available from the GUI anymore, so I need to use the CLI.

I added an option to reassign unused tracks to unmatched instances if the maximum number of tracks was reached.

I also added an option to merge instances in the "pre_cull" stage. That is because I often have two instances from inference: front of mouse and back of mouse, that could perfectly be merged. The idea is that only "good" merges are kept by the tracking step, but it is not perfect.

Summary by CodeRabbit

  • New Features

    • Streamlined command-line documentation for tracking options enhances user clarity.
    • New configuration parameter for saving shifted instances improves tracking performance.
    • Enhanced prediction and tracking processes with backward compatibility for older Python versions.
    • Improved flexibility in tracking tests with new image scaling options.
  • Bug Fixes

    • Improved robustness of inference result handling by preventing negative counts.
  • Documentation

    • Updated user guides to reflect changes in command usage and tracking options.
  • Tests

    • Adjusted tests to align with the new streamlined tracking methods, improving clarity and performance.

Copy link

coderabbitai bot commented Aug 2, 2024

Walkthrough

The recent updates enhance the tracking capabilities and usability of the system. Key modifications include streamlined command-line usage, refined configuration parameters, improved inference logic, and better handling of tracking instances. These changes collectively aim to simplify user interactions, optimize performance, and clarify documentation, ensuring a more robust and efficient tracking experience.

Changes

Files Change Summary
docs/guides/cli.md Updated command-line examples, replacing --max_tracking with --max_tracks to simplify options.
sleap/config/pipeline_form.yaml Removed tracking.max_tracking and added tracking.save_shifted_instances for improved tracking configuration.
sleap/gui/learning/runners.py Improved handling of inference parameters and tracker name compatibility.
sleap/instance.py Introduced all_disjoint and create_merged_instances functions for better instance processing.
sleap/nn/inference.py Enhanced prediction efficiency and backward compatibility, refined data processing workflow.
sleap/nn/tracker/components.py Added merging logic to cull_frame_instances for better instance management.
sleap/nn/tracking.py Restructured candidate-making logic with new abstract base class for flexibility.
tests/nn/test_inference.py Updated tests to reflect new tracking approach and improved performance handling.
tests/nn/test_tracker_components.py Renamed functions and added parameters for dynamic image scaling in tracking tests.
tests/nn/test_tracking_integration.py Refactored tracking tests for improved encapsulation and clarity.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant CLI
    participant Tracker
    participant Inference

    User->>CLI: Execute tracking command
    CLI->>Tracker: Initialize tracker
    Tracker->>Inference: Process data
    Inference-->>Tracker: Return results
    Tracker-->>CLI: Output results
    CLI-->>User: Display tracking results
Loading

🐇 In the meadow, I hop with glee,
New changes made, oh what a spree!
Tracks refined and paths made clear,
With every hop, I shed a cheer!
Let’s bound ahead, no need to look back,
In our swift dance, we’re on the right track! 🌼✨


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share
Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

Outside diff range, codebase verification and nitpick comments (2)
sleap/nn/tracker/components.py (1)

367-369: Clarify the merging penalty description.

The description of the merging_penalty parameter should mention that it is a value between 0 and 1.

-        merging_penalty: a float between 0 and 1. All scores of the merged
+        merging_penalty: A float between 0 and 1. All scores of the merged
sleap/config/pipeline_form.yaml (1)

458-463: Clarify the help text for tracking.save_shifted_instances.

The help text should be more explicit about the impact on memory usage.

-      help: 'Save the flow-shifted instances between elapsed frames. It improves
-      instance matching at the cost of using a bit more of memory.'
+      help: 'Save the flow-shifted instances between elapsed frames. This improves
+      instance matching but increases memory usage.'
Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 5ab305c and ab24b50.

Files selected for processing (13)
  • docs/guides/cli.md (2 hunks)
  • sleap/config/pipeline_form.yaml (2 hunks)
  • sleap/gui/learning/dialog.py (1 hunks)
  • sleap/gui/learning/runners.py (6 hunks)
  • sleap/gui/widgets/video.py (2 hunks)
  • sleap/instance.py (2 hunks)
  • sleap/nn/inference.py (13 hunks)
  • sleap/nn/tracker/components.py (8 hunks)
  • sleap/nn/tracking.py (35 hunks)
  • sleap/util.py (2 hunks)
  • tests/nn/test_inference.py (8 hunks)
  • tests/nn/test_tracker_components.py (8 hunks)
  • tests/nn/test_tracking_integration.py (6 hunks)
Files skipped from review due to trivial changes (1)
  • docs/guides/cli.md
Additional context used
Ruff
sleap/util.py

38-38: Undefined name Task

(F821)

sleap/nn/tracking.py

7-7: collections.defaultdict imported but unused

Remove unused import: collections.defaultdict

(F401)


532-532: Use enumerate() for index variable n_processed in for loop

(SIM113)

tests/nn/test_inference.py

1820-1820: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)

sleap/instance.py

21-21: operator imported but unused

Remove unused import: operator

(F401)

Additional comments not posted (27)
tests/nn/test_tracking_integration.py (3)

22-22: Verify the change in expected number of tracks.

The expected number of tracks has changed from 27 to 8. Ensure this change is intentional and aligns with the updated tracking logic or input data.


Line range hint 25-34:
LGTM! Verify the function's behavior.

The function has been renamed and updated to test the max tracking feature. Ensure the function behaves as expected with the new parameters.


44-44: LGTM! Verify the function's behavior.

The functions have been updated to use the tracker.run_tracker method with verbosity set to 'none'. Ensure the functions behave as expected with the new method.

Also applies to: 51-51

tests/nn/test_tracker_components.py (3)

18-29: LGTM! Verify the function's behavior.

The function has been renamed and updated to support dynamic adjustment of image scale. Ensure the function behaves as expected with the new parameter.


36-58: LGTM! Verify the test cases.

The test function has been updated to include the new img_scale parameter. Ensure the test cases validate the behavior of the tracker with varying image scales.


Line range hint 286-298:
LGTM! Verify the test cases.

The test function has been updated to include the max_tracking parameter. Ensure the test cases validate the behavior of the tracker with the new parameter.

sleap/nn/tracker/components.py (1)

355-356: Add type hints for new parameters.

The new parameters merge_instances and merging_penalty should have type hints for better readability and type checking.

-    merge_instances: bool = False,
-    merging_penalty: float = 0.2,
+    merge_instances: bool = False,
+    merging_penalty: float = 0.2,

Likely invalid or redundant comment.

sleap/gui/learning/runners.py (1)

241-247: Ensure compatibility with all tracker names.

The compatibility logic should ensure that all tracker names are correctly handled.

Verification successful

Ensure compatibility with all tracker names.

The compatibility logic correctly handles tracker names with the "maxtracks" suffix by stripping the suffix and setting max_tracking to True. The search results confirm that the tracker names are handled consistently across the codebase.

  • sleap/gui/learning/runners.py: Lines 241-247
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify all tracker names are correctly handled.

# Test: Search for all occurrences of tracker names in the codebase. Expect: Correct handling of tracker names.
rg --type python -A 5 $'tracking.tracker'

Length of output: 71


Script:

#!/bin/bash
# Description: Verify all tracker names are correctly handled.

# Test: Search for all occurrences of tracker names in the codebase. Expect: Correct handling of tracker names.
rg --type py -A 5 $'tracking.tracker'

Length of output: 7299

sleap/gui/learning/dialog.py (1)

739-739: LGTM! But verify the function usage in the codebase.

The code changes are approved.

However, ensure that all function calls to run are tested to confirm the new logic works as expected.

sleap/nn/tracking.py (6)

118-131: Abstract base class CandidateMaker looks good.

The abstract method get_candidates enforces a consistent interface for candidate makers.


Line range hint 390-405:
Class SimpleCandidateMaker implementation looks good.

The method for generating candidates from prior frames is well-defined.

Tools
Ruff

408-408: Local variable ref_t is assigned to but never used

Remove assignment to unused variable ref_t

(F841)


Line range hint 621-892:
Class Tracker implementation looks good.

The new attributes and methods enhance the control over the tracking process.

However, ensure that all methods are thoroughly tested to confirm the new logic works as expected.

Verification successful

The previous shell script failed because rg did not recognize the python file type. I will modify the script to use the correct file type for Python files and rerun the search for Tracker class usage in the codebase.


Ensure all Tracker methods are thoroughly tested.

The Tracker class and its methods are used extensively across multiple test files, including tests/nn/test_tracking_integration.py, tests/nn/test_tracker_components.py, and tests/nn/test_inference.py. These tests cover various aspects of the Tracker functionality, ensuring that the new logic is well-tested.

  • Test files:
    • tests/nn/test_tracking_integration.py
    • tests/nn/test_tracker_components.py
    • tests/nn/test_inference.py

Please verify that these tests are comprehensive and cover all new functionality introduced in the Tracker class.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify all method calls to `Tracker` in the codebase.

# Test: Search for the class usage. Expect: Only occurrences of the new logic.
rg --type python -A 5 $'Tracker'

Length of output: 62


Script:

#!/bin/bash
# Description: Verify all method calls to `Tracker` in the codebase.

# Test: Search for the class usage. Expect: Only occurrences of the new logic.
rg --type py -A 5 Tracker

Length of output: 30182


Line range hint 133-384:
Class FlowCandidateMaker implementation looks good.

The methods for generating and managing shifted instances are well-defined.

However, ensure that all methods are thoroughly tested to confirm the new logic works as expected.

Verification successful

Class FlowCandidateMaker implementation looks good.

The methods for generating and managing shifted instances are well-defined and the class is being tested in tests/nn/test_inference.py.

  • sleap/nn/tracking.py
  • tests/nn/test_inference.py
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify all method calls to `FlowCandidateMaker` in the codebase.

# Test: Search for the class usage. Expect: Only occurrences of the new logic.
rg --type python -A 5 $'FlowCandidateMaker'

Length of output: 73


Script:

#!/bin/bash
# Description: Verify all method calls to `FlowCandidateMaker` in the codebase.

# Test: Search for the class usage. Expect: Only occurrences of the new logic.
rg --type py -A 5 $'FlowCandidateMaker'

Length of output: 2364


438-572: Class BaseTracker implementation looks good.

The new attributes and methods enhance the functionality and flexibility of the tracking process.

However, ensure that all methods are thoroughly tested to confirm the new logic works as expected.

Tools
Ruff

532-532: Use enumerate() for index variable n_processed in for loop

(SIM113)


1425-1430: Class KalmanTracker implementation looks good.

The new attributes and methods enhance the robustness and flexibility of the tracking process.

However, ensure that all methods are thoroughly tested to confirm the new logic works as expected.

tests/nn/test_inference.py (4)

Line range hint 1273-1315:
LGTM!

The function test_topdown_predictor_save is well-structured and covers various aspects of the model saving process. The use of @pytest.mark.slow() is appropriate for this test.


Line range hint 1316-1360:
LGTM!

The function test_topdown_id_predictor_save is well-structured and comprehensive in testing various aspects of the model saving process. The use of @pytest.mark.slow() is appropriate for this test.


Line range hint 1784-1829:
LGTM!

The function test_max_tracks_matching_queue is well-structured and covers the necessary checks for maximum tracks instance generation. The use of parameterization is appropriate for this test.

Tools
Ruff

1820-1820: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


Line range hint 1851-1873:
LGTM!

The function test_movenet_predictor is well-structured and covers various aspects of the MoveNetPredictor model. The use of @pytest.mark.slow() is appropriate for this test.

sleap/instance.py (1)

1183-1184: LGTM!

The all_disjoint function is well-implemented and correctly checks for disjoint sequences.

sleap/gui/widgets/video.py (2)

807-808: Initialization of _down_pos

The addition of _down_pos initialized to None is a good practice to ensure it is defined before being used in other methods.


1032-1032: Safeguard for _down_pos

The added check to ensure _down_pos is not None before comparing it with the current mouse position is a robust safeguard against potential errors.

sleap/nn/inference.py (5)

48-51: Ensure backward compatibility with Python versions prior to 3.8.

The code correctly defines cached_property as a standard property when the version check fails, ensuring backward compatibility.


163-165: Correct use of cached_property for report_period.

The report_period method is correctly optimized with cached_property to cache its value after the first computation.


414-416: Performance improvement by pre-compiling loop examples.

Storing the dataset in examples before entering the loop prevents repeated calls to make_dataset(), enhancing performance.


4917-4919: Conditional tracker initialization logic.

The code correctly checks if tracker_max_instances is greater than zero before setting the tracker, ensuring that the tracker is only created when necessary.


5339-5341: Correct handling of labeled frames.

The function correctly creates a LabelsReader for labeled frames, ensuring only labeled frames are processed.

sleap/instance.py Outdated Show resolved Hide resolved
tests/nn/test_inference.py Outdated Show resolved Hide resolved
sleap/instance.py Outdated Show resolved Hide resolved
sleap/nn/tracker/components.py Outdated Show resolved Hide resolved
sleap/util.py Outdated Show resolved Hide resolved
sleap/gui/learning/runners.py Outdated Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Outside diff range, codebase verification and nitpick comments (15)
tests/nn/test_tracking_integration.py (3)

25-28: Approve changes and suggest documentation updates.

The function has been renamed for clarity, and the CLI parameters have been updated to reflect the new tracking configuration.

Consider updating the related documentation to reflect these changes:

  1. The renaming of the test function from test_simplemax_tracker to test_simple_max_tracks.
  2. The change in the tracker parameter from "simplemaxtracks" to "simple".
  3. The addition of the new "--tracking.max_tracks 2" parameter.

This will ensure that the documentation remains in sync with the code changes.


110-116: Approve simplification of make_tracker function.

The make_tracker function has been simplified by removing the max_tracking parameter and related conditional logic. The max_tracks parameter is now directly passed to the tracker constructor, which aligns with the changes observed in the test functions.

Consider updating any related documentation or comments that might reference the old max_tracking parameter or the removed conditional logic. This will ensure that the documentation accurately reflects the current implementation.


44-44: Complete the transition to tracker.run_tracker across the codebase.

The codebase still contains instances of the run_tracker function definition and calls, indicating that the transition to using tracker.run_tracker is incomplete. To maintain consistency and adhere to object-oriented principles, please update all occurrences of run_tracker to tracker.run_tracker.

  • run_tracker function is defined in sleap/nn/tracking.py.
  • Calls to run_tracker are found in:
    • tests/nn/test_tracking_integration.py
    • tests/nn/test_tracker_components.py
    • sleap/nn/inference.py
    • sleap/nn/tracking.py
Analysis chain

Approve the change to use tracker.run_tracker.

The modification to use tracker.run_tracker instead of a separate run_tracker function improves encapsulation and follows object-oriented principles.

To ensure consistency across the codebase, please run the following script to check for any remaining instances of the old run_tracker function:

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for remaining instances of the old run_tracker function.

# Test: Search for remaining instances of run_tracker function. Expect: No results.
rg --type python 'def run_tracker\(' 

# Test: Search for calls to run_tracker function. Expect: No results.
rg --type python 'run_tracker\('

# Test: Search for new tracker.run_tracker calls. Expect: Multiple results.
rg --type python 'tracker\.run_tracker\('

Length of output: 210


Script:

#!/bin/bash
# Description: Check for remaining instances of the old run_tracker function.

# Test: Search for remaining instances of run_tracker function. Expect: No results.
rg 'def run_tracker\(' --glob '*.py'

# Test: Search for calls to run_tracker function. Expect: No results.
rg 'run_tracker\(' --glob '*.py'

# Test: Search for new tracker.run_tracker calls. Expect: Multiple results.
rg 'tracker\.run_tracker\(' --glob '*.py'

Length of output: 980

tests/nn/test_tracker_components.py (4)

18-32: Approve changes to run_tracker_by_name function with a minor suggestion.

The renaming and addition of the img_scale parameter improve the function's flexibility. The new logic for handling img_scale is well-implemented.

Consider raising a more specific exception instead of using pytest.skip when img_scale is not defined for the tracker. This would provide clearer error information outside of the testing context. For example:

if img_scale:
    if hasattr(t, "candidate_maker") and hasattr(t.candidate_maker, "img_scale"):
        t.candidate_maker.img_scale = img_scale
    else:
        raise ValueError("img_scale is not defined for this tracker")

36-40: Approve changes to test parameters with suggestions for additional test cases.

The updates to the tracker options and the addition of the img_scale parameter align well with the changes made to the run_tracker_by_name function.

Consider adding more diverse test cases for img_scale to ensure robust testing of the scaling feature. For example:

@pytest.mark.parametrize("img_scale", [0, 1, 0.25, 0.5, 2])

This would test the behavior with no scaling (0), normal scale (1), downscaling (0.25, 0.5), and upscaling (2).


Line range hint 248-285: Approve changes to test_max_tracks_large_gap_single_track with a suggestion.

The function renaming and the modification of the max_tracks parameter improve the test coverage by checking behavior both with and without track limits.

Consider adding a comment to explain the significance of setting max_tracks=-1. For example:

# Test with unlimited tracks
tracker = Tracker.make_tracker_by_name(
    tracker="simple",
    match="hungarian",
    track_window=2,
    max_tracks=-1,  # Allows unlimited tracks
)

# ... (existing code) ...

# Test with limited tracks
tracker = Tracker.make_tracker_by_name(
    tracker="simple",
    match="hungarian",
    track_window=2,
    max_tracks=2,  # Limits the number of tracks to 2
)

This would enhance the readability and clarify the purpose of each test case.

Also applies to: 297-300


Line range hint 312-345: Approve changes to test_max_tracks_small_gap_on_both_tracks and test_max_tracks_extra_detections with a suggestion.

The function renamings and modifications to the max_tracks parameter are consistent with the changes made to test_max_tracks_large_gap_single_track. These changes improve test coverage by checking behavior both with and without track limits.

For consistency and clarity, consider adding comments similar to those suggested for test_max_tracks_large_gap_single_track to explain the significance of max_tracks=-1 and max_tracks=2 in both of these functions as well.

Also applies to: 357-360, 372-410, 422-425

docs/guides/cli.md (2)

127-127: LGTM! Consider clarifying the description of --max_tracks.

The changes align with the PR objectives to simplify the max_tracking feature. The removal of --max_tracking and the addition of --max_tracks streamline the command usage.

Consider updating the description of --max_tracks to provide more context:

- Maximum number of tracks to be tracked by the tracker. No limit if None or -1. (default: None)
+ Maximum number of tracks to be maintained by the tracker. Unlimited if set to None or -1. (default: None)

This change clarifies that the parameter sets a limit on the number of maintained tracks, which may help users better understand its purpose.

Also applies to: 188-188


262-262: LGTM! Consider adding a brief explanation to the examples.

The examples have been correctly updated to use --tracking.max_tracks instead of --max_tracking, maintaining consistency with the changes in the command-line arguments.

Consider adding a brief explanation to examples 5 and 6 to clarify the purpose of the --tracking.max_tracks argument:

- sleap-track -m "models/my_model" --tracking.tracker simple --tracking.max_tracks 4 -o "output_predictions.slp" "input_video.mp4"
+ sleap-track -m "models/my_model" --tracking.tracker simple --tracking.max_tracks 4 -o "output_predictions.slp" "input_video.mp4"
+ # This example limits the tracker to maintain a maximum of 4 tracks

- sleap-track --tracking.tracker simple --tracking.max_tracks 4 -o "retracked.slp" "input_predictions.slp"
+ sleap-track --tracking.tracker simple --tracking.max_tracks 4 -o "retracked.slp" "input_predictions.slp"
+ # This example re-tracks existing predictions, limiting the tracker to 4 tracks

These brief explanations will help users understand the purpose and effect of the --tracking.max_tracks argument in different scenarios.

Also applies to: 268-268

sleap/nn/tracker/components.py (2)

355-356: Updated function signature with new parameters.

The cull_frame_instances function now includes two new parameters: merge_instances and merging_penalty. This change aligns with the PR objectives to introduce an option for merging instances.

However, consider adding type hints for these new parameters to maintain consistency with the existing type hints:

def cull_frame_instances(
    instances_list: List[InstanceType],
    instance_count: int,
    iou_threshold: Optional[float] = None,
-   merge_instances: bool = False,
-   merging_penalty: float = 0.2,
+   merge_instances: bool = False,
+   merging_penalty: float = 0.2,
) -> List["LabeledFrame"]:

Also, update the function's docstring to include descriptions for these new parameters.

Also applies to: 367-369


377-384: New merging functionality added.

The new code block implements the merging of instances when merge_instances is True. This aligns with the PR objectives and enhances the functionality of the cull_frame_instances function.

Consider adding error handling for potential exceptions that might be raised by create_merged_instances. Also, you might want to add a debug log to show how many instances were merged:

if merge_instances:
    logger.info("Merging instances with penalty: %f", merging_penalty)
+   original_count = len(instances_list)
    merged_instances = create_merged_instances(
        instances_list, penalty=merging_penalty
    )
    instances_list.extend(merged_instances)
+   logger.debug("Merged instances: original=%d, after_merge=%d", original_count, len(instances_list))
sleap/gui/learning/runners.py (1)

256-261: New tracking options added to boolean-to-integer conversion list

The addition of new tracking options to the bool_items_as_ints list is approved. These changes align with the PR objectives of enhancing tracking capabilities.

Consider grouping related items together for improved readability. For example:

 bool_items_as_ints = (
     "tracking.pre_cull_to_target",
     "tracking.pre_cull_merge_instances",
     "tracking.post_connect_single_breaks",
     "tracking.save_shifted_instances",
     "tracking.oks_score_weighting",
     "tracking.prefer_reassigning_track",
     "tracking.allow_reassigning_track",
 )
sleap/nn/tracking.py (1)

Line range hint 4-130: LGTM: New imports and CandidateMaker abstract base class.

The new imports and the introduction of the CandidateMaker abstract base class improve the overall structure and extensibility of the code. This change allows for different implementations of candidate makers to inherit from a common interface.

Consider removing the unused import of defaultdict from collections as flagged by the static analysis tool:

-from collections import deque, defaultdict
+from collections import deque
tests/nn/test_inference.py (1)

Line range hint 1783-1825: LGTM: Good extension of test coverage.

The changes to test_max_tracks_matching_queue improve the test by covering both 'flow' and 'simple' trackers. The conditional checks and generalized assertions make the test more robust and comprehensive.

Consider renaming the 'trackername' variable to 'tracker_name' for better readability and consistency with Python naming conventions.

sleap/instance.py (1)

1182-1183: Approve implementation with a minor suggestion for readability.

The all_disjoint function is implemented correctly and efficiently. It uses a list comprehension with all() and set().isdisjoint() to check if all pairs of sequences are disjoint.

Consider adding a type hint for the return value and using tuple unpacking for slightly improved readability:

-def all_disjoint(x: Sequence[Sequence]) -> bool:
-    return all((set(p0).isdisjoint(set(p1))) for p0, p1 in combinations(x, 2))
+def all_disjoint(x: Sequence[Sequence]) -> bool:
+    return all(set(p0).isdisjoint(set(p1)) for p0, p1 in combinations(x, 2))
Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between ab24b50 and c656bb8.

Files selected for processing (11)
  • docs/guides/cli.md (4 hunks)
  • sleap/config/pipeline_form.yaml (3 hunks)
  • sleap/gui/learning/runners.py (6 hunks)
  • sleap/instance.py (2 hunks)
  • sleap/nn/inference.py (13 hunks)
  • sleap/nn/tracker/components.py (8 hunks)
  • sleap/nn/tracking.py (35 hunks)
  • sleap/util.py (3 hunks)
  • tests/nn/test_inference.py (8 hunks)
  • tests/nn/test_tracker_components.py (11 hunks)
  • tests/nn/test_tracking_integration.py (6 hunks)
Additional context used
Ruff
sleap/nn/tracking.py

7-7: collections.defaultdict imported but unused

Remove unused import: collections.defaultdict

(F401)


532-532: Use enumerate() for index variable n_processed in for loop

(SIM113)

Additional comments not posted (43)
tests/nn/test_tracking_integration.py (2)

51-51: Approve consistent use of tracker.run_tracker.

The change to use tracker.run_tracker is consistent with the modification in the make_ground_truth function, reinforcing the object-oriented approach.

This change maintains consistency across the codebase and improves the overall design.


22-22: Verify the change in the number of expected tracks.

The assertion for the number of tracks has been changed from 27 to 8. This significant reduction in the number of tracks suggests a major change in the tracking algorithm's behavior or output.

Please confirm if this change is intentional and aligns with the expected behavior of the refactored tracking algorithm. Consider running the following script to verify the change:

tests/nn/test_tracker_components.py (1)

47-58: Approve changes to test_tracker_by_name function.

The addition of the img_scale parameter to the function signature and its inclusion in the run_tracker_by_name function call correctly implement the new scaling feature in the test.

sleap/util.py (2)

17-17: LGTM: New import for rich.progress

The addition of import rich.progress is appropriate for the new RateColumn class.


Line range hint 1-44: Overall assessment: Good addition to the utility module

The new RateColumn class enhances the progress reporting capabilities of the module. The implementation is clean and follows best practices.

To ensure that the RateColumn class is used correctly throughout the project, please run the following script:

This will help verify the integration of the new class and identify any potential areas where it could be utilized.

Verification successful

RateColumn Integration Verified

The RateColumn class is successfully integrated into the project, with multiple usages found. There are no remaining instances of ProgressColumn that require replacement with RateColumn. The implementation is complete and correct.

  • Instances of RateColumn are found in the project, confirming its usage.
  • No additional ProgressColumn instances need replacement.
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for usage of RateColumn class in the project

# Search for RateColumn usage
echo "Searching for RateColumn usage:"
rg --type python "RateColumn" -C 3

# Check if there are any remaining uses of ProgressColumn that could be replaced with RateColumn
echo "\nChecking for potential RateColumn replacement opportunities:"
rg --type python "ProgressColumn" -C 3

Length of output: 343


Script:

#!/bin/bash
# Description: Check for usage of RateColumn class in the project

# Find all Python files and search for RateColumn usage
echo "Searching for RateColumn usage:"
fd -e py --exec rg "RateColumn" -C 3

# Find all Python files and check if there are any remaining uses of ProgressColumn that could be replaced with RateColumn
echo "\nChecking for potential RateColumn replacement opportunities:"
fd -e py --exec rg "ProgressColumn" -C 3

Length of output: 1778

sleap/config/pipeline_form.yaml (3)

458-463: New parameter added for saving shifted instances

A new configuration option tracking.save_shifted_instances has been introduced. This parameter allows saving flow-shifted instances between frames, which can improve instance matching at the cost of increased memory usage.

The addition of this parameter provides users with more control over the tracking process and memory usage. It's a good trade-off between performance and resource utilization.


Line range hint 1-463: Cleanup of commented-out sections

Several commented-out sections related to tracking parameters have been removed from the file.

This cleanup improves the readability of the configuration file by removing unused or outdated options. It's a good practice to keep configuration files clean and up-to-date.


Line range hint 1-463: Summary of changes

The modifications to this file enhance the tracking capabilities of the system by introducing a new option to save flow-shifted instances. This addition allows for improved instance matching at the cost of slightly increased memory usage. Additionally, the removal of commented-out sections contributes to a cleaner, more maintainable configuration file.

These changes appear to be well-considered and align with the PR objectives of refactoring and improving the max_tracking code. The new option provides users with more control over the tracking process, potentially addressing some of the issues mentioned in the PR summary.

sleap/nn/tracker/components.py (2)

27-27: New import for merging instances functionality.

The import of create_merged_instances from sleap.instance is correctly added to support the new merging functionality in cull_frame_instances.


404-407: Minor formatting changes in sorting extra instances.

The formatting changes improve code readability by breaking the sorted function call into multiple lines. This is a good practice for maintaining clean and readable code.

sleap/gui/learning/runners.py (3)

241-246: Improved compatibility for tracker names with "maxtracks" suffix

This change enhances the flexibility of the system by allowing compatibility with tracker names that include the "maxtracks" suffix. It aligns well with the PR objective of simplifying the max_tracking feature implementation.


Line range hint 1-1000: Overall assessment of changes in runners.py

The modifications in this file align well with the PR objectives of simplifying the max_tracking feature and improving tracking functionality. Key improvements include:

  1. Enhanced compatibility for tracker names with "maxtracks" suffix.
  2. Addition of new tracking options for more flexible control.
  3. Simplification of the max_tracking feature by removing it from the boolean-to-integer conversion list.

These changes should contribute to a more robust and flexible tracking system. The code quality is maintained, and the modifications are implemented cleanly.

To ensure the changes don't introduce any regressions, consider running the existing test suite, particularly focusing on tests related to tracking functionality.


256-261: Removal of "tracking.max_tracking" from boolean-to-integer conversion list

The removal of "tracking.max_tracking" from the bool_items_as_ints list is approved. This change is consistent with the PR's objective of simplifying the max_tracking feature implementation.

To ensure this change doesn't introduce any unintended side effects, please run the following script to check for any remaining references to "tracking.max_tracking" in the codebase:

If any results are found, please review them to ensure that the removal of "tracking.max_tracking" from bool_items_as_ints doesn't cause any issues in other parts of the codebase.

Verification successful

No remaining references to "tracking.max_tracking" found

The removal of "tracking.max_tracking" from the bool_items_as_ints list is confirmed to be safe. There are no remaining references to "tracking.max_tracking" in the codebase, ensuring that this change does not impact other parts of the code.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for remaining references to "tracking.max_tracking"

# Search for "tracking.max_tracking" in Python files
echo "Searching for 'tracking.max_tracking' references:"
rg --type python "tracking\.max_tracking"

# Check if "tracking.max_tracking" is still used in function calls or assignments
echo "Checking for 'tracking.max_tracking' in function calls or assignments:"
ast-grep --lang python --pattern '$_($$$, tracking.max_tracking: $_, $$$)'
ast-grep --lang python --pattern '$_ = $$$tracking.max_tracking$$$'

Length of output: 493


Script:

#!/bin/bash
# Description: Check for remaining references to "tracking.max_tracking"

# Corrected search for "tracking.max_tracking" in all files
echo "Searching for 'tracking.max_tracking' references:"
rg "tracking\.max_tracking"

# Check if "tracking.max_tracking" is still used in function calls or assignments
echo "Checking for 'tracking.max_tracking' in function calls or assignments:"
ast-grep --lang python --pattern '$_($$$, tracking.max_tracking: $_, $$$)'
ast-grep --lang python --pattern '$_ = $$$tracking.max_tracking$$$'

Length of output: 448

sleap/nn/tracking.py (11)

Line range hint 133-384: LGTM: FlowCandidateMaker updates.

The FlowCandidateMaker class now properly inherits from the CandidateMaker abstract base class, improving the overall structure of the code. The addition of the save_shifted_instances attribute provides more control over memory usage.

The updated get_candidates method now handles the logic for retrieving shifted instances from earlier times and pruning old shifted instances, which should improve performance and memory management.


Line range hint 390-413: LGTM: SimpleCandidateMaker updates.

The SimpleCandidateMaker class now correctly inherits from the CandidateMaker abstract base class, maintaining consistency with the new structure. The updated get_candidates method is more streamlined and focused on its core functionality.

Tools
Ruff

408-408: Local variable ref_t is assigned to but never used

Remove assignment to unused variable ref_t

(F841)


Line range hint 621-650: LGTM: Enhanced Tracker class attributes.

The addition of new attributes to the Tracker class improves its configurability and reporting capabilities:

  1. verbosity and report_rate allow for more flexible progress reporting.
  2. found_tracks helps in tracking the last time an instance was found for each track.
  3. max_tracks provides the ability to limit the number of tracks, which can be useful for performance or application-specific requirements.
  4. prefer_reassigning_track and allow_reassigning_track offer more control over track assignment strategies.

These changes should make the Tracker more versatile and easier to use in various scenarios.


Line range hint 692-892: LGTM: Improved Tracker class methods.

The changes to the Tracker class methods significantly enhance its functionality and code organization:

  1. The new infer_next_timestep method centralizes the logic for determining the current timestep, reducing code duplication.
  2. The refactored track method is more organized and easier to follow, with clear separation of concerns.
  3. New methods spawn_new_track and assign_to_track provide more granular control over track management.
  4. The updated spawn_for_untracked_instances method incorporates the new track assignment strategies, making it more flexible.

These improvements should make the Tracker class more maintainable and easier to extend in the future.


Line range hint 958-1028: LGTM: Enhanced Tracker.make_tracker_by_name method.

The updates to the Tracker.make_tracker_by_name method improve its configurability and align with the new functionality added to the Tracker class:

  1. New parameters max_tracks, prefer_reassigning_track, and allow_reassigning_track provide more control over the tracking process.
  2. The addition of pre_cull_merge_instances and pre_cull_merging_penalty options allows for more sophisticated instance culling strategies.
  3. Updated progress reporting options (verbosity and report_rate) improve the usability of the tracker in different environments.

These changes make the tracker more versatile and easier to customize for specific use cases.


Line range hint 1059-1163: LGTM: Updated Tracker.get_by_name_factory_options method.

The changes to the Tracker.get_by_name_factory_options method improve the documentation and configurability of the tracker:

  1. New options for max_tracks, pre_cull_merge_instances, pre_cull_merging_penalty, allow_reassigning_track, and prefer_reassigning_track have been added, reflecting the new functionality in the Tracker class.
  2. The help text for existing options has been updated to provide more detailed information.

These updates make it easier for users to understand and configure the tracker according to their specific needs.


1265-1279: LGTM: Improved CLI argument handling in Tracker.add_cli_parser_args.

The changes to the Tracker.add_cli_parser_args method enhance the CLI argument handling:

  1. The 'tracker' argument is now handled separately from other arguments.
  2. No default value is set for the 'tracker' argument, which allows for better detection of malformed command lines.

This improvement should make it easier to identify and report issues with user-provided CLI arguments, leading to a better user experience.


1421-1426: LGTM: Consistent progress reporting in KalmanTracker.

The addition of verbosity and report_rate attributes to the KalmanTracker class aligns it with the progress reporting capabilities of the Tracker class. This consistency improves the overall user experience and makes the codebase more maintainable.


1643-1643: LGTM: Simplified retrack function with improved progress reporting.

The retrack function now uses the run_tracker method of the tracker object for progress reporting. This change:

  1. Simplifies the retrack function by removing redundant progress reporting code.
  2. Ensures consistency in progress reporting across different parts of the codebase.
  3. Leverages the new progress reporting capabilities built into the Tracker class.

This improvement demonstrates good code reuse and maintains consistency throughout the tracking system.


532-532: Static analysis suggestion not applicable in this context.

The static analysis tool suggests using enumerate() for the index variable n_processed in the for loop. However, in this specific case, using enumerate() may not be beneficial:

  1. The n_processed variable is used for more than just counting iterations. It's part of the progress tracking and reporting logic.
  2. The loop doesn't directly use the items from the frames iterable, making enumerate() less relevant.
  3. The current implementation allows for more flexibility in how n_processed is incremented and used within the loop.

Given these factors, the current implementation is appropriate for its purpose, and changing it to use enumerate() might make the code less clear or flexible.

Tools
Ruff

532-532: Use enumerate() for index variable n_processed in for loop

(SIM113)


Line range hint 1-1664: Overall improvements to the tracking system.

The changes made to sleap/nn/tracking.py significantly enhance the tracking system:

  1. Introduction of the CandidateMaker abstract base class improves code organization and extensibility.
  2. Updates to FlowCandidateMaker and SimpleCandidateMaker classes align with the new abstract base class structure.
  3. Enhanced Tracker class with new attributes and methods provides more flexibility and control over the tracking process.
  4. Improved progress reporting capabilities across different parts of the system.
  5. Better CLI argument handling for improved user experience.
  6. Consistent updates to related classes like KalmanTracker maintain overall system coherence.

These changes should make the tracking system more robust, flexible, and easier to maintain and extend in the future.

Tools
Ruff

532-532: Use enumerate() for index variable n_processed in for loop

(SIM113)

tests/nn/test_inference.py (3)

1273-1273: LGTM: Good use of @pytest.mark.slow() decorator.

The addition of the @pytest.mark.slow() decorator is a good practice for identifying tests that may take longer to execute. This allows for better management of test suites and the ability to selectively run faster tests when needed.


1316-1316: LGTM: Consistent use of @pytest.mark.slow() decorator.

The addition of the @pytest.mark.slow() decorator to test_topdown_id_predictor_save is consistent with the previous change and follows good testing practices for managing test execution times.


1848-1848: LGTM: Consistent use of @pytest.mark.slow() decorator.

The addition of the @pytest.mark.slow() decorator to test_movenet_predictor is consistent with the previous changes and continues to follow good testing practices for managing test execution times.

sleap/instance.py (6)

1186-1189: Function signature and docstring look good.

The function signature and docstring are well-defined. The use of type hints is appreciated.


1202-1212: Ensure same skeleton and track for all instances.

Good practice to check for consistency in skeleton and track across all instances before proceeding with the merge.


1214-1223: Efficient generation of instance subsets.

The use of combinations and chain.from_iterable for generating instance subsets is efficient. The all_disjoint function is used correctly to ensure non-intersecting visible nodes.


1228-1235: Score calculation and penalty application.

The score calculation and penalty application look correct. Good use of the reduce function for calculating the instance score.


1237-1245: Creation and appending of merged instances.

The creation of PredictedInstance objects and appending them to the merged_instances list is implemented correctly.


1182-1246: Well-integrated new functionality.

The new functions all_disjoint and create_merged_instances are well-integrated into the existing codebase. They provide new functionality without disrupting the existing code structure.

To ensure that these new functions don't introduce any unintended side effects, it would be beneficial to add unit tests. Here's a script to check for existing tests:

sleap/nn/inference.py (10)

48-52: Conditional import of cached_property based on Python version

This change adds compatibility for Python versions prior to 3.8 by conditionally importing cached_property. If the Python version is 3.8 or higher, it's imported from functools; otherwise, it falls back to using the standard property decorator.

This is a good practice for maintaining backwards compatibility while still leveraging newer features when available.


163-166: Replace property decorator with cached_property for report_period

The report_period method in the Predictor class now uses the cached_property decorator instead of the standard property decorator.

This change can improve performance by caching the result of report_period after the first access, which is beneficial if this method is called frequently and the computation is expensive.


414-416: Pre-compile loop examples before starting time measurement

The examples variable is now created before the time measurement and progress reporting logic. This change helps to provide a more accurate ETA by excluding the time taken to compile the dataset from the progress calculation.

This modification improves the accuracy of the progress reporting by separating the dataset compilation time from the actual processing time.


432-433: Simplify loop iteration over examples

The loop now directly iterates over examples instead of using self.pipeline.make_dataset().

This change is consistent with the earlier modification where examples was pre-compiled. It simplifies the loop and avoids repeated calls to make_dataset().


452-453: Simplify loop iteration over examples in JSON verbosity mode

Similar to the previous change, the loop in the JSON verbosity mode now directly iterates over examples.

This change is consistent with the earlier modifications and maintains consistency across different verbosity modes.


489-490: Simplify loop iteration over examples in non-verbose mode

The loop in the non-verbose mode also now directly iterates over examples.

This change maintains consistency with the earlier modifications across all verbosity modes.


5335-5337: Update LabelsReader initialization in _make_provider_from_cli function

The LabelsReader is now initialized with an example_indices parameter, which is set to the result of frame_list(args.frames).

This change allows for more flexible frame selection when creating the LabelsReader. It uses the frame_list function to parse the frames argument, which can improve usability by allowing users to specify frames in various formats (e.g., comma-separated list, range).


5431-5434: Update Tracker initialization in _make_tracker_from_cli function

The Tracker.make_tracker_by_name method is now called with an additional progress_reporting parameter set to args.verbosity.

This change allows the tracker to use the same verbosity setting as the rest of the application, which can lead to more consistent progress reporting across different components.


5593-5596: Optimize frame filtering in main function

When provider.example_indices is not None, the code now converts the indices to a set for O(1) search time, and uses a list comprehension to filter the frames.

This optimization can significantly improve performance when dealing with large numbers of frames, as it reduces the time complexity of the filtering operation from O(n^2) to O(n).


5599-5599: Update tracker execution in main function

The tracker.run_tracker method is now called with a frames parameter instead of separate video and frames parameters.

This change simplifies the tracker execution by passing all necessary information in the frames parameter. It's likely that the run_tracker method has been updated to work with this new signature.

tests/nn/test_tracking_integration.py Outdated Show resolved Hide resolved
sleap/util.py Outdated Show resolved Hide resolved
sleap/instance.py Outdated Show resolved Hide resolved
@getzze getzze force-pushed the simplify-max-tracks branch from c656bb8 to bb67467 Compare September 4, 2024 16:14
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between c656bb8 and bb67467.

Files selected for processing (13)
  • docs/guides/cli.md (4 hunks)
  • sleap/config/pipeline_form.yaml (3 hunks)
  • sleap/gui/learning/dialog.py (1 hunks)
  • sleap/gui/learning/runners.py (6 hunks)
  • sleap/gui/widgets/video.py (2 hunks)
  • sleap/instance.py (2 hunks)
  • sleap/nn/inference.py (12 hunks)
  • sleap/nn/tracker/components.py (8 hunks)
  • sleap/nn/tracking.py (33 hunks)
  • sleap/util.py (2 hunks)
  • tests/nn/test_inference.py (8 hunks)
  • tests/nn/test_tracker_components.py (11 hunks)
  • tests/nn/test_tracking_integration.py (6 hunks)
Files skipped from review as they are similar to previous changes (7)
  • sleap/config/pipeline_form.yaml
  • sleap/gui/learning/runners.py
  • sleap/nn/inference.py
  • sleap/nn/tracker/components.py
  • sleap/util.py
  • tests/nn/test_inference.py
  • tests/nn/test_tracker_components.py
Additional context used
Ruff
sleap/nn/tracking.py

486-486: Use enumerate() for index variable n_processed in for loop

(SIM113)

sleap/instance.py

28-28: typing.ForwardRef imported but unused

Remove unused import: typing.ForwardRef

(F401)

Additional comments not posted (18)
tests/nn/test_tracking_integration.py (4)

22-22: Clarify the reason for the change in the expected number of tracks.

The assertion has been updated to check if the length of labels.tracks is equal to 8 instead of 27. What is the reason for this change? Is it due to modifications in the tracking algorithm or configuration parameters?


25-28: Approve the function name change for improved clarity.

The function name change from test_simplemax_tracker to test_simple_max_tracks improves clarity by explicitly mentioning max_tracks.

Clarify the reason for the changes in the CLI command.

The CLI command within the test has been modified:

  • The tracker has been changed from simplemaxtracks to simple.
  • The max_tracks parameter has been added and set to 2.

What is the reason for these changes? Do they reflect modifications in the tracking algorithm being tested?


45-45: LGTM: Approve calling run_tracker directly on the tracker object.

The change to call run_tracker directly on the tracker object improves code organization and maintainability by encapsulating the tracking logic within the tracker instance.


110-116: LGTM: Approve simplifying the make_tracker function.

The changes to the make_tracker function, including:

  • Removing the max_tracking parameter from the function signature.
  • Directly passing the max_tracks parameter to the tracker constructor.

These changes improve code readability and maintainability by simplifying the function and removing conditional logic based on max_tracking.

docs/guides/cli.md (2)

127-127: LGTM!

The removal of the --max_tracking argument simplifies the tracking options and the documentation has been updated accordingly.


191-191: LGTM!

The documentation now emphasizes the --max_tracks argument and the examples have been updated accordingly. The changes are consistent with the removal of the --max_tracking argument.

Also applies to: 265-271

sleap/gui/learning/dialog.py (1)

736-736: LGTM!

The code change ensures that no_result_count is always non-negative by using max(). This improves the robustness of the code and aligns with the AI-generated summary.

sleap/nn/tracking.py (7)

118-131: LGTM!

The changes to make CandidateMaker an abstract base class are good architectural improvements. It enforces a common interface for all candidate makers.


Line range hint 133-388: LGTM!

The changes to FlowCandidateMaker look good. Inheriting from CandidateMaker ensures it implements the required interface, and the updated get_candidates method is more streamlined.


Line range hint 390-414: LGTM!

The changes to SimpleCandidateMaker look good. Inheriting from CandidateMaker ensures it implements the required interface.

Tools
Ruff

408-408: Local variable ref_t is assigned to but never used

Remove assignment to unused variable ref_t

(F841)


438-580: LGTM!

The changes to BaseTracker look good. The new verbosity and report_rate attributes allow configuring the progress reporting, and the updated run_tracker method handles the new verbosity option correctly.

Tools
Ruff

486-486: Use enumerate() for index variable n_processed in for loop

(SIM113)


Line range hint 629-1293: LGTM!

The changes to Tracker look good overall:

  • The new attributes allow for more flexible configuration of the tracking behavior.
  • The infer_next_timestep method reduces code duplication by centralizing the timestep logic.
  • The refactored track method is cleaner and easier to follow.
  • The new spawn_new_track and assign_to_track methods provide clearer logic for managing tracks.
  • The updated spawn_for_untracked_instances method incorporates the new track assignment strategies based on user preferences.

1435-1440: LGTM!

The addition of the verbosity and report_rate attributes to KalmanTracker is consistent with the changes made to BaseTracker and Tracker. It allows configuring the progress reporting for the Kalman filter-based tracking.


486-486: Ignore the Ruff suggestion.

The variable n_processed is not used as an index within the loop. It is a counter that is incremented on each iteration. Using enumerate() would not provide any benefit here and would make the code less readable.

Tools
Ruff

486-486: Use enumerate() for index variable n_processed in for loop

(SIM113)

sleap/instance.py (2)

1182-1183: LGTM!

The code changes are approved.


1186-1248: LGTM!

The code changes are approved.

sleap/gui/widgets/video.py (2)

819-820: LGTM!

The code changes are approved. Initializing _down_pos to None and checking it before use improves the robustness of the mouse event handling.


1044-1044: LGTM!

The code changes are approved. Checking _down_pos is not None before comparing it with event.pos() prevents potential errors when _down_pos is uninitialized.

sleap/instance.py Outdated Show resolved Hide resolved
@getzze getzze force-pushed the simplify-max-tracks branch from bb67467 to 06ef136 Compare September 4, 2024 16:23
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Outside diff range, codebase verification and nitpick comments (2)
tests/nn/test_tracker_components.py (2)

18-33: Add a docstring to document the function.

The function can be improved by adding a docstring to document the parameters and return value.

def run_tracker_by_name(frames=None, img_scale: float = 0, **kwargs):
    """
    Run a tracker by name with the specified parameters.

    Args:
        frames: The frames to track.
        img_scale: The image scale to use for tracking.
        **kwargs: Additional keyword arguments to pass to the tracker.

    Returns:
        The tracked frames.
    """
    ...

Line range hint 61-81: Add the img_scale parameter to the test function.

For consistency with the other test functions, add the img_scale parameter to the run_tracker_by_name function call.

def test_oks_tracker_by_name(
    centered_pair_predictions_sorted,
    tracker,
    oks_score_weighting,
    oks_normalization,
):
    # This is slow, so limit to 5 time points
    frames = centered_pair_predictions_sorted[:5]

    run_tracker_by_name(
        frames=frames,
        tracker=tracker,
        similarity="object_keypoint",
        matching="greedy",
        oks_score_weighting=oks_score_weighting,
        oks_normalization=oks_normalization,
        max_tracks=2,
+       img_scale=0,
    )
Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between bb67467 and 06ef136.

Files selected for processing (10)
  • docs/guides/cli.md (4 hunks)
  • sleap/config/pipeline_form.yaml (3 hunks)
  • sleap/gui/learning/runners.py (6 hunks)
  • sleap/instance.py (2 hunks)
  • sleap/nn/inference.py (12 hunks)
  • sleap/nn/tracker/components.py (8 hunks)
  • sleap/nn/tracking.py (33 hunks)
  • tests/nn/test_inference.py (8 hunks)
  • tests/nn/test_tracker_components.py (11 hunks)
  • tests/nn/test_tracking_integration.py (6 hunks)
Files skipped from review as they are similar to previous changes (5)
  • sleap/config/pipeline_form.yaml
  • sleap/gui/learning/runners.py
  • sleap/nn/inference.py
  • sleap/nn/tracker/components.py
  • tests/nn/test_inference.py
Additional context used
Ruff
sleap/nn/tracking.py

486-486: Use enumerate() for index variable n_processed in for loop

(SIM113)

Additional comments not posted (22)
tests/nn/test_tracking_integration.py (4)

22-22: Please provide more context for the change in expected tracks.

The assertion has been updated to expect 8 tracks instead of 27. What is the reason behind this change? Have there been updates to the tracking algorithm or test data that would explain the reduced number of expected tracks?


25-28: LGTM!

The renaming of the function and the update to the CLI command improve clarity and consistency. The test still verifies the expected behavior of limiting the maximum number of tracks.


45-45: LGTM!

The integration of the run_tracker functionality directly into the tracker object simplifies the code structure and improves readability.


109-114: Please provide more context for the removal of max_tracking checks.

The conditional checks for max_tracking have been removed from the make_tracker function. Can you explain the reason behind this change? Are there any potential side effects or unintended consequences of always passing the max_tracks parameter to the tracker constructor?

tests/nn/test_tracker_components.py (2)

35-58: LGTM!

The code changes are approved.


Line range hint 244-306: LGTM!

The code changes are approved.

Also applies to: 308-366, 368-430

docs/guides/cli.md (4)

127-127: LGTM!

The removal of --max_tracking and the updated description of --max_tracks simplify the tracking options available to users.


191-191: LGTM!

The updated description of --tracking.max_tracks clarifies that there is no limit if set to None or -1.


265-265: LGTM!

The removal of --max_tracking from the example usage is consistent with the changes made to the sleap-track usage.


271-271: LGTM!

The removal of --max_tracking from the example usage is consistent with the changes made to the sleap-track usage.

sleap/nn/tracking.py (10)

118-130: LGTM!

The changes to the CandidateMaker class are approved. Making it an abstract base class with the get_candidates abstract method improves extensibility.


Line range hint 133-384: LGTM!

The changes to the FlowCandidateMaker class are approved. Inheriting from CandidateMaker, updating the get_candidates method, and streamlining the candidate generation logic improves the code.


Line range hint 390-413: LGTM!

The changes to the SimpleCandidateMaker class are approved. Inheriting from CandidateMaker and updating the get_candidates method is consistent with the new design.

Tools
Ruff

408-408: Local variable ref_t is assigned to but never used

Remove assignment to unused variable ref_t

(F841)


438-579: LGTM!

The changes to the BaseTracker class are approved. Adding the verbosity and report_rate attributes and updating the run_tracker method with verbosity handling logic improves configurability and progress reporting.

Tools
Ruff

486-486: Use enumerate() for index variable n_processed in for loop

(SIM113)


Line range hint 629-908: LGTM!

The changes to the Tracker class are approved. The new attributes improve configurability, the infer_next_timestep method centralizes timestep inference logic, and the new spawn_new_track, assign_to_track, and update_matched_instance_tracks methods provide clearer track management. The updated spawn_for_untracked_instances method incorporates the new track assignment strategies effectively.


Line range hint 962-1057: LGTM!

The changes to the Tracker.make_tracker_by_name method are approved. Adding the new parameters max_tracks, prefer_reassigning_track, allow_reassigning_track, pre_cull_merge_instances, and pre_cull_merging_penalty improves tracking options and pre-culling functionality. The parsing of max_tracks and the incorporation of the new parameters in pre_cull_function are implemented correctly.


Line range hint 1073-1224: LGTM!

The changes to the Tracker.get_by_name_factory_options method are approved. Adding the new options for max_tracks, allow_reassigning_track, prefer_reassigning_track, pre_cull_merge_instances, and pre_cull_merging_penalty improves tracking configurability. The updated help text for the save_shifted_instances option provides clearer information.


1279-1293: LGTM!

The change to the Tracker.add_cli_parser_args method for the tracker argument is approved. Avoiding the definition of a default value helps detect malformed command-line arguments, which is a good practice.


1435-1439: LGTM!

The changes to the KalmanTracker class are approved. Adding the verbosity and report_rate attributes is consistent with the changes made to the BaseTracker class and improves configurability and progress reporting.


486-486: Skipping Ruff suggestion.

The suggestion to use enumerate() for the n_processed variable is a false positive. n_processed is used as a counter and not as an index, so using enumerate() would not provide any benefit and would reduce code readability.

Tools
Ruff

486-486: Use enumerate() for index variable n_processed in for loop

(SIM113)

sleap/instance.py (2)

1182-1183: LGTM!

The code changes are approved.


1186-1248: LGTM!

The code changes are approved. The function follows the specified merging criteria and correctly applies the penalty factor to the scores of the merged instances.

Copy link

codecov bot commented Sep 4, 2024

Codecov Report

Attention: Patch coverage is 65.46053% with 105 lines in your changes missing coverage. Please review.

Project coverage is 75.21%. Comparing base (7ed1229) to head (06ef136).
Report is 56 commits behind head on develop.

Files with missing lines Patch % Lines
sleap/nn/tracking.py 74.28% 45 Missing ⚠️
sleap/instance.py 15.62% 27 Missing ⚠️
sleap/nn/inference.py 66.17% 23 Missing ⚠️
sleap/nn/tracker/components.py 16.66% 5 Missing ⚠️
sleap/gui/learning/runners.py 60.00% 2 Missing ⚠️
sleap/gui/learning/dialog.py 0.00% 1 Missing ⚠️
sleap/gui/widgets/video.py 50.00% 1 Missing ⚠️
sleap/util.py 93.33% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #1896      +/-   ##
===========================================
+ Coverage    73.30%   75.21%   +1.90%     
===========================================
  Files          134      133       -1     
  Lines        24087    24558     +471     
===========================================
+ Hits         17658    18471     +813     
+ Misses        6429     6087     -342     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@gitttt-1234 gitttt-1234 mentioned this pull request Sep 4, 2024
11 tasks
@gitttt-1234
Copy link
Contributor

gitttt-1234 commented Sep 5, 2024

Hi @getzze,

Thanks for this!

We notice that you tried to simplify our tracking method by removing max_tracking. In SLEAP, we currently support 2 candidate generation methods: Fixed window and Local queues. In Fixed window method, instances from the last window_size frames are stored in the tracker queue. This method focuses on recent predictions but can lose a track if no detections are assigned for window_size frames. In Local queues method, the last window_size instances for each track are stored in the tracker queue. This method is better at connecting detections over gaps but doesn't rely on recent predictions. To support both the methods, we use two different data structures (a deque for fixed window method and a dict[track_id, deque] for local queues) and also max_tracking parameter to determine the type of candidate generation method to be used.

The PR refactors the code to implement a fixed window method with the option of limiting the number of new tracks being created. However, this does not support the local queues method, which is preferred in some cases (videos where tracks are to be connected after short gaps).

Let us know if you have any questions!

Thanks!

Divya

@getzze
Copy link
Contributor Author

getzze commented Sep 5, 2024

Hi @gitttt-1234,

Thanks for reviewing the PR!

I chose to remove the Local queues alternative because I thought its only point was to limit the number of tracks/identities. I didn't get that it could be used if only one instance is hidden for some time. Anyway, I think it's very unlikely that the instance would reappear exactly at the exact same place it disappeared, so it is matched to the track that disappeared. In my tests, I always got tracks that become empty after the instance is lost for window_size time steps, they are never "caught" again by the track.

Instead I added an (optional) step to assign an instance to a track that has disappeared for more than window_size time steps, which works better in my tests (two mice).

Something I didn't do in this PR and that should probably be done, is to put back the pre_cull_to_target option in the GUI, to cap the number of instances before assigning them to tracks (whereas max_tracks cap them after assigning them to tracks), to be able to get back the pre-#1447 behavior.

@talmo
Copy link
Collaborator

talmo commented Sep 6, 2024

Hi @getzze,

Thanks for the contributions! Apologies it's taken us so long to work through it all, but we did have some questions and concerns about the changes. @gitttt-1234 described those inline, but I think maybe there's also some misunderstandings that I want to clarify to make sure we're on the same page -- see below.

Hi @gitttt-1234,

Thanks for reviewing the PR!

I chose to remove the Local queues alternative because I thought its only point was to limit the number of tracks/identities.

As @gitttt-1234 mentioned, the point of the local queues is indeed to limit the maximum number of tracks that can be possibly spawned in an entire project, but it does so using a specific algorithmic approach.

By specifying the maximum number of tracks, what we can do is create buffers (up to that limit) that hold the past window_size detected instances that have been assigned to that track. This means that if we don't see an instance from track_0 for 1000+ frames, we're still able to assign a new instance to track_0 back to that track.

The main thing this solves is that it prevents the explosion of new tracklets being created in scenarios where there is a lot of track fragmentation due to frequent occlusions (or lots of animals in the FOV).

I didn't get that it could be used if only one instance is hidden for some time.

Again, to be clear, with local track queues, the instances can be used no matter how many instances are hidden and no matter the amount of time.

Anyway, I think it's very unlikely that the instance would reappear exactly at the exact same place it disappeared, so it is matched to the track that disappeared.

Using instances from a long time in the past is not great since the detections will be pretty stale, but it can work well when animals are sleeping or hiding under a hut or something else out of view. It does not, however, mean that the instance needs to reappear in the exact same place. It just means that the last detection(s) from that track will still be in consideration during the candidate generation and matching steps.

If we lose a track for a long amount of time, the truly appropriate thing to do would be to explicitly use ReID features, but we don't have that implemented in here yet, so in the meantime, this is a "best effort" approach that doesn't require any new information to be integrated in the algorithm.

In my tests, I always got tracks that become empty after the instance is lost for window_size time steps, they are never "caught" again by the track.

Could you elaborate on what you mean by that? How does a track "become empty"?

Again, by design, if an instance is lost for more than window_size time steps, when using the local queues in simplemaxtracks or flowmaxtracks, last few instances that have been assigned to a track can never be lost no matter the gap length since they're stored in queues local to the track (ergo, "track local queues").

Instead I added an (optional) step to assign an instance to a track that has disappeared for more than window_size time steps, which works better in my tests (two mice).

The track local queues should handle this explicitly. We don't make a distinction between tracks that are "stale" vs "active", but treat all instances from all queues as valid candidates. I could see an advantage to separating these two (e.g., for preferring active tracks that have recent detections before considering the stale ones), but this is fundamentally a different tracking algorithm and not a replacement for what we have now.

Something I didn't do in this PR and that should probably be done, is to put back the pre_cull_to_target option in the GUI, to cap the number of instances before assigning them to tracks (whereas max_tracks cap them after assigning them to tracks), to be able to get back the pre-#1447 behavior.

We removed pre-culling as this was superseded by a feature set for limiting the number of instances right at the model inference level. This is exposed via the --max_instances flag.

We implemented these in:

The advantage is that it allows the users to cull to a specific number of instances regardless of whether tracking will be applied next -- so it's usable during the human-in-the-loop annotation stage as well to filter out some false positives when models aren't trained well enough yet.


We also had a question about the instance merging functionality you added. It makes total sense, but you might imagine how this will have lots of edge cases.

For example: if the two node sets are exactly disjoint, but correctly belong to two different animals, we'd be incorrectly merging them into what we call a "chimera" (top of one mouse connected to the bottom of another mouse).

Another scenario we see often that this wouldn't quite resolve is when we have two almost disjoint sets, except for one node that is maybe off by a couple of pixels. Adding a bit of fuzziness to the keypoint matching to test for "disjoint-ness" before triggering this function could help cover those cases as well.

The thing I'm a bit more confused by is that you're mentioning this as useful when there are two instances on the same frame that also belong to the same track. How is this even possible given any of the current tracking approaches? We always do matching that ensures that no more than one instance is assigned to a given track within a frame.

Establishing that two (partial) instances indeed belong to the same track is the hard problem that's at the crux of this instance merging problem -- and why we haven't implemented this before.

Some clarity on what you had in mind for dealing with this would help us understand the solution.


In general, we're super appreciative of the contributions, but we want to make sure that:

  1. New algorithms are applicable to the broader use cases that we see across the user base;
  2. We are very careful not to break backwards compatibility, especially when removing core algorithms; and
  3. We make best efforts to separate different changes into different PRs for future maintainability and being able to track down when bugs or regressions are introduced (for example, the instance merging versus max tracking changes here are touching on very different things that should probably be implemented and tested separately).

If you prefer to have a chat in realtime, just drop me an email --> [email protected].

Thanks again for all the work! Looking forward to getting this merged in :)

Talmo

@getzze
Copy link
Contributor Author

getzze commented Sep 12, 2024

Now I understand the use of Local queues.

Maybe the best approach is to make 2 PR, one for bringing the max_tracks feature to the Window_size candidate maker and another one to add the merging option. But the name flowmaxtracks would need to be changed in my opinion, to better reflect that it's using a local queue.

I will try to send a screenshot to explain what I mean by a track becoming empty. But basically, when a track is not detected for window_size timesteps, no instance is never assigned to it anymore, ever. I think it has to do with the loop in FlowMaxTracksCandidateMaker.get_candidates, line 408. But I didn't investigate much.
When using pre-culling to 2 instances, it would not happen.

I didn't realize that you could limit the number of instances from inference, thanks. Although I like to try different tracking method with new videos, so I run inference and tracking separately and I would still keep all the instances in the inference stage and pre-cull when tracking. Maybe other people would do that also, I don't know.

@getzze
Copy link
Contributor Author

getzze commented Sep 12, 2024

Regarding the merging function:

For example: if the two node sets are exactly disjoint, but correctly belong to two different animals, we'd be incorrectly merging them into what we call a "chimera" (top of one mouse connected to the bottom of another mouse).

Yes but this chimera will be less likely to be the best match for the track (but it's not sure).

Another scenario we see often that this wouldn't quite resolve is when we have two almost disjoint sets, except for one node that is maybe off by a couple of pixels. Adding a bit of fuzziness to the keypoint matching to test for "disjoint-ness" before triggering this function could help cover those cases as well.

I also see that. I only implemented merging totally disjoint nodes, but we could think of merging group of nodes with one overlapping node (taking the average position of the duplicate node). It would not be too hard to add.

The thing I'm a bit more confused by is that you're mentioning this as useful when there are two instances on the same frame that also belong to the same track. How is this even possible given any of the current tracking approaches? We always do matching that ensures that no more than one instance is assigned to a given track within a frame.

The problem I tried to solve with this merging function is the following:

  1. the 2 mice are close-by, 2 full instances correctly found, say male and female.
  2. next timestep, one instance (male) partially or completely disappears
  3. next timestep, the female instance is split in two: front nodes and back nodes. Because only one partial instance can be assigned to the female track, say the front nodes are assigned to female. the back nodes are not assigned to any track. The male instance is still partially or completely absent.
  4. after some timesteps like this, the back nodes of the "female" is assigned to the male track. The front is still female.
  5. then the front disappears, no instance is assigned to female. The back of the "female" is still male.
  6. the full "female" is assigned to male.

So the idea of merging instances was to avoid the step 3, stopping the identity swap.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants