[detector][adaptive] Make adaptive_ratio calculation online
Removes the requirement for a StatsManager, so AdaptiveDetector
can now be used with frame skip. #283

Fix callbacks by adding a buffer for the required frames.
Breakthrough committed Aug 11, 2022
1 parent 2ae70f3 commit b9e129e
Showing 10 changed files with 197 additions and 171 deletions.
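
With the StatsManager requirement gone, `AdaptiveDetector` can be combined with frame skipping. As a rough illustration of what this enables (a sketch only, not part of the diff below; API names assume PySceneDetect 0.6.1 and the filename is a placeholder):

```python
# Sketch: AdaptiveDetector together with frame_skip, which this commit / #283 enables.
# "my_video.mp4" is a placeholder; API names assume PySceneDetect 0.6.1.
from scenedetect import open_video, SceneManager
from scenedetect.detectors import AdaptiveDetector

video = open_video("my_video.mp4")
scene_manager = SceneManager()
scene_manager.add_detector(AdaptiveDetector())
# frame_skip=1 decodes every other frame; before this change, AdaptiveDetector
# required a StatsManager, which is incompatible with frame_skip.
scene_manager.detect_scenes(video=video, frame_skip=1)
for start, end in scene_manager.get_scene_list():
    print("%s - %s" % (start.get_timecode(), end.get_timecode()))
```
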
4 changes: 3 additions & 1 deletion README.md
@@ -53,7 +53,9 @@ from scenedetect import detect, ContentDetector
scene_list = detect('my_video.mp4', ContentDetector())
```

`scene_list` will now be a list containing the start/end times of all scenes found in the video. Try calling `print(scene_list)`, or iterating over each scene:
`scene_list` will now be a list containing the start/end times of all scenes found in the video. There also exists a two-pass version `AdaptiveDetector` which handles fast camera movement better, and `ThresholdDetector` for handling fade out/fade in events.

Try calling `print(scene_list)`, or iterating over each scene:

```python
from scenedetect import detect, ContentDetector
```
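
As a further example (a sketch only, not part of the diff; it reuses the `detect()` call shown above), `AdaptiveDetector` can be dropped in the same way, and each scene in the list is a pair of start/end timecodes:

```python
from scenedetect import detect, AdaptiveDetector

scene_list = detect('my_video.mp4', AdaptiveDetector())
for i, (start, end) in enumerate(scene_list):
    print('Scene %d: %s - %s' % (i + 1, start.get_timecode(), end.get_timecode()))
```
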
29 changes: 13 additions & 16 deletions docs/changelog.md
@@ -13,13 +13,14 @@ PySceneDetect Releases
#### Changelog

**Command-Line Changes:**

- [feature] Add `moviepy` backend wrapping the MoviePy package, which uses the `ffmpeg` binary on the system for video decoding
- [feature] Edge detection can now be enabled with `detect-content` to improve accuracy in some cases, especially under lighting changes, see [new `-w`/`--weights` option](http://scenedetect.com/projects/Manual/en/latest/cli/detectors.html#detect-content) for more information
- Edge differences are typically larger than other components, so you may need to increase `-t`/`--threshold` higher when increasing the edge weight (the last component)
- For example, a good starting point is to place 100% weight on the change in a frame's hue, 50% on saturation change, 100% on luma (brightness) change, and 25% on the change in edges, with a threshold of 32 (final score is normalized, sum of weights does not need to equal 100%):
- [feature] Edge detection can now be enabled with `detect-content` and `detect-adaptive` to improve accuracy in some cases, especially under lighting changes, see [new `-w`/`--weights` option](http://scenedetect.com/projects/Manual/en/latest/cli/detectors.html#detect-content) for more information
- A good starting point is to place 100% weight on the change in a frame's hue, 50% on saturation change, 100% on luma (brightness) change, and 25% on change in edges, with a threshold of 32:
`detect-adaptive -w 1.0 0.5 1.0 0.25`
- Edge differences are typically larger than other components, so you may need to set `-t`/`--threshold` higher when increasing the edge weight (the last component) with `detect-content`, for example:
`detect-content -w 1.0 0.5 1.0 0.25 -t 32`
- May be enabled by default in the future once it has been more thoroughly tested; further improvements for `detect-content` are also being investigated (e.g. motion compensation, flash suppression)
- Short-form of `detect-content` option `--frame-window` has been changed from `-w` to `-f` to accommodate this change
- [enhancement] Progress bar now displays number of detections while processing, no longer conflicts with log message output
- [enhancement] When using ffmpeg to split videos, `-map 0` has been added to the default arguments so other audio tracks are also included when present ([#271](https://github.com/Breakthrough/PySceneDetect/issues/271))
- [enhancement] Add `-a` flag to `version` command to print more information about versions of dependencies/tools being used
@@ -30,29 +31,25 @@ PySceneDetect Releases
**General:**

- [feature] Add new backend `VideoStreamMoviePy` using the MoviePy package
- [feature] Add edge detection to `ContentDetector` ([#35](https://github.com/Breakthrough/PySceneDetect/issues/35))
- [feature] Add edge detection to `ContentDetector` and `AdaptiveDetector` ([#35](https://github.com/Breakthrough/PySceneDetect/issues/35))
- Add ability to specify content score weights of hue, saturation, luma, and edge differences between frames (see the sketch after this list)
- Default remains as `1.0, 1.0, 1.0, 0.0` so there is no change in behavior
- Kernel size used for improving edge overlap can also be customized
- [feature] `AdaptiveDetector` no longer requires a `StatsManager` and can now be used with `frame_skip` ([#283](https://github.com/Breakthrough/PySceneDetect/issues/283))
- [bugfix] Fix `scenedetect.detect()` throwing `TypeError` when specifying `stats_file_path`
- [bugfix] Fix off-by-one error in end event timecode when `end_time` was set (reported end time was always one extra frame)
- [enhancement] Add optional `start_time` and `end_time` arguments to `scenedetect.detect()`
- [enhancement] If available, the `ffmpeg` binary from the `imageio_ffmpeg` package will be used if one could not be found in PATH
- [enhancement] Add optional `start_time`, `end_time`, and `start_in_scene` arguments to `scenedetect.detect()` ([#282](https://github.com/Breakthrough/PySceneDetect/issues/282))
- [enhancement] Add `-map 0` option to default arguments of `split_video_ffmpeg` to include all audio tracks by default ([#271](https://github.com/Breakthrough/PySceneDetect/issues/271))
- [docs] Add example for [using a callback](http://scenedetect.com/projects/Manual/en/v0.6.1/api/scene_manager.html#usage) ([#273](https://github.com/Breakthrough/PySceneDetect/issues/273))
- [enhancement] Add thread-safe `stop()` method to `SceneManager` ([#274](https://github.com/Breakthrough/PySceneDetect/issues/274))
- [enhancement] Add new `VideoCaptureAdapter` to make existing `cv2.VideoCapture` objects compatible with a `SceneManager` ([#276](https://github.com/Breakthrough/PySceneDetect/issues/276))
- Primary use case is for handling input devices/webcams and gstreamer pipes, [see updated examples](http://scenedetect.com/projects/Manual/en/latest/api/backends.html#devices-cameras-pipes)
- Files, image sequences, and network streams/URLs should continue to use `VideoStreamCv2`
- [enhancement] No-op progress bar and log capture objects are now provided in `scenedetect.platform` for systems without `tqdm`
- [enhancement] Add `start_in_scene` argument to `detect()` function ([#282](https://github.com/Breakthrough/PySceneDetect/issues/282))
- [api] The `SceneManager` methods `get_cut_list()` and `get_event_list()` are now deprecated, along with the `base_timecode` argument, and will be removed in a future version
- [api] The `base_timecode` argument of `get_scenes_from_cuts()` in `scenedetect.stats_manager` is now deprecated and will be removed in a future version (the signature of this function has been changed accordingly)
- [general] The default `crf` used for `split_video_ffmpeg` has been changed from 21 to 22 to match the CLI default
- [enhancement] Add `interpolation` property to `SceneManager` to allow setting interpolation method for frame downscaling
- [enhancement] `SceneManager` now downscales using linear interpolation by default, previously used nearest neighbor
- [enhancement] Add `interpolation` argument to `save_images` to allow setting interpolation method when resizing images
- [api] The `SceneManager` methods `get_cut_list()` and `get_event_list()` are deprecated, along with the `base_timecode` argument
- [api] The `base_timecode` argument of `get_scenes_from_cuts()` in `scenedetect.stats_manager` is deprecated (the signature of this function has been changed accordingly)
- [api] Rename `AdaptiveDetector` constructor parameter `min_delta_hsv` to `min_content_val`
- [general] The default `crf` for `split_video_ffmpeg` has been changed from 21 to 22 to match command line default
- [enhancement] Add `interpolation` property to `SceneManager` to allow setting method of frame downscaling, use linear interpolation by default (previously nearest neighbor)
- [enhancement] Add `interpolation` argument to `save_images` to allow setting image resize method (default remains bicubic)
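
To illustrate the weights API mentioned in the notes above (a sketch, not part of the release notes; weight order is hue, saturation, luma, edges, and the parameter names are assumed to follow the 0.6.1 API), the CLI example `detect-content -w 1.0 0.5 1.0 0.25 -t 32` corresponds roughly to:

```python
from scenedetect import detect, ContentDetector

# Rough Python equivalent of `detect-content -w 1.0 0.5 1.0 0.25 -t 32`.
# Weight order: hue, saturation, luma, edges ("my_video.mp4" is a placeholder).
weights = ContentDetector.Components(1.0, 0.5, 1.0, 0.25)
scene_list = detect('my_video.mp4', ContentDetector(threshold=32, weights=weights))
```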

### 0.6 (May 29, 2022)

Expand Down
2 changes: 1 addition & 1 deletion scenedetect/cli/__init__.py
@@ -554,7 +554,7 @@ def detect_content_command(
)
@click.option(
'--frame-window',
'-w',
'-f',
metavar='VAL',
type=click.INT,
default=None,
7 changes: 7 additions & 0 deletions scenedetect/cli/context.py
@@ -372,6 +372,13 @@ def handle_detect_adaptive(
else:
min_scene_len = self.config.get_value("detect-adaptive", "min-scene-len")
min_scene_len = parse_timecode(min_scene_len, self.video_stream.frame_rate).frame_num

if weights is not None:
try:
weights = scenedetect.detectors.ContentDetector.Components(*weights)
except ValueError as ex:
logger.debug(str(ex))
raise click.BadParameter(str(ex), param_hint='weights')
# Log detector args for debugging before we construct it.
detector_args = {
'adaptive_threshold':
163 changes: 67 additions & 96 deletions scenedetect/detectors/adaptive_detector.py
@@ -49,11 +49,11 @@ def __init__(
):
"""
Arguments:
adaptive_threshold: Threshold value (float) that the calculated frame score must exceed to
trigger a new scene (see frame metric adaptive_ratio in stats file).
adaptive_threshold: Threshold (float) that score ratio must exceed to trigger a
new scene (see frame metric adaptive_ratio in stats file).
min_scene_len: Minimum length of any scene.
window_width: Size of window (number of frames) before and after each frame to average together in'
order to detect deviations from the mean.
window_width: Size of window (number of frames) before and after each frame to
average together in order to detect deviations from the mean. Must be at least 1.
min_content_val: Minimum threshold (float) that the content_val must exceed in order to
register as a new scene. This is calculated the same way that `detect-content`
calculates frame score based on `weights`/`luma_only`/`kernel_size`.
@@ -75,6 +75,8 @@ def __init__(
if min_delta_hsv is not None:
logger.error('min_delta_hsv is deprecated, use min_content_val instead.')
min_content_val = min_delta_hsv
if window_width < 1:
raise ValueError('window_width must be at least 1.')

super().__init__(
threshold=255.0,
@@ -84,27 +86,33 @@
kernel_size=kernel_size,
)

# TODO: Make all properties private.
# TODO: Turn these options into properties.
self.min_scene_len = min_scene_len
self.adaptive_threshold = adaptive_threshold
self.min_content_val = min_content_val
self.window_width = window_width

self._adaptive_ratio_key = AdaptiveDetector.ADAPTIVE_RATIO_KEY_TEMPLATE.format(
window_width=window_width, luma_only='' if not luma_only else '_lum')
self._first_frame_num = None
self._last_frame_num = None

self._last_cut: Optional[int] = None

self._buffer = []

@property
def event_buffer_length(self) -> int:
"""Number of frames any detected cuts will be behind the current frame due to buffering."""
return self.window_width

def get_metrics(self) -> List[str]:
""" Combines base ContentDetector metric keys with the AdaptiveDetector one. """
"""Combines base ContentDetector metric keys with the AdaptiveDetector one."""
return super().get_metrics() + [self._adaptive_ratio_key]

def stats_manager_required(self) -> bool:
""" Overload to indicate that this detector requires a StatsManager.
Returns:
True as AdaptiveDetector requires stats.
"""
return True
"""Not required for AdaptiveDetector."""
return False

def process_frame(self, frame_num: int, frame_img: Optional[ndarray]) -> List[int]:
""" Similar to ThresholdDetector, but using the HSV colour space DIFFERENCE instead
@@ -121,93 +129,56 @@ def process_frame(self, frame_num: int, frame_img: Optional[ndarray]) -> List[int]
Empty list
"""

# Call the process_frame function of ContentDetector but ignore any
# returned cuts
if self.is_processing_required(frame_num):
super().process_frame(frame_num=frame_num, frame_img=frame_img)

if self._first_frame_num is None:
self._first_frame_num = frame_num
self._last_frame_num = frame_num
# TODO(#283): Merge this with ContentDetector and turn it on by default.

return []
super().process_frame(frame_num=frame_num, frame_img=frame_img)

def get_content_val(self, frame_num: int) -> float:
"""
Returns the average content change for a frame.
"""
return self.stats_manager.get_metrics(frame_num, [ContentDetector.FRAME_SCORE_KEY])[0]

def post_process(self, _unused_frame_num: int):
"""
After an initial run through the video to detect content change
between each frame, we try to identify fast cuts as short peaks in the
`content_val` value. If a single frame has a high `content-val` while
the frames around it are low, we can be sure it's fast cut. If several
frames in a row have high `content-val`, it probably isn't a cut -- it
could be fast camera movement or a change in lighting that lasts for
more than a single frame.
"""
cut_list = []
if self._first_frame_num is None:
required_frames = 1 + (2 * self.window_width)
self._buffer.append((frame_num, self._frame_score))
if not len(self._buffer) >= required_frames:
return []
adaptive_threshold = self.adaptive_threshold
window_width = self.window_width
last_cut = None

assert self.stats_manager is not None
self._buffer = self._buffer[-required_frames:]
target = self._buffer[self.window_width]
average_window_score = (
sum(frame[1] for i, frame in enumerate(self._buffer) if i != self.window_width) /
(2.0 * self.window_width))

average_is_zero = abs(average_window_score) < 0.00001

adaptive_ratio = 0.0
if not average_is_zero:
adaptive_ratio = min(target[1] / average_window_score, 255.0)
elif average_is_zero and target[1] >= self.min_content_val:
# if we would have divided by zero, set adaptive_ratio to the max (255.0)
adaptive_ratio = 255.0
if self.stats_manager is not None:
self.stats_manager.set_metrics(target[0], {self._adaptive_ratio_key: adaptive_ratio})

cut_list = []
# Check to see if adaptive_ratio exceeds the adaptive_threshold as well as there
# being a large enough content_val to trigger a cut
if (adaptive_ratio >= self.adaptive_threshold and target[1] >= self.min_content_val):

if self._last_cut is None:
# No previously detected cuts
cut_list.append(target[0])
self._last_cut = target[0]
elif (target[0] - self._last_cut) >= self.min_scene_len:
# Respect the min_scene_len parameter
cut_list.append(target[0])
# TODO: Should this be updated every time the threshold is exceeded?
# It might help with flash suppression for example.
self._last_cut = target[0]

return cut_list

# TODO(v0.6.1): Deprecate & remove this method.
def get_content_val(self, frame_num: int) -> Optional[float]:
"""Returns the average content change for a frame."""
if self.stats_manager is not None:
# Loop through the stats, building the adaptive_ratio metric
for frame_num in range(self._first_frame_num + window_width + 1,
self._last_frame_num - window_width):
# If the content-val of the frame is more than
# adaptive_threshold times the mean content_val of the
# frames around it, then we mark it as a cut.
denominator = 0
for offset in range(-window_width, window_width + 1):
if offset == 0:
continue
else:
denominator += self.get_content_val(frame_num + offset)

denominator = denominator / (2.0 * window_width)
denominator_is_zero = abs(denominator) < 0.00001

if not denominator_is_zero:
adaptive_ratio = self.get_content_val(frame_num) / denominator
elif denominator_is_zero and self.get_content_val(
frame_num) >= self.min_content_val:
# if we would have divided by zero, set adaptive_ratio to the max (255.0)
adaptive_ratio = 255.0
else:
# avoid dividing by zero by setting adaptive_ratio to zero if content_val
# is still very low
adaptive_ratio = 0.0

self.stats_manager.set_metrics(frame_num,
{self._adaptive_ratio_key: adaptive_ratio})

# Loop through the frames again now that adaptive_ratio has been calculated to detect
# cuts using adaptive_ratio
for frame_num in range(self._first_frame_num + window_width + 1,
self._last_frame_num - window_width):
# Check to see if adaptive_ratio exceeds the adaptive_threshold as well as there
# being a large enough content_val to trigger a cut
if (self.stats_manager.get_metrics(
frame_num, [self._adaptive_ratio_key])[0] >= adaptive_threshold
and self.get_content_val(frame_num) >= self.min_content_val):

if last_cut is None:
# No previously detected cuts
cut_list.append(frame_num)
last_cut = frame_num
elif (frame_num - last_cut) >= self.min_scene_len:
# Respect the min_scene_len parameter
cut_list.append(frame_num)
last_cut = frame_num

return cut_list

# Stats manager must be used for this detector
return self.stats_manager.get_metrics(frame_num, [ContentDetector.FRAME_SCORE_KEY])[0]
return 0.0

def post_process(self, _unused_frame_num: int):
"""Not required for AdaptiveDetector."""
return []
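
To make the new control flow above easier to follow, here is a self-contained sketch of the same calculation in plain Python (illustrative defaults; `min_scene_len` handling and the optional StatsManager metric write-back are omitted):

```python
from typing import List, Tuple

def adaptive_cuts(scores: List[Tuple[int, float]],
                  window_width: int = 2,
                  adaptive_threshold: float = 3.0,
                  min_content_val: float = 15.0) -> List[int]:
    """Report a cut when the middle frame's content score is at least
    `adaptive_threshold` times the average score of its neighbours."""
    cuts: List[int] = []
    buffer: List[Tuple[int, float]] = []
    required_frames = 1 + (2 * window_width)
    for frame_num, score in scores:
        buffer.append((frame_num, score))
        if len(buffer) < required_frames:
            continue
        buffer = buffer[-required_frames:]
        target_frame, target_score = buffer[window_width]
        average = (sum(s for i, (_, s) in enumerate(buffer) if i != window_width)
                   / (2.0 * window_width))
        if abs(average) < 0.00001:
            # Avoid division by zero: saturate the ratio if the score is large enough.
            adaptive_ratio = 255.0 if target_score >= min_content_val else 0.0
        else:
            adaptive_ratio = min(target_score / average, 255.0)
        if adaptive_ratio >= adaptive_threshold and target_score >= min_content_val:
            cuts.append(target_frame)
    return cuts

# A single high-scoring frame surrounded by low scores registers as a cut:
print(adaptive_cuts([(n, 40.0 if n == 12 else 2.0) for n in range(10, 20)]))  # [12]
```
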
8 changes: 4 additions & 4 deletions scenedetect/detectors/content_detector.py
@@ -139,12 +139,12 @@ def __init__(
if kernel_size < 3 or kernel_size % 2 == 0:
raise ValueError('kernel_size must be odd integer >= 3')
self._kernel = numpy.ones((kernel_size, kernel_size), numpy.uint8)
self._frame_score: Optional[float] = None

def get_metrics(self):
return ContentDetector.METRIC_KEYS

def is_processing_required(self, frame_num):
# TODO(v0.6.1): Deprecate this method and prepare for transition in v0.7.
return True

def _calculate_frame_score(self, frame_num: int, frame_img: numpy.ndarray) -> float:
@@ -222,13 +222,13 @@ def process_frame(self, frame_num: int, frame_img: numpy.ndarray) -> List[int]:
if self._last_scene_cut is None:
self._last_scene_cut = frame_num

frame_score = self._calculate_frame_score(frame_num, frame_img)
if frame_score is None:
self._frame_score = self._calculate_frame_score(frame_num, frame_img)
if self._frame_score is None:
return []

# We consider any frame over the threshold a new scene, but only if
# the minimum scene length has been reached (otherwise it is ignored).
if frame_score >= self._threshold and (
if self._frame_score >= self._threshold and (
(frame_num - self._last_scene_cut) >= self._min_scene_len):
self._last_scene_cut = frame_num
return [frame_num]
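
The cached `_frame_score` above is what lets a subclass reuse the per-frame score without a StatsManager, as `AdaptiveDetector` now does. A minimal sketch of that pattern (the subclass name is hypothetical, signatures abbreviated):

```python
from scenedetect.detectors import ContentDetector

class BufferedDetector(ContentDetector):
    """Hypothetical subclass illustrating the pattern used by AdaptiveDetector."""

    def __init__(self):
        super().__init__()
        self._buffer = []  # (frame_num, frame_score) pairs

    def process_frame(self, frame_num, frame_img):
        # The base class computes the score and caches it in self._frame_score.
        super().process_frame(frame_num=frame_num, frame_img=frame_img)
        self._buffer.append((frame_num, self._frame_score))
        return []  # cuts are emitted later, once enough frames are buffered
```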
