Reduce false positives in ContentDetector due to camera movement #153

Closed · scarwire opened this issue Feb 10, 2020 · 13 comments · Fixed by #198

scarwire (Contributor) commented Feb 10, 2020

Hi there! First of all, thanks for all your work on this clean, fast, and extremely promising module.

The current algorithm for detecting fast cuts with ContentDetector is already quite reliable, but looking through the raw data to find the correct threshold value can be tiresome, and lowering the threshold can sometimes produce false positives, especially in clips where there's no cut but there is fast camera movement. Camera movement can produce a series of frames with high content_val, and if any of the frames go over the threshold, it's interpreted as a cut. You can try with these examples (1, 2) that I've taken from an old film I'm studying. There are real cuts that have a content_val of only 9 or 10 (file 2, frame 1172, 00:46), but also false positives with content_val greater than 10 (file 1, frame 403, 00:16).

One possible way to deal with this would be to look for fast peaks in the amount of change -- peaks that can't last for more than a frame or two. Once we've calculated the content change for all the frames in the clip, we can then calculate the ratio of each frame's content_val to the average content_val of the surrounding frames. In pseudocode:

metathreshold = 3    # frame must change at least 3x as much as its neighbors
min_content_val = 5  # and still show at least this much absolute change

cut_list = []
# content_vals[i] is the content change between frame i and frame i - 1.
# Start at 2 and stop 2 early so the +/-2 window stays in bounds.
for i in range(2, len(content_vals) - 2):
    neighbor_avg = (content_vals[i - 2] + content_vals[i - 1]
                    + content_vals[i + 1] + content_vals[i + 2]) / 4
    if neighbor_avg > 0:
        content_val_ratio = content_vals[i] / neighbor_avg
        if content_val_ratio >= metathreshold and content_vals[i] >= min_content_val:
            cut_list.append(i)
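
Here content_vals stands for the per-frame change metric that ContentDetector already computes (frame-to-frame change in HSV space). Purely as an illustration -- this is not the module's actual implementation -- such a list could be built with OpenCV, assuming frames is a list of BGR images:

import cv2
import numpy as np

def hsv_delta(prev_bgr, curr_bgr):
    # Mean absolute HSV difference between two consecutive frames.
    prev = cv2.cvtColor(prev_bgr, cv2.COLOR_BGR2HSV).astype(np.int32)
    curr = cv2.cvtColor(curr_bgr, cv2.COLOR_BGR2HSV).astype(np.int32)
    return float(np.mean(np.abs(curr - prev)))

# Frame 0 has no predecessor, so give it a change score of zero.
content_vals = [0.0] + [hsv_delta(a, b) for a, b in zip(frames, frames[1:])]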

From the tests I've done, the metathreshold value of 3 (mark a frame as a cut if its content_val is more than three times the average content_val of its neighbors) and the baseline threshold of 5 (still, don't mark anything as a cut if its content_val is less than 5) seem like sane defaults. I think they're easier to pick than the current threshold value, but this is open to discussion and more testing.

I've been messing around with the code and have come up with a way of implementing this as a SceneDetector object using the model you've already created. However, the current API calculates HSV values one by one as each frame is passed to the detector, so I also added a second post-processing step in scene_manager.py that runs after all those values have been stored. This step calculates the running average of change in the surrounding frames and produces a list of cuts based on it. There are probably other ways of doing this, though.

Let me know if you think this is useful. If you're interested, I'll make a pull request.

wjs018 (Collaborator) commented Feb 24, 2020

I think that this kind of approach could be valuable as fast camera movement leading to anomalous scene detection is something I have come across many times before. The way the code is written now, there is a post_process function in the generic scene_detector class (link). Some detector types use this (threshold and motion) while others don't (content).

So, if I am understanding the process correctly, the detector method you are discussing would operate like this (a minimal skeleton follows the list):

  1. The process_frame function does not actually add anything to the cut list as it does its analysis of frames, but instead is just used to build the stats for the video. If a stats file was provided, this could basically be skipped over altogether.
  2. When the last frame has been input, the post_process function runs over the compiled stats for the video and, using the method you describe, chooses the frames to mark as scene boundaries, adding them to the cut list.
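
To sketch that two-phase flow (all names here are hypothetical, and the per-frame stats are simply kept in memory on the detector rather than going through a stats manager):

import numpy as np

class RollingAverageDetector:
    # Sketch only: phase 1 records a per-frame metric, phase 2 picks cuts.

    def __init__(self, metathreshold=3.0, min_delta_hsv=5.0, window_width=2):
        self.metathreshold = metathreshold
        self.min_delta_hsv = min_delta_hsv
        self.window_width = window_width
        self._last_hsv = None
        self._delta_hsv = []  # compiled stats for the whole video

    def process_frame(self, frame_num, frame_hsv):
        # Phase 1: only record the metric; never emit cuts here.
        if self._last_hsv is None:
            self._delta_hsv.append(0.0)
        else:
            diff = np.abs(frame_hsv.astype(np.int32) - self._last_hsv.astype(np.int32))
            self._delta_hsv.append(float(np.mean(diff)))
        self._last_hsv = frame_hsv
        return []

    def post_process(self, frame_num):
        # Phase 2: scan the compiled stats and pick out the sharp peaks.
        w = self.window_width
        cuts = []
        for i in range(w, len(self._delta_hsv) - w):
            neighbors = self._delta_hsv[i - w:i] + self._delta_hsv[i + 1:i + w + 1]
            avg = sum(neighbors) / len(neighbors)
            if (avg > 0 and self._delta_hsv[i] / avg >= self.metathreshold
                    and self._delta_hsv[i] >= self.min_delta_hsv):
                cuts.append(i)
        return cuts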

Without actually trying to implement this, I see one thing that would need to be considered: the stats for the whole video would need to be accessible to the post_process function. This probably means saving them as a class variable in the detector class (potentially a memory issue for very long videos). Alternatively, perhaps a stats_file could be part of the __init__ constructor for this detector class, and the post_process function could then parse this file.

Something else worth thinking about (perhaps @Breakthrough has thoughts on this) is whether this is the desired approach overall for PySceneDetect. What I mean is: would this kind of post-processing be better performed as part of a SceneDetector object, or would it be better to just make a function that directly analyzes a stats file after it has been generated by an existing detector? I would lean towards the second option since, under the hood, the proposed method still uses HSV values just like content_detector, but with a different interpretation of what constitutes a detected scene.

scarwire (Contributor, Author) commented Feb 26, 2020

Yeah, that's basically my idea. The only problem is that the current post_process function takes a single frame as an argument, and if it detects a cut, that cut is added to the existing list. I don't want to mess with that function, since it's used by the threshold and motion detectors. The new post-processing function would instead return a complete new cut list.

I've forked the repository and created a branch with a (still somewhat messy) implementation of this if anyone is interested.

wjs018 (Collaborator) commented Feb 26, 2020

Thanks for sharing the code! I think this is a good first pass at the issue. I will take a closer look soon to better integrate this into the rest of PySceneDetect (docs, cli, etc.). My first thought is that it would be possible to use the existing post_process function, but require the video_manager to be specified in the __init__ function when the detector is made. The post_process function could then refer to the video manager that was specified when the SmartContentDetector was initialized.

One other thing that should be called out in the final version of this is that the stats_manager is required for this detector when the SceneManager is constructed. Currently, it is listed as an optional argument, but this kind of post-processing would need it.

I will take a look at these when I get a chance.

wjs018 (Collaborator) commented Mar 2, 2020

@scarwire I worked some more on your branch and renamed the detector to AdaptiveContentDetector, since its sensitivity adapts to the frames surrounding the current one. Feel free to check it out here and let me know how it performs on your example cases, to make sure functionality is retained. I was testing it on some of my videos, but just want to make sure it works for more than just me.

A summary of the changes I made to your implementation to try to generalize it (a hypothetical usage sketch follows the list):

  • I made the video_manager a required argument of the AdaptiveContentDetector constructor
  • I brought all the analysis and cut detection logic into the post_process function
  • I implemented a check in the cli interface so that an error is thrown if a stats manager isn't used
  • I made sure that the scene detection respects the min_scene_len parameter
  • I added the min_delta_hsv parameter, which was previously hard coded as 5.0
  • I also added the window_width parameter, which specifies the number of frames before and after the current frame over which to average. This was previously hard coded as 2
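
Put together, using it from the Python API would look roughly like this (the constructor signature may still change; the parameter names are the ones listed above, and the import path on the branch may differ):

from scenedetect.video_manager import VideoManager
from scenedetect.scene_manager import SceneManager
from scenedetect.stats_manager import StatsManager
from scenedetect.detectors import AdaptiveContentDetector  # path on the branch may differ

video_manager = VideoManager(['testvideo.mp4'])
stats_manager = StatsManager()  # required: post_process reads the compiled stats
scene_manager = SceneManager(stats_manager)

scene_manager.add_detector(AdaptiveContentDetector(
    video_manager=video_manager,  # specified at construction, per the list above
    min_delta_hsv=5.0,            # ignore frames with less absolute change than this
    window_width=2,               # frames before/after the current one to average over
    min_scene_len=15))            # minimum scene length, in frames

video_manager.start()
scene_manager.detect_scenes(frame_source=video_manager)
scene_list = scene_manager.get_scene_list(video_manager.get_base_timecode())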

Breakthrough (Owner) commented Jun 29, 2020

Hi @scarwire;

Would you be able to provide updated links to the video files, or attach them to this issue?

I'd like to start expanding the test cases PySceneDetect has with more real-world media such as this to help future development, and prevent regressions when such features are implemented.

This is definitely something I would like to consider adding in a future version of PySceneDetect (post-v0.6 most likely), as automatic threshold determination is a highly sought-after feature. It sounds like this approach might drive some underlying API changes, which I am more than happy to support after the upcoming v0.6 release (the primary goals of which are suppression of camera flashes and quality-of-life improvements).

Thank you (and @wjs018!!) for your research, methods, and implementation, much appreciated. Looking forward to getting this cleaned up and merged!

scarwire (Contributor, Author) commented Jul 8, 2020

Hi there, and sorry for taking so long to get back to you.

Here are updated links for the two examples I used for testing. The new detector gets more accurate results with them, but it's still not perfect.

Thanks so much for taking up this idea and also thanks again to @wjs018 for refactoring and cleaning up my code!

enginefeeder101 commented
Hi there,

This is exactly what I needed. I had to index 30-year-old home videos, and this works like a charm. I updated @wjs018's fork against @Breakthrough's current master; it merged without conflicts, and the results are pretty good. Thank you!

Breakthrough (Owner) commented
Awesome - would you be willing to create a pull request with the detector, @wjs018? I'm not sure of the state of the code, but it looks like a very interesting approach!

wjs018 (Collaborator) commented Jan 12, 2021

Brought my branch up to date with master and made sure nothing broke. As mentioned in #198, I haven't really tested this other than to make sure that things ran smoothly and integrated nicely with other parts of the software (such as cli).

Breakthrough (Owner) commented Jan 16, 2021

Awesome, thanks so much for the PR - I'll try to get that merged as soon as possible. I want to release v0.5.5 before merging it in, however, as this will also drive some changes to v0.6 as you mentioned (re: post_process). I also want to try to reduce code duplication between this and detect-content somehow, but I can figure that out after it's merged.

I wanted to ask you all for your thoughts on:

  1. In terms of the command line, should this be a new command, or should it just be additional arguments to detect-content? I was thinking of adding it as an option that enables the post-processing stage. Another option would be to add this as a separate command that runs after detect-content to perform the post-processing.
    I ask this mostly because I was planning on adding flash suppression to detect-content (which would be compatible with both detectors), and it looks like most of the work for this detector happens in post_process. Thus I thought it perhaps makes sense to conditionally enable that behavior rather than using a separate command (activating it with a flag, which would also reset the threshold default).

  2. Just curious, why is min_delta_hsv so low as a default? Sorry, I haven't tested this on any material yet (planning on doing so as soon as possible), so it may not be an issue. At a glance, though, it seemed rather low compared to the current default, and some of the source material is greyscale (leading to lower-than-expected delta HSV values due to no change in hue/saturation).

  3. As @wjs018 wrote above:

What I mean is: would this kind of post-processing be better performed as part of a SceneDetector object, or would it be better to just make a function that directly analyzes a stats file after it has been generated by an existing detector? I would lean towards the second option since, under the hood, the proposed method still uses HSV values just like content_detector, but with a different interpretation of what constitutes a detected scene.

What are your thoughts on the best way to proceed with post_process()? I agree the current format isn't that useful - does it make sense to pass the existing cutting list to it, or nothing at all? If the API were modified to suit these needs, then I think the existing implementation could be merged with ContentDetector. In my head the post_process function should still go with the detector itself, but if you think it should be a separate object entirely, I'm open to that as well (e.g. some kind of filter object). In that case, though, I would probably prefer to add it as a separate command that gets run after detect-content.


Thank you so much for this everyone (esp. @wjs018 and @scarwire), looking forward to getting it merged in.

@Breakthrough Breakthrough changed the title Camera movement produces false positives in ContentDetector Reduce false positives in ContentDetector due to camera movement Jan 16, 2021
@Breakthrough Breakthrough added this to the v0.5.6 milestone Jan 17, 2021
@Breakthrough Breakthrough linked a pull request Jan 22, 2021 that will close this issue
Breakthrough (Owner) commented
TODO: Regarding the CLI, I think the new detector is different enough in functionality that it should be a separate command. I'm thinking detect-adaptive, to follow convention. After further review of the new detector, most of my questions above have been resolved, although I am definitely still looking for any feedback on the third point / any potential API implications this might have for the v0.6 refactor.

This has now been merged into the v0.5.6 branch, and will definitely be in the next release of PySceneDetect :)

Breakthrough (Owner) commented Feb 21, 2021
Breakthrough commented Feb 21, 2021

@scarwire could you let me know some more details about the video, or is there an IMDB page I could reference? I'd like to add the clips you shared as a test case, but need to work out the copyright details. Thanks!

Also, the defaults seem to be a bit too sensitive for colour video. I realize now that an option should be provided to use ONLY the change in luminance/intensity, for both ContentDetector and AdaptiveDetector, as the change in hue/saturation will always be negligible/zero for greyscale material (thus those frame scores are always 1/3 of what they would be for colour material).

My initial thought is to multiply your default threshold by 3 to compensate for this, and to allow an additional flag indicating whether the source material is greyscale (thus normalizing the approximate threshold difference between greyscale and colour source material). Does this sound reasonable?
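
To make the factor of 3 concrete with made-up numbers (assuming the frame score is the average of the per-channel H, S, and V deltas):

# Hypothetical scores for the same "size" of cut:
delta_h, delta_s, delta_v = 0.0, 0.0, 12.0        # greyscale: hue/sat never change
greyscale_score = (delta_h + delta_s + delta_v) / 3  # = 4.0

delta_h, delta_s, delta_v = 12.0, 12.0, 12.0      # comparable colour material
colour_score = (delta_h + delta_s + delta_v) / 3     # = 12.0, i.e. 3x higher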

Breakthrough added a commit that referenced this issue Feb 21, 2021
Refactor metric key implementation accordingly. #153
Breakthrough added a commit that referenced this issue Feb 21, 2021
Rename AdaptiveContentDetector to AdaptiveDetector. (#153)
Breakthrough (Owner) commented Mar 8, 2021
Breakthrough commented Mar 8, 2021

Woo, thanks again everyone! There are now some test cases in the repo for the detector, as well as better integration overall. This detector will definitely influence the API direction for v0.6, as there are a few circular dependencies I want to see if I can sort out better (specifically, I think the detection algorithms need some kind of factory to disallow construction without a stats manager). I've also removed the requirement for a statsfile by making the SceneManager implicitly create a StatsManager for any detector that requires one, even if the user doesn't provide one (it will simply be discarded when the program closes).

You can also use detect-adaptive with a statsfile created by using detect-content and still gain a speedup. Additionally, changing the window size or toggling luma-only mode will generate a new column in the statsfile.

Lastly, I've multiplied the default threshold by 3 and added a new -l flag to both the content and adaptive detectors, which only looks at luma information rather than all of HSL/HSV as was previously done. When using greyscale material, specify the -l flag so that the threshold value will be equivalent to the same material in RGB. I tested it on my end and got equivalent results to the original threshold when specifying this flag.
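
For example (file names are placeholders):

# Reuse the stats file written by an earlier detect-content run:
scenedetect -i my_video.mp4 -s my_video.stats.csv detect-adaptive list-scenes

# Greyscale material: compare luma only instead of all of HSV:
scenedetect -i old_film.mp4 detect-adaptive -l list-scenes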

I'm not sure if there's a reliable automatic way to detect greyscale inputs, so will leave it as an explicit flag for now. If anyone sees value in it, feel free to file a feature request.

This will now be included in the next release, PySceneDetect v0.5.6 :)
