Keyframe Selection: Rework and add new selection methods #1343

cbentejac · 2023-01-25T10:56:44Z

Description

This PR is a replacement for #1213 and aims at completely reworking the Keyframe Selection utility.

It removes the existing selection methods and replaces them with two selection methods:

A "regular" selection method, which samples frames regularly over time with respect to some user-provided constraints (namely the maximum number of wanted keyframes, the minimum number of frames between two selected keyframes, and the maximum number of frames between two keyframes).
This was already possible with the existing Keyframe Selection utility, but has been simplified to be more user-friendly.
A "smart" selection method, which computes scores for each frame in the input sequence(s) and uses them to perform the final selection. Scores are based on the sharpness, motion and temporal position of the evaluated frame.
The goal is to select a frame that is as sharp as possible with significant motion compared to the previously selected frame: consecutive frames should not be selected as keyframes if they do not contain enough motion, even if they are both very sharp. The level of motion that is deemed significant can be adjusted with parameters, as well as the minimum and maximum number of expected keyframes.
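
For illustration, the "regular" method could be sketched as below. This is a minimal sketch, not the code from this PR: `selectRegular` and its parameter names are hypothetical, and the way the three constraints interact (in particular, the maximum step taking priority over covering the whole sequence) is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not the PR's API.
// maxKeyframes == 0 means "no limit"; maxStep == 0 means "no upper bound on the step".
std::vector<std::size_t> selectRegular(std::size_t frameCount,
                                       std::size_t minStep,
                                       std::size_t maxStep,
                                       std::size_t maxKeyframes)
{
    assert(minStep >= 1);
    assert(maxStep == 0 || maxStep >= minStep);

    std::size_t step = minStep;
    if (maxKeyframes > 0)
    {
        // Smallest step that spreads maxKeyframes samples over the whole sequence
        // (ceiling division).
        const std::size_t needed = (frameCount + maxKeyframes - 1) / maxKeyframes;
        step = std::max(step, needed);
    }
    if (maxStep > 0)
    {
        // Assumption: the maximum step wins over covering the whole sequence.
        step = std::min(step, maxStep);
    }

    std::vector<std::size_t> selected;
    for (std::size_t i = 0; i < frameCount; i += step)
    {
        if (maxKeyframes > 0 && selected.size() == maxKeyframes)
            break;
        selected.push_back(i);
    }
    return selected;
}
```

With 100 frames, a minimum step of 1 and at most 4 keyframes, this sketch picks frames 0, 25, 50 and 75; adding a maximum step of 10 yields 0, 10, 20 and 30 instead.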

Additionally, debug options have been added:

The scores computed for each frame can be exported as a CSV file;
The motion vectors of each frame can be exported as .png images to be visualised, with the color indicating the motion direction and the intensity indicating its magnitude.

The name of output frames can also be chosen between two options:

The frames are named using their index number within the sequence like [00012, 00253, 00498] (default);
The frames are named with consecutive frame numbers like [00000, 00001, 00002].
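
A minimal sketch of the two naming schemes, assuming five-digit zero padding as in the examples above. `keyframeNames` and its parameters are hypothetical names chosen for the example, not the PR's API.

```cpp
#include <cassert>
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

// Builds the output file names for the selected frames.
// renameConsecutively == false: use the index within the input sequence (default behaviour).
// renameConsecutively == true: use the rank within the selection (0, 1, 2, ...).
std::vector<std::string> keyframeNames(const std::vector<std::size_t>& selectedIndices,
                                       bool renameConsecutively,
                                       const std::string& extension)
{
    assert(!extension.empty());

    std::vector<std::string> names;
    names.reserve(selectedIndices.size());
    for (std::size_t rank = 0; rank < selectedIndices.size(); ++rank)
    {
        const std::size_t number = renameConsecutively ? rank : selectedIndices[rank];
        std::ostringstream name;
        // std::setw only applies to the next insertion, i.e. the frame number.
        name << std::setw(5) << std::setfill('0') << number << '.' << extension;
        names.push_back(name.str());
    }
    return names;
}
```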

Videos, sequences of images and camera rigs are supported as inputs.

The corresponding Meshroom pull request can be found here: alicevision/Meshroom#1880

Features list

Remove the existing selection methods;
Add a simple selection method that samples frames regularly;
Add a smart selection method that evaluates the sharpness and optical flow of each frame to select only the most relevant;
Add debug options to write scores to a CSV file and visualise the optical flow frames;
Add an option to choose how the output frames are named.

Implementation remarks

The sharpness of every frame is evaluated with a sliding window instead of as a whole. The goal is to ensure that an image will not be thrown away if it contains a sharp object while the rest of the image is blurry.
N.B.: The most obvious performance improvement would concern the sliding window: it currently slides pixel by pixel across the whole image, which is very time-consuming but does not necessarily bring in more information than sliding with a larger step would.
The optical flow is computed for cells within the image, giving for each of these cells an average pixel displacement within that cell. The median flow value of all the cells is then selected as the motion score.
Once the sharpness and optical flow scores have been computed, the accumulated motion across frames is computed, and using a threshold, subsequences are identified. Within each subsequence, a keyframe is selected based on both its sharpness and its position in the subsequence (frames located at the middle have an advantage over frames located on the subsequence's borders). The goal is to select frames that show a significant difference in their motion, are not too close to each other in time, and are as sharp as possible.
By default, scores are computed on downscaled frames. The downscale levels for the sharpness and motion computations may differ depending on the set parameters.
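
To illustrate the sliding-window remark above, here is a minimal sketch with a configurable stride. The sharpness measure (variance of a 4-neighbour Laplacian), the `GrayImage` type and all function names are assumptions made for the example; the description does not specify them. A stride of 1 corresponds to the pixel-by-pixel behaviour described above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical row-major grayscale image, used only for this sketch.
struct GrayImage
{
    std::size_t width;
    std::size_t height;
    std::vector<float> pixels;

    float at(std::size_t x, std::size_t y) const { return pixels[y * width + x]; }
};

// 4-neighbour Laplacian response; border pixels are left at 0.
std::vector<float> laplacian(const GrayImage& img)
{
    std::vector<float> out(img.width * img.height, 0.f);
    for (std::size_t y = 1; y + 1 < img.height; ++y)
    {
        for (std::size_t x = 1; x + 1 < img.width; ++x)
        {
            out[y * img.width + x] = img.at(x - 1, y) + img.at(x + 1, y) +
                                     img.at(x, y - 1) + img.at(x, y + 1) -
                                     4.f * img.at(x, y);
        }
    }
    return out;
}

// Largest Laplacian variance over all windowSize x windowSize windows,
// moving the window by `stride` pixels in both directions.
double sharpnessScore(const GrayImage& img, std::size_t windowSize, std::size_t stride)
{
    assert(windowSize >= 1 && stride >= 1);
    assert(windowSize <= img.width && windowSize <= img.height);

    const std::vector<float> response = laplacian(img);
    const double count = static_cast<double>(windowSize * windowSize);
    double best = 0.0;
    for (std::size_t y0 = 0; y0 + windowSize <= img.height; y0 += stride)
    {
        for (std::size_t x0 = 0; x0 + windowSize <= img.width; x0 += stride)
        {
            double sum = 0.0;
            double sumSquares = 0.0;
            for (std::size_t y = y0; y < y0 + windowSize; ++y)
            {
                for (std::size_t x = x0; x < x0 + windowSize; ++x)
                {
                    const double v = response[y * img.width + x];
                    sum += v;
                    sumSquares += v * v;
                }
            }
            const double mean = sum / count;
            // Keeping the maximum means one sharp region is enough to keep the frame.
            best = std::max(best, sumSquares / count - mean * mean);
        }
    }
    return best;
}
```

Because the windows visited with a stride of `s` starting at the origin are a subset of those visited with a stride of 1, the strided score can only be lower or equal, while the number of evaluated windows drops by roughly a factor of `s * s`.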
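
The motion score described above (median over the cells of the average displacement within each cell) could be sketched as follows. The dense flow field is taken as an input, since how it is computed is outside the scope of this sketch; `FlowField`, `motionScore` and the handling of partial border cells are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical dense flow field: one (dx, dy) displacement per pixel, row-major.
struct FlowField
{
    std::size_t width;
    std::size_t height;
    std::vector<float> dx;
    std::vector<float> dy;
};

// Median over all cells of the mean displacement magnitude within each cell.
double motionScore(const FlowField& flow, std::size_t cellSize)
{
    assert(cellSize >= 1 && cellSize <= flow.width && cellSize <= flow.height);

    std::vector<double> cellMeans;
    // Assumption: only full cells are used; partial cells at the right and
    // bottom borders are ignored.
    for (std::size_t y0 = 0; y0 + cellSize <= flow.height; y0 += cellSize)
    {
        for (std::size_t x0 = 0; x0 + cellSize <= flow.width; x0 += cellSize)
        {
            double sum = 0.0;
            for (std::size_t y = y0; y < y0 + cellSize; ++y)
            {
                for (std::size_t x = x0; x < x0 + cellSize; ++x)
                {
                    const std::size_t i = y * flow.width + x;
                    sum += std::hypot(flow.dx[i], flow.dy[i]);
                }
            }
            cellMeans.push_back(sum / static_cast<double>(cellSize * cellSize));
        }
    }

    // Upper median when the number of cells is even.
    const auto middle = cellMeans.begin() + static_cast<std::ptrdiff_t>(cellMeans.size() / 2);
    std::nth_element(cellMeans.begin(), middle, cellMeans.end());
    return *middle;
}
```

Using the median rather than the mean keeps a single fast-moving object from dominating the score when most of the image is static.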
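
The selection step described above (accumulate motion, cut subsequences at a threshold, then pick one frame per subsequence from its sharpness and position) could look like the sketch below. All names are hypothetical, and the triangular position weight is an assumed shape: the description only says that frames near the middle of a subsequence are favoured.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Returns the start index of every subsequence, followed by a final sentinel equal to
// the number of frames. A new subsequence starts once the accumulated motion reaches
// the threshold.
std::vector<std::size_t> splitByMotion(const std::vector<double>& motion, double threshold)
{
    assert(threshold > 0.0);
    std::vector<std::size_t> bounds{0};
    double accumulated = 0.0;
    for (std::size_t i = 0; i < motion.size(); ++i)
    {
        accumulated += motion[i];
        if (accumulated >= threshold && i + 1 < motion.size())
        {
            bounds.push_back(i + 1);
            accumulated = 0.0;
        }
    }
    bounds.push_back(motion.size());
    return bounds;
}

// Assumed weight: 1 at the centre of [begin, end), decreasing linearly towards the borders.
double positionWeight(std::size_t i, std::size_t begin, std::size_t end)
{
    const double centre = 0.5 * static_cast<double>(begin + end - 1);
    const double halfSpan = 0.5 * static_cast<double>(end - begin);  // > 0 since end > begin
    return 1.0 - 0.5 * std::abs(static_cast<double>(i) - centre) / halfSpan;
}

// One keyframe per subsequence: the frame with the best weighted sharpness.
// Sharpness scores are assumed to be non-negative.
std::vector<std::size_t> selectKeyframes(const std::vector<double>& sharpness,
                                         const std::vector<double>& motion,
                                         double threshold)
{
    assert(sharpness.size() == motion.size());
    const std::vector<std::size_t> bounds = splitByMotion(motion, threshold);

    std::vector<std::size_t> keyframes;
    for (std::size_t s = 0; s + 1 < bounds.size(); ++s)
    {
        const std::size_t begin = bounds[s];
        const std::size_t end = bounds[s + 1];
        if (begin == end)
            continue;  // only happens for an empty input

        std::size_t best = begin;
        double bestScore = -1.0;
        for (std::size_t i = begin; i < end; ++i)
        {
            const double score = sharpness[i] * positionWeight(i, begin, end);
            if (score > bestScore)
            {
                bestScore = score;
                best = i;
            }
        }
        keyframes.push_back(best);
    }
    return keyframes;
}
```

With six frames of unit motion and a threshold of 3, the sketch produces two subsequences of three frames each. If all sharpness scores are equal, the middle frames (1 and 4) are selected; a much sharper frame near a border (for instance a sharpness of 5 at index 2) overrides the position weight.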

Known limitations

The "Orientation" EXIF tag is not properly propagated for PNG outputs, as PNG metadata differs from that of JPG/EXR files.
The "PixelAspectRatio" information is not propagated to JPG outputs: for unknown reasons, writing the PixelAspectRatio for JPG the same way as for PNG/EXR outputs produces an incorrect value when it is greater than 1 (writing a PixelAspectRatio of 2 leads to a JPG image with a PixelAspectRatio of 0.5).

mugulmd · 2023-01-26T16:44:47Z

Would be great to be able to export in other formats (jpg, png, etc.)

mugulmd · 2023-01-26T16:46:47Z

There seems to be a problem when writing the selected frames: images are brighter than they should be (maybe a gamma is applied?)

simogasp · 2023-01-30T10:13:33Z

It would also be nice to add a README.md in keyframe, like for the other modules, to have some documentation of the methods implemented, references if any, etc.

cbentejac · 2023-02-01T10:07:33Z

> Would be great to be able to export in other formats (jpg, png, etc.)

Added! Currently supported formats are exr, jpg and png.

> There seems to be a problem when writing the selected frames: images are brighter than they should be (maybe a gamma is applied?)

There was indeed an issue with the colorspace for input videos, it is now fixed!

cbentejac · 2023-02-01T17:54:06Z

> It would also be nice to add a README.md in keyframe, like for the other modules, to have some documentation of the methods implemented, references if any, etc.

All done!

mugulmd · 2023-02-07T08:40:23Z

An option that could be interesting for users would be to automatically rename the exported frames with their frame number within the output sequence.

Example:

I want to do a keyframe selection on a sequence with 1000 frames
the automatic process selects frames 15, 294, 600 and 825
by default they will be saved as 0015.exr, 0294.exr, 0600.exr, 0825.exr
with this option they would be respectively named 0000.exr, 0001.exr, 0002.exr, 0003.exr.

I don't know if it makes sense to add that feature here or if we should just let users take care of the renaming with some other software, but in my case I know I would definitely use it 😅

cbentejac · 2023-02-07T10:43:23Z

> An option that could be interesting for users would be to automatically rename the exported frames with their frame number within the output sequence.
>
> Example:
>
> I want to do a keyframe selection on a sequence with 1000 frames
> the automatic process selects frames 15, 294, 600 and 825
> by default they will be saved as 0015.exr, 0294.exr, 0600.exr, 0825.exr
> with this option they would be respectively named 0000.exr, 0001.exr, 0002.exr, 0003.exr.
>
> I don't know if it makes sense to add that feature here or if we should just let users take care of the renaming with some other software, but in my case I know I would definitely use it 😅

This was very easy to add and not costly at all, so I added an option to do what you described (renameKeyframes, which is disabled by default). Thanks for the suggestion!

mugulmd · 2023-02-07T16:43:32Z

Might be a good idea to add some info logs (using ALICEVISION_LOG_INFO) to keep track of the process:
I just tested it on a 2600-frame sequence and it took about 40 minutes, which without any logging can seem quite suspicious to users (as they might want at least a rough idea of how fast the process is going and how long it is going to last).

cbentejac · 2023-02-08T17:03:00Z

> Might be a good idea to add some info logs (using ALICEVISION_LOG_INFO) to keep track of the process: I just tested it on a 2600-frame sequence and it took about 40 minutes, which without any logging can seem quite suspicious to users (as they might want at least a rough idea of how fast the process is going and how long it is going to last).

I have promoted some "DEBUG"-level logs to "INFO" for the smart selection, so we now have some info about the progress of the score computation with the default verbose setting.

Keyframes may be written as JPG, PNG or EXR files. If the EXR format is selected, the storage data type can be specified as well.

…scores

The rescaled frames used to compute the sharpness and motion scores used to be the same, with a single parameter to specify the rescale value. As we may want to use different rescale values depending on whether we are computing the sharpness or the motion score (or no rescale for one but a rescale for the other), the existing "rescaledWidth" parameter is split into two new parameters, "rescaledWidthSharpness" and "rescaledWidthFlow".

By default, the selected keyframes are written with their index within the input sequence / video as their name. If frames at index 15, 294 and 825 are selected as keyframes, they will be written as 00015.exr, 00294.exr and 00825.exr. This commit adds an option to name them as consecutive frames instead: with the option enabled, frames at index 15, 294 and 825 are written as 00000.exr, 00001.exr and 00002.exr.

If a frame is missing in a video sequence, instead of throwing an exception straight away, try reading the next frame. If the next frame is valid, then push dummy scores for the missing frame, and keep processing the input video. Otherwise, do throw the exception and stop the process. The dummy scores will be ignored in the final keyframe selection (explicitly in the case of the motion accumulation computation, implicitly when applying the weights during the sharpness selection).
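
The look-ahead logic in this commit message could be sketched as below. This is only an illustration under stated assumptions: `FrameScores`, `computeScores` and the `tryScore` callback are hypothetical names, the dummy value of -1 and the `valid` flag are choices made for the example, and a missing last frame is treated as an error because there is no next frame to try.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

struct FrameScores
{
    double sharpness;
    double motion;
    bool valid;  // false for the dummy scores pushed for a missing frame
};

// tryScore(i, out) is a hypothetical callback: it fills `out` and returns true when
// frame i can be read, and returns false when the frame is missing.
std::vector<FrameScores> computeScores(
    std::size_t frameCount,
    const std::function<bool(std::size_t, FrameScores&)>& tryScore)
{
    assert(tryScore);

    std::vector<FrameScores> scores;
    for (std::size_t i = 0; i < frameCount; ++i)
    {
        FrameScores current{};
        if (tryScore(i, current))
        {
            current.valid = true;
            scores.push_back(current);
            continue;
        }

        // Frame i is missing: look one frame ahead before giving up.
        FrameScores next{};
        if (i + 1 >= frameCount || !tryScore(i + 1, next))
            throw std::runtime_error("Could not read frame " + std::to_string(i));

        scores.push_back(FrameScores{-1.0, -1.0, false});  // dummy scores for frame i
        next.valid = true;
        scores.push_back(next);
        ++i;  // frame i + 1 has already been processed
    }
    return scores;
}
```

Keeping one entry per input frame, dummy or not, preserves the mapping between score index and frame index; the selection step then only has to skip entries whose `valid` flag is false.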

"getSupportedExtensions" used to get the content of OIIO's "extension_list" and parse it. OIIO now provides a utility function that parses "extension_list" and returns the result as a map. Using it directly simplifies the function's body. The documentation is also updated with more details.

…format

"isSupported()" relies exclusively on the content of OpenImageIO's "extension_list", which contains all the formats supported as inputs, including video formats, with no distinction. A function "isVideoExtension" is added to determine whether the input extension is part of the OIIO-supported movie formats. The list of supported formats is hard-coded based on OIIO's documentation.

For ImageFeed: use image::isSupported and image::isVideoExtension to check that the input is supported by OpenImageIO but is not a video. For VideoFeed: add a VideoFeed::isSupported method to ensure that we do not try to open unsupported videos with OpenCV. The list of supported video formats is provided by OpenImageIO, which, like OpenCV, relies on FFmpeg. Both ImageFeed and VideoFeed have their own implementation of isSupported so that each can check on its own whether it supports a given input.
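
The split between the two feeds could be pictured with the sketch below. It is illustrative only: the extension lists are short placeholders rather than the lists used in this PR or published by OpenImageIO, the free functions `isSupported` and `isVideoExtension` are simplified stand-ins for the ones named above, and `imageFeedAccepts` / `videoFeedAccepts` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <set>
#include <string>

// Lower-cases an extension; the leading dot is expected (".mp4", not "mp4").
std::string normalized(std::string extension)
{
    assert(extension.empty() || extension.front() == '.');
    std::transform(extension.begin(), extension.end(), extension.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return extension;
}

// Stand-in for the hard-coded list of movie formats (placeholder content).
bool isVideoExtension(const std::string& extension)
{
    static const std::set<std::string> videoExtensions{".avi", ".mov", ".mp4", ".mkv"};
    return videoExtensions.count(normalized(extension)) > 0;
}

// Stand-in for the "extension_list" check: images and videos alike (placeholder content).
bool isSupported(const std::string& extension)
{
    static const std::set<std::string> allExtensions{".jpg", ".png", ".exr",
                                                     ".avi", ".mov", ".mp4", ".mkv"};
    return allExtensions.count(normalized(extension)) > 0;
}

// An image feed accepts anything supported that is not a video.
bool imageFeedAccepts(const std::string& extension)
{
    return isSupported(extension) && !isVideoExtension(extension);
}

// A video feed accepts only supported video formats.
bool videoFeedAccepts(const std::string& extension)
{
    return isSupported(extension) && isVideoExtension(extension);
}
```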

Propagate the pixel aspect ratio for EXR and PNG outputs. For JPG outputs, writing the pixel aspect ratio in that way leads to errors, as any aspect ratio > 1 is written as < 1.

JPG images are always supposed to be in sRGB. When this information is available, it is better to include it in the "fromColorSpace" when writing the output to ensure the colorspace is appropriate.

Add a "skipSharpnessComputation" debug option that allows computing the scores without performing the sharpness score computations. All frames will be assigned a fixed sharpness score of 1.0. If the smart selection is applied with this option enabled, the selected frames will be those located in the center of each subset (determined with the motion scores), as the weights will be applied on constant scores. This option is useful to determine the impact of the sharpness score computation on the global processing time.

changes done

cbentejac mentioned this pull request Jan 25, 2023

[nodes] KeyframeSelection: Rework the node and add parameters for new selection methods alicevision/Meshroom#1880

Merged

4 tasks

cbentejac self-assigned this Jan 25, 2023

cbentejac force-pushed the dev/cleanKeyframeSelection branch from 52da6df to e9567f9 Compare January 25, 2023 17:17

cbentejac requested review from fabiencastan and mugulmd January 25, 2023 17:18

cbentejac force-pushed the dev/cleanKeyframeSelection branch from e9567f9 to 2098f0a Compare January 25, 2023 17:30

cbentejac mentioned this pull request Jan 25, 2023

[keyframeselection] Algorithms are rewritten #1213

Closed