Keyframe Selection: Rework and add new selection methods #1343

Merged
merged 28 commits into from
Mar 14, 2023

Conversation

cbentejac
Contributor

@cbentejac cbentejac commented Jan 25, 2023

Description

This PR is a replacement for #1213 and aims at completely reworking the Keyframe Selection utility.

It removes the existing selection methods and replaces them with two selection methods:

  • A "regular" selection method, which samples frames regularly over time with respect to some user-provided constraints (namely the maximum number of wanted keyframes, the minimum number of frames between two selected keyframes, and the maximum number of frames between two keyframes).
    This was already possible with the existing Keyframe Selection utility, but has been simplified to be more user-friendly.
  • A "smart" selection method, which computes scores for each frame in the input sequence(s) and uses them to perform the final selection. Scores are based on the sharpness, motion and temporal position of the evaluated frame.
    The goal is to select a frame that is as sharp as possible with significant motion compared to the previously selected frame: consecutive frames should not be selected as keyframes if they do not contain enough motion, even if they are both very sharp. The level of motion that is deemed significant can be adjusted with parameters, as well as the minimum and maximum number of expected keyframes.
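As a rough illustration of the "regular" method, here is a minimal sketch in Python (not the actual C++ implementation). The function name, the parameter names, the "0 means unset" convention and the way the three constraints interact are all assumptions made for this example; the PR description only states which constraints exist.

```python
def regular_selection(n_frames, min_frame_step=12, max_frame_step=0, max_nb_out_frames=0):
    """Sample frame indices at a regular step (hypothetical sketch).

    A value of 0 for max_frame_step or max_nb_out_frames means "unset"
    (an assumption of this example, not a documented convention).
    """
    if min_frame_step < 1:
        raise ValueError("min_frame_step must be at least 1")

    step = min_frame_step
    if max_nb_out_frames > 0:
        # Widen the step so that the whole sequence is covered with at most
        # max_nb_out_frames keyframes (ceil division).
        needed_step = -(-n_frames // max_nb_out_frames)
        step = max(step, needed_step)
        if max_frame_step > 0:
            # Never let two keyframes be further apart than max_frame_step.
            step = min(step, max_frame_step)

    selected = list(range(0, n_frames, step))
    if max_nb_out_frames > 0:
        # The max step may have produced too many frames: keep the first ones.
        selected = selected[:max_nb_out_frames]
    return selected


print(regular_selection(100, min_frame_step=10, max_nb_out_frames=5))
```

With these assumed rules, the maximum step takes priority over covering the whole sequence, so the tail of the sequence may be left without keyframes when both limits are set.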

Additionally, debug options have been added:

  • The scores computed for each frame can be exported as a CSV file;
  • The motion vectors of each frame can be exported as .png images to be visualised, with the color indicating the motion direction and the intensity indicating its magnitude.
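For readers unfamiliar with this kind of visualisation, the sketch below shows one common way to map a motion vector to a colour: direction goes to the hue and magnitude to the brightness of an HSV colour. This is an assumed convention chosen for illustration; the exact mapping used by the utility is not described in this thread, and `flow_to_rgb` and `max_magnitude` are hypothetical names.

```python
import colorsys
import math


def flow_to_rgb(dx, dy, max_magnitude):
    """Map one motion vector to an 8-bit RGB colour (illustrative convention).

    Hue encodes the direction of (dx, dy); value (brightness) encodes the
    magnitude, normalised by max_magnitude and clamped to 1.
    """
    angle = math.atan2(dy, dx)                      # in [-pi, pi]
    hue = ((angle + math.pi) / (2 * math.pi)) % 1.0  # in [0, 1)
    if max_magnitude > 0:
        value = min(math.hypot(dx, dy) / max_magnitude, 1.0)
    else:
        value = 0.0
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, value)
    return (round(r * 255), round(g * 255), round(b * 255))


# A static pixel is black, a fast-moving one is fully bright.
print(flow_to_rgb(0, 0, 10), flow_to_rgb(3, 4, 5))
```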

The name of output frames can also be chosen between two options:

  • The frames are named using their index number within the sequence like [00012, 00253, 00498] (default);
  • The frames are named with consecutive frame numbers like [00000, 00001, 00002].
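The two naming schemes can be sketched as follows. This is an illustrative Python snippet, not the actual implementation: `keyframe_names`, `extension` and `digits` are hypothetical, and `rename_keyframes` simply mirrors the `renameKeyframes` option mentioned later in this thread.

```python
def keyframe_names(selected_indices, rename_keyframes=False, extension="exr", digits=5):
    """Build output file names for the selected keyframes (hypothetical helper).

    By default each keyframe keeps its index within the input sequence;
    with rename_keyframes=True the keyframes are numbered consecutively.
    """
    names = []
    for position, frame_index in enumerate(selected_indices):
        number = position if rename_keyframes else frame_index
        names.append(f"{number:0{digits}d}.{extension}")
    return names


print(keyframe_names([12, 253, 498]))
print(keyframe_names([12, 253, 498], rename_keyframes=True))
```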

Videos, sequences of images and camera rigs are supported as inputs.

The corresponding Meshroom pull request can be found here: alicevision/Meshroom#1880

Features list

  • Remove the existing selection methods;
  • Add a simple selection method that samples frames regularly;
  • Add a smart selection method that evaluates the sharpness and optical flow of each frame to select only the most relevant ones;
  • Add debug options that allow writing the scores to a CSV file and visualising the optical flow frames;
  • Add an option that allows choosing how the output frames will be named.

Implementation remarks

  • The sharpness of every frame is evaluated with a sliding window instead of as a whole. The goal is to ensure that an image will not be thrown away if it contains a sharp object while the rest of the image is blurry.
    N.B.: The most obvious performance improvement would concern the sliding window: it currently slides pixel by pixel across the whole image, which is very time-consuming but does not necessarily bring in more information than sliding with a larger step would.
  • The optical flow is computed for cells within the image, giving for each of these cells an average pixel displacement within that cell. The median flow value of all the cells is then selected as the motion score.
  • Once the sharpness and optical flow scores have been computed, the accumulated motion across frames is computed, and using a threshold, subsequences are identified. Within each subsequence, a keyframe is selected based on both its sharpness and its position in the subsequence (frames located at the middle have an advantage over frames located on the subsequence's borders). The goal is to select frames that show a significant difference in their motion, are not too close to each other in time, and are as sharp as possible.
  • By default, scores are computed on downscaled frames. The downscale levels for the sharpness and motion computations may differ depending on the set parameters.
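The selection steps above can be sketched as follows in Python. This is a simplified illustration, not the actual C++ code: the function names, the rule for cutting subsequences (cut as soon as the accumulated motion reaches the threshold) and the triangular position weight are assumptions made for the example.

```python
from statistics import median


def motion_score(cell_displacements):
    """Motion score of one frame: median of the per-cell average displacements."""
    return median(cell_displacements)


def split_subsequences(motion_scores, threshold):
    """Cut the sequence each time the accumulated motion reaches the threshold.

    Returns half-open (start, end) index ranges. The cutting rule is assumed.
    """
    subsequences, start, accumulated = [], 0, 0.0
    for i, score in enumerate(motion_scores):
        accumulated += score
        if accumulated >= threshold:
            subsequences.append((start, i + 1))
            start, accumulated = i + 1, 0.0
    if start < len(motion_scores):
        subsequences.append((start, len(motion_scores)))
    return subsequences


def position_weight(offset, length):
    """Assumed triangular weight: 1.0 at the centre, 0.5 at the borders."""
    if length == 1:
        return 1.0
    center = (length - 1) / 2
    return 1.0 - 0.5 * abs(offset - center) / center


def select_keyframes(sharpness, motion_scores, threshold):
    """Pick, in each subsequence, the frame with the best weighted sharpness."""
    keyframes = []
    for start, end in split_subsequences(motion_scores, threshold):
        length = end - start
        best = max(range(start, end),
                   key=lambda i: sharpness[i] * position_weight(i - start, length))
        keyframes.append(best)
    return keyframes


# Six frames with constant motion: two subsequences of three frames each.
print(select_keyframes([9, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1], threshold=3))
```

In this sketch a very sharp frame on a border can still win over a blurrier central frame, while frames with equal sharpness resolve to the centre of their subsequence.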

Known limitations

  • The "Orientation" EXIF tag is not properly propagated for PNG outputs, as PNG metadata differs from JPG/EXR files.
  • The "PixelAspectRatio" information is not propagated to JPG outputs: for unknown reasons, writing the PixelAspectRatio in JPG the same way it is for PNG/EXR outputs leads to some mistakes when it is bigger than 1 (writing a PixelAspectRatio of 2 leads to a JPG image with a PixelAspectRatio of 0.5).

@mugulmd
Contributor

mugulmd commented Jan 26, 2023

Would be great to be able to export in other formats (jpg, png, etc.)

@mugulmd
Contributor

mugulmd commented Jan 26, 2023

There seems to be a problem when writing the selected frames: images are brighter than they should be (maybe a gamma is applied?)

@cbentejac cbentejac marked this pull request as draft January 27, 2023 13:21
@simogasp
Member

It would be also nice to add a README.md in keyframe like for the other modules to have some documentation of the methods implemented, references if any etc...

@cbentejac
Contributor Author

Would be great to be able to export in other formats (jpg, png, etc.)

Added! Currently supported formats are exr, jpg and png.

There seems to be a problem when writing the selected frames: images are brighter than they should be (maybe a gamma is applied?)

There was indeed an issue with the colorspace for input videos, it is now fixed!

@cbentejac
Contributor Author

It would be also nice to add a README.md in keyframe like for the other modules to have some documentation of the methods implemented, references if any etc...

All done!

@cbentejac cbentejac marked this pull request as ready for review February 1, 2023 17:54
@mugulmd
Contributor

mugulmd commented Feb 7, 2023

An option that could be interesting for users would be to automatically rename the exported frames with their frame number within the output sequence.

Example:

  • I want to do a keyframe selection on a sequence with 1000 frames
  • the automatic process selects frames 15, 294, 600 and 825
  • by default they will be saved as 0015.exr, 0294.exr, 0600.exr, 0825.exr
  • with this option they would be respectively named 0000.exr, 0001.exr, 0002.exr, 0003.exr.

I don't know if it makes sense to add that feature here or if we should just let users take care of the renaming with some other software, but in my case I know I would definitely use it 😅

@cbentejac
Contributor Author

An option that could be interesting for users would be to automatically rename the exported frames with their frame number within the output sequence.

Example:

  • I want to do a keyframe selection on a sequence with 1000 frames
  • the automatic process selects frames 15, 294, 600 and 825
  • by default they will be saved as 0015.exr, 0294.exr, 0600.exr, 0825.exr
  • with this option they would be respectively named 0000.exr, 0001.exr, 0002.exr, 0003.exr.

I don't know if it makes sense to add that feature here or if we should just let users take care of the renaming with some other software, but in my case I know I would definitely use it 😅

This was very easy to add and not costly at all, so I added an option to do what you described (renameKeyframes, which is disabled by default). Thanks for the suggestion!

@mugulmd
Contributor

mugulmd commented Feb 7, 2023

Might be a good idea to add some info logs (using ALICEVISION_LOG_INFO) to keep track of the process:
I just tested it on a 2600-frame sequence and it took about 40 minutes, which without any logging can seem quite suspicious to users (as they might want at least a rough idea of how fast the process is going and how long it is going to last).

@cbentejac
Contributor Author

Might be a good idea to add some info logs (using ALICEVISION_LOG_INFO) to keep track of the process: I just tested it on a 2600-frame sequence and it took about 40 minutes, which without any logging can seem quite suspicious to users (as they might want at least a rough idea of how fast the process is going and how long it is going to last).

I have promoted some "DEBUG"-level logs to "INFO" for the smart selection, so we now have some info about the progress of the score computation with the default verbose setting.

Keyframes may be written as JPG, PNG or EXR files. If the EXR format is
selected, the storage data type can be specified as well.
…scores

The rescaled frames used to compute the sharpness and motion scores used
to be the same, with a single parameter to specify the rescale value.
As we may want to use different rescale values depending on whether we are
computing the sharpness or the motion score (or no rescale for one but a
rescale for the other), the existing "rescaledWidth" parameter is split
into two new parameters, "rescaledWidthSharpness" and "rescaledWidthFlow".
mugulmd
mugulmd previously approved these changes Mar 9, 2023
By default, the selected keyframes are written with their index within
the input sequence / video as their name. If frames at index 15, 294 and
825 are selected as keyframes, they will be written as 00015.exr, 00294.exr
and 00825.exr.

This commit adds an option that allows naming them as consecutive frames
instead. Frames at index 15, 294 and 825 are now written as 00000.exr,
00001.exr and 00002.exr if the option is enabled.
If a frame is missing in a video sequence, instead of throwing an exception
straight away, try reading the next frame. If the next frame is valid, then
push dummy scores for the missing frame, and keep processing the input
video. Otherwise, do throw the exception and stop the process.

The dummy scores will be ignored in the final keyframe selection
(explicitly in the case of the motion accumulation computation, implicitly
when applying the weights during the sharpness selection).
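
The missing-frame handling described in this commit can be sketched as follows. This is an illustrative Python version under stated assumptions: unreadable frames are modelled as `None`, the dummy score is an arbitrary sentinel (`DUMMY_SCORE`, a hypothetical name and value), and a missing last frame is treated as an error because there is no next frame to fall back on.

```python
DUMMY_SCORE = -1.0  # hypothetical sentinel marking a missing frame


def compute_scores(frames, score_fn):
    """Compute one score per frame, tolerating isolated missing frames.

    `frames` is a list in which None stands for a frame that could not be
    read. A missing frame followed by a valid one gets a dummy score; a
    missing frame followed by another missing frame (or by nothing) raises.
    """
    scores = []
    for i, frame in enumerate(frames):
        if frame is None:
            next_is_valid = i + 1 < len(frames) and frames[i + 1] is not None
            if not next_is_valid:
                raise RuntimeError(f"Cannot read frame {i} nor the following one")
            scores.append(DUMMY_SCORE)
        else:
            scores.append(score_fn(frame))
    return scores


def accumulated_motion(motion_scores):
    """Sum the motion scores, explicitly skipping the dummy ones."""
    return sum(score for score in motion_scores if score != DUMMY_SCORE)


scores = compute_scores([1.0, None, 3.0], lambda frame: frame * 2)
print(scores, accumulated_motion(scores))
```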
"getSupportedExtensions" used to get the content of OIIO's "extension_list"
and parse it. OIIO now provides a utility function that parses
"extension_list" and returns it as a map.

Using it directly simplifies the function's body.

The documentation is also updated with more details.
…format

"isSupported()" relies exclusively on the content of OpenImageIO's
"extension_list", which contains all the formats supported as inputs,
including video formats, with no distinction.

A function "isVideoExtension" is added to determine whether the input
extension is part of the OIIO-supported movie formats. The list of
supported formats is hard-coded based on OIIO's documentation.
For ImageFeed: use image::isSupported and image::isVideoExtension to check
that the input is supported by OpenImageIO but is not a video.
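
The "supported by OpenImageIO but not a video" check can be illustrated with the short Python sketch below. It is not the actual C++ code: the function names are hypothetical, `supported_extensions` stands in for the content of OIIO's "extension_list", and the set of video extensions is only an illustrative subset, not the hard-coded list from this commit.

```python
# Illustrative subset only; the real hard-coded list is based on OIIO's documentation.
VIDEO_EXTENSIONS = {".avi", ".mov", ".mp4", ".m4v", ".mpg"}


def is_video_extension(extension):
    """Return True if the extension belongs to the (assumed) video formats."""
    return extension.lower() in VIDEO_EXTENSIONS


def is_supported_image(extension, supported_extensions):
    """An image input must be supported as an input and must not be a video."""
    ext = extension.lower()
    return ext in supported_extensions and not is_video_extension(ext)


# Stand-in for the extensions reported as supported inputs, videos included.
supported = {".jpg", ".exr", ".png", ".mp4"}
print(is_supported_image(".JPG", supported), is_supported_image(".mp4", supported))
```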

For VideoFeed: add a VideoFeed::isSupported method to ensure that we do
not try to open unsupported videos with OpenCV. The list of supported
video formats is provided by OpenImageIO, which is based on ffmpeg like
OpenCV.

Both ImageFeed and VideoFeed have their own implementation of isSupported
to ensure they can check whether they support a given input on their own.
Propagate the pixel aspect ratio for EXR and PNG outputs.
For JPG outputs, writing the pixel aspect ratio in that way leads to
errors, as any aspect ratio > 1 is written as < 1.
JPG images are always supposed to be in sRGB. When this information is
available, it is better to include it in the "fromColorSpace" when writing
the output to ensure the colorspace is appropriate.
Add a "skipSharpnessComputation" debug option that allows computing the
scores without performing the sharpness score computations. All frames will
be assigned a fixed sharpness score of 1.0.

If the smart selection is applied with this option enabled, the selected
frames will be those located in the center of each subset (determined with
the motion scores) as the weights will be applied on constant scores.

This option is useful to determine the impact of the sharpness score
computation on the global processing time.
mugulmd
mugulmd previously approved these changes Mar 10, 2023
@fabiencastan fabiencastan added this to the 3.0.0 milestone Mar 14, 2023
@fabiencastan fabiencastan merged commit 8eb3db8 into develop Mar 14, 2023
@fabiencastan fabiencastan deleted the dev/cleanKeyframeSelection branch March 14, 2023 16:26
4 participants