Skip to content

Conversation

@Sergio0694
Copy link
Member

@Sergio0694 Sergio0694 commented Nov 27, 2019

Prerequisites

  • I have written a descriptive pull-request title
  • I have verified that there are no overlapping pull-requests open
  • I have verified that I am following matches the existing coding patterns and practice as demonstrated in the repository. These follow strict Stylecop rules 👮.
  • I have provided test coverage for my change (where applicable)

Description

This PR includes a refactoring of the bokeh blur effect, with a complete removal of the two "low memory usage" and "maximum performance" options, and a single implementation that is both faster than the previous "maximum performance", and uses less memory than the previous "low memory usage" mode. Basically, the best of both worlds, and even better than that! 🚀

Here's a benchmark with the 3 different modes. The images are of size 720, 1080 and 2160.

ImageSize Method Mean Error StdDev Allocated
Small Low memory 356.8 ms 6.98 ms 10.87 ms 124.59 KB
Small Max speed 165.1 ms 1.43 ms 1.20 ms 115.46 KB
Small Refactor 165.2 ms 2.13 ms 1.89 ms 105.88 KB
Medium Low memory 1,355.9 ms 27.03 ms 45.16 ms 66921.5 KB
Medium Max speed 656.0 ms 3.00 ms 2.34 ms 129792.2 KB
Medium Refactor 667.0 ms 9.13 ms 7.62 ms 64982.3 KB
Large Low memory 5,029.0 ms 70.54 ms 65.99 ms 392994.26 KB
Large Max speed 2,765.6 ms 84.41 ms 70.49 ms 648345.34 KB
Large Refactor 2,684.7 ms 14.81 ms 12.36 ms 389135.74 KB

In particular:

  • The new implementation completely removes one buffer of ComplexVector4 values, so that's a reduction of sizeof(Vector4) 2 * imageHeight * imageWidth compared to the "maximum performance" mode, and a reduction of sizeof(Vector4) * 2 * kernelRadius * imageWidth compared to the previous "low memory usage" mode. This is achieved by modifying the vertical convolution step so that the partial components of the fourier transformation for the bokeh kernels are directly added to the target processing buffer, which makes is possible to only use a temporary buffer for the results of the first horizontal 1D convolution pass.
  • The speed improvements are achieved by the fact that the processing partials are being computed right after each vertical convolution pass, so an entire Parallel.For invocation on the entire image is not needed anymore. Additionally, this new implementation can always execute the code in parallel, unlike the previous "low memory usage" mode which was forced to execute the second 1D convolution sequentially, to be able to work on a shared buffer.

@JimBobSquarePants @antonfirsov I think you'll be pretty happy with this new PR 😊
Also, please make sure that everything checks out on your end (all the checks are passing for me, so this should be easy to do) and merge this as soon as possible, since it includes a breaking change in a couple of exposed APIs, so it'd be better to do this while the lib is still in beta.

@codecov
Copy link

codecov bot commented Nov 27, 2019

Codecov Report

Merging #1058 into master will increase coverage by 0.01%.
The diff coverage is 93.75%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1058      +/-   ##
==========================================
+ Coverage    89.9%   89.91%   +0.01%     
==========================================
  Files        1131     1131              
  Lines       50654    50605      -49     
  Branches     3717     3713       -4     
==========================================
- Hits        45539    45502      -37     
+ Misses       4357     4347      -10     
+ Partials      758      756       -2
Impacted Files Coverage Δ
src/ImageSharp/Memory/Buffer2D{T}.cs 100% <ø> (ø) ⬆️
...Sharp/Processing/Extensions/BokehBlurExtensions.cs 50% <ø> (+25%) ⬆️
...ssing/Processors/Convolution/BokehBlurProcessor.cs 83.33% <0%> (+5.55%) ⬆️
src/ImageSharp/Common/Helpers/Buffer2DUtils.cs 100% <100%> (ø) ⬆️
...ocessors/Convolution/BokehBlurProcessor{TPixel}.cs 99.59% <100%> (+2.09%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ec4eca1...8a89fdc. Read the comment docs.

Copy link
Member

@JimBobSquarePants JimBobSquarePants left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love it! Really nice optimization.

@JimBobSquarePants JimBobSquarePants merged commit 0322ee2 into SixLabors:master Nov 27, 2019
@Sergio0694
Copy link
Member Author

Wo-hooo! 😄🍻

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants