Add support for TBB multi-threading backend in addition to OpenMP #1236

Closed
wants to merge 25 commits into from

p12tic
Contributor

@p12tic p12tic commented Sep 23, 2022

Currently AliceVision uses OpenMP as its only multi-threading backend. This is problematic for several reasons.

  • OpenMP support is not uniform among compilers. In particular, Apple mobile platforms do not support OpenMP. As a result, the algorithms run as much as 8 times slower there than they could.

  • OpenMP critical sections are global. That is, all unnamed #pragma omp critical sections lock the same global mutex. As a consequence, it is inefficient to run multiple instances of the same parallelized AliceVision algorithm: the instances share one mutex even though data races are possible only among the threads running a single instance of the algorithm. Ideally, each instance would have its own mutex.

  • It is not possible to efficiently integrate third-party libraries that use another multi-threading framework, because OpenMP assumes that it is the only user of the CPU. The CPU ends up oversubscribed, which leads to poor performance. Note that, as currently used, OpenMP already oversubscribes the CPU on its own when multiple instances of the same parallelized algorithm are invoked in parallel.
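
The second point can be illustrated with a minimal sketch. The `Accumulator` class below is hypothetical and not part of AliceVision; it only assumes that an algorithm instance can own a `std::mutex` and use it where a `#pragma omp critical` section would otherwise be. Two instances running at the same time then never contend on a shared lock.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical class for illustration only (not AliceVision code).
// Each instance owns its mutex, so concurrent instances do not block each other.
class Accumulator
{
public:
    // Adds `value` to the total `repeats` times from each of `threadCount` threads.
    void accumulate(int threadCount, int repeats, long value)
    {
        assert(threadCount >= 0 && repeats >= 0);
        std::vector<std::thread> workers;
        for (int t = 0; t < threadCount; ++t)
        {
            workers.emplace_back([this, repeats, value]
            {
                for (int r = 0; r < repeats; ++r)
                {
                    // Stands in for `#pragma omp critical`: the lock is scoped
                    // to this instance instead of being process-wide.
                    std::lock_guard<std::mutex> lock(_mutex);
                    _total += value;
                }
            });
        }
        for (auto& worker : workers)
            worker.join();
    }

    long total() const
    {
        std::lock_guard<std::mutex> lock(_mutex);
        return _total;
    }

private:
    mutable std::mutex _mutex;
    long _total = 0;
};
```

With an unnamed OpenMP critical section, two such objects would serialize against each other; here each one only serializes its own worker threads.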

This PR takes inspiration from OpenCV and hides the multi-threading framework behind an API. This will eventually allow supporting multiple multi-threading frameworks. For more details on how this works in OpenCV, see this document.

This PR implements the following:

  • Migrate from OpenMP synchronization primitives to standard mutexes, atomics and boost::atomic_ref (once we can use C++20, we can migrate to std::atomic_ref).
  • Move #pragma omp parallel uses behind an interface that exposes the system::parallelFor and system::parallelLoop functions.
  • Implement support for multiple underlying multi-threading backends.
  • Implement support for the oneTBB library as an underlying multi-threading backend.
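
As a rough illustration of the first item, the sketch below replaces what would be a `#pragma omp atomic` increment with a standard atomic. The `countEven` helper is hypothetical and exists only for this example. It uses `std::atomic` rather than `boost::atomic_ref` to stay free of external dependencies, so it shows the general idea and not the exact primitive used in the PR.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper for illustration only: counts the even values in `data`
// using `threadCount` plain threads.
inline int countEven(const std::vector<int>& data, int threadCount)
{
    assert(threadCount > 0);
    std::atomic<int> count{0};
    const std::size_t stride = static_cast<std::size_t>(threadCount);

    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < stride; ++t)
    {
        workers.emplace_back([&data, &count, t, stride]
        {
            // Each thread visits a strided subset of the indices.
            for (std::size_t i = t; i < data.size(); i += stride)
            {
                if (data[i] % 2 == 0)
                {
                    // Stands in for `#pragma omp atomic` followed by `++count;`.
                    count.fetch_add(1, std::memory_order_relaxed);
                }
            }
        });
    }
    for (auto& worker : workers)
        worker.join();

    // join() synchronizes with the workers, so every increment is visible here.
    return count.load();
}
```

Because the counter is an ordinary C++ object, the same code works regardless of which backend schedules the threads.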

As a result, OpenMP code can be converted as follows. For example, take this loop:

#pragma omp parallel for
for (int i = 10; i < size; ++i)
{
    doStuff(i);
}

The equivalent implementation of this loop using system::parallelFor is:

system::parallelFor(10, size, [&](int i)
{
    doStuff(i);
});
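
To show how such a call can be backend-independent, here is a minimal sketch of a function with the same call shape, built on `std::thread` alone. It lives in a hypothetical `sketch` namespace and is not the PR's implementation of `system::parallelFor`; the signature (begin, end, callback taking an `int`) is assumed from the example above, and the static chunking strategy is an arbitrary choice for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <thread>
#include <vector>

namespace sketch {

// Hypothetical stand-in for illustration: calls body(i) for every i in
// [begin, end), splitting the range into contiguous chunks, one per thread.
template <typename Func>
void parallelFor(int begin, int end, Func&& body)
{
    assert(begin <= end);
    const int count = end - begin;
    if (count == 0)
        return;

    // hardware_concurrency() may return 0 when unknown, so clamp to at least 1.
    const int hardwareThreads = static_cast<int>(std::thread::hardware_concurrency());
    const int threadCount = std::max(1, std::min(count, hardwareThreads));
    const int chunk = (count + threadCount - 1) / threadCount;

    std::vector<std::thread> workers;
    for (int t = 0; t < threadCount; ++t)
    {
        const int chunkBegin = begin + t * chunk;
        const int chunkEnd = std::min(end, chunkBegin + chunk);
        if (chunkBegin >= chunkEnd)
            break;
        workers.emplace_back([chunkBegin, chunkEnd, &body]
        {
            for (int i = chunkBegin; i < chunkEnd; ++i)
                body(i);
        });
    }
    for (auto& worker : workers)
        worker.join();
}

} // namespace sketch
```

A call such as `sketch::parallelFor(10, size, [&](int i) { doStuff(i); });` reads the same as the converted loop, which is the point of the interface: the caller does not need to know whether OpenMP, oneTBB or plain threads run the iterations.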

As a result, performance has improved in several cases that the libgomp implementation of OpenMP currently doesn't handle well (many requests to parallelize relatively small problems). This reduces the need for PRs like #1277, as the multi-threading runtime handles a wider variety of tasks. For example, before #1277, this PR made the following tests faster on a machine with an AMD 2990WX with turbo boost disabled:

  • test_voctree_vocabularyTreeBuild: before ~5.1-7s (extremely variable), after ~1.4s
  • test_voctree_kmeans: before ~5.8-10s (extremely variable), after ~4.7s
  • test_sfm_sequentialSfM: before ~1.7s, after ~1.5s

The PR is split into a large number of commits to allow easy bisection in case a bug slips through. This keeps the risk of the PR low, as any bug can be easily diagnosed and fixed.

@p12tic p12tic changed the title from "Wrap OpenMP invocations in an interface to support other parallelization backends in the future" to "Wrap OpenMP invocations in an interface to support other multi-threading backends in the future" Sep 23, 2022
@p12tic p12tic force-pushed the wrap-openmp branch 5 times, most recently from f903a85 to 54bfb78 Compare September 24, 2022 05:19
@fabiencastan
Member

need a rebase

@p12tic p12tic force-pushed the wrap-openmp branch 2 times, most recently from dc4760d to 21e395e Compare October 7, 2022 13:12
@p12tic
Contributor Author

p12tic commented Oct 7, 2022

@fabiencastan Rebased, thanks.

@p12tic p12tic force-pushed the wrap-openmp branch 3 times, most recently from 4045093 to 8ade13b Compare October 11, 2022 04:15
@p12tic p12tic changed the title from "Wrap OpenMP invocations in an interface to support other multi-threading backends in the future" to "Add support for TBB multi-threading backend in addition to OpenMP" Oct 11, 2022
@p12tic
Contributor Author

p12tic commented Oct 11, 2022

@fabiencastan I've expanded the scope of this PR and it now includes full support for the oneTBB multi-threading backend. This will allow running AliceVision with multi-threading enabled on macOS.

@p12tic p12tic force-pushed the wrap-openmp branch 2 times, most recently from 2bbcc95 to 5c5da60 Compare October 12, 2022 06:42

2 participants