-
Notifications
You must be signed in to change notification settings - Fork 508
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MRG] OpenMP support #260
[MRG] OpenMP support #260
Conversation
Restored : Epsilon and Debug mode Replaced : parmap => multiprocessing is now replace by multithreading
Hello @kguerda-idris this is awesome, you nearly have it and all the tests seem to be working. Still the documenation does not seem to be building. Could you run the example plot_optim_OTreg.py on your side, it seems to be stuck when building the doc? |
ot/lp/__init__.py
Outdated
@@ -173,7 +181,7 @@ def estimate_dual_null_weights(alpha0, beta0, a, b, M): | |||
return center_ot_dual(alpha, beta, a, b) | |||
|
|||
|
|||
def emd(a, b, M, numItermax=100000, log=False, center_dual=True): | |||
def emd(a, b, M, numItermax=100000, log=False, center_dual=True, numThreads="max"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
set default to 1 as we discussed (no loss of performance on simple machines)
Codecov Report
@@ Coverage Diff @@
## master #260 +/- ##
==========================================
- Coverage 93.07% 92.64% -0.43%
==========================================
Files 17 19 +2
Lines 3681 3754 +73
==========================================
+ Hits 3426 3478 +52
- Misses 255 276 +21 |
Hello, The etst are passing now but teher is a large loss in code coverage. Could you kease design some new tests to check that all the varaints of the solver works ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few more comments on the code
from pre_build_helpers import compile_test_program | ||
|
||
|
||
def get_openmp_flag(compiler): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add short documentation to this function and describe what it does.
The function should be tested
return ['-fopenmp'] | ||
|
||
|
||
def check_openmp_support(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test this function also
from numpy.distutils.command.config_compiler import config_cc | ||
|
||
|
||
def _get_compiler(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test this function or at least the function that is calling it
// beware M and C are stored in row major C style!!! | ||
|
||
using namespace lemon; | ||
int n, m, i, cur; | ||
|
||
typedef FullBipartiteDigraph Digraph; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove the typedef if you don't use it anymore?
@@ -25,6 +26,13 @@ | |||
__all__ = ['emd', 'emd2', 'barycenter', 'free_support_barycenter', 'cvx', | |||
'emd_1d', 'emd2_1d', 'wasserstein_1d'] | |||
|
|||
def check_number_threads(numThreads): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add documentation to the function
ot/lp/__init__.py
Outdated
@@ -349,6 +362,9 @@ def emd2(a, b, M, processes=multiprocessing.cpu_count(), | |||
center_dual: boolean, optional (default=True) | |||
If True, centers the dual potential using function | |||
:ref:`center_ot_dual`. | |||
numThreads: int or "max", optional (default=1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not clear, be clear that by defalt no OpenMP is used but if numThread>1 then attempt to use OpenMP if available
ot/lp/__init__.py
Outdated
|
||
return res | ||
|
||
|
||
def free_support_barycenter(measures_locations, measures_weights, X_init, b=None, weights=None, numItermax=100, | ||
stopThr=1e-7, verbose=False, log=None): | ||
stopThr=1e-7, verbose=False, log=None, numThreads="max"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
default to 1
if numThreads == 1: | ||
result_code = EMD_wrap(n1, n2, <double*> a.data, <double*> b.data, <double*> M.data, <double*> G.data, <double*> alpha.data, <double*> beta.data, <double*> &cost, max_iter) | ||
else: | ||
result_code = EMD_wrap_omp(n1, n2, <double*> a.data, <double*> b.data, <double*> M.data, <double*> G.data, <double*> alpha.data, <double*> beta.data, <double*> &cost, max_iter, numThreads) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what happens if numThreads==2 and omp is not present? Is there a bug or thit is compiled without OMP and just works?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If openmp is not supported, then the library is compiled without the omp flag. So every reference to OMP is ignored (pragmas are ignored, and references to omp functions like omp_get_max_threads are replaced). A warning message could be added if necessary, because it is still using the omp code (just not parallelized), which happens to be slower with 1 thread than the older (and not parallelized) version of the solver
Added : OpenMP support
Restored : Epsilon and Debug mode
Replaced : parmap => deprecated since emd solver can use OpenMP
Types of changes
Motivation and context / Related issue
How has this been tested (if it applies)
The speedup has been tested on multiple problems sizes.
A problem of size 120k has been solved in 57 minutes and 51 seconds using 24 CPUs
Checklist