
[MRG] Efficient Discrete Multi Marginal Optimal Transport #454

Merged: 41 commits into PythonOT:master on Aug 3, 2023

Conversation

@xzyu02 (Contributor) commented Apr 8, 2023

Types of changes

This PR introduces the DEMD module: a file demd.py containing all the vanilla modules for Efficient Discrete Multi-Marginal Optimal Transport regularization.
It also includes two examples, examples/others/plot_demd_1d.py and examples/others/plot_demd_gradient_minimize.py.

Motivation and context / Related issue

Adds new methods for Efficient Discrete Multi-Marginal Optimal Transport regularization, based on a paper published at ICLR 2023.

How has this been tested (if it applies)

The example plot_demd_1d.py uses two 1D Gaussian distributions as test data and compares computation time with the LP solver.
The example plot_demd_gradient_minimize.py compares the loss between DEMD minimization using gradient descent and the LP method.
The three functions are also tested separately in test_demd.py for sanity checks.

PR checklist

  • I have read the CONTRIBUTING document.
  • The documentation is up-to-date with the changes I made (check build artifacts).
  • All tests passed, and additional code has been covered with new tests.
  • I have added the PR and Issue fix to the RELEASES.md file.

@xzyu02 xzyu02 changed the title Efficient Discrete Multi Marginal Optimal Transport Regularization [MRG] Efficient Discrete Multi Marginal Optimal Transport Regularization Apr 8, 2023
@rflamary (Collaborator) commented:

Thanks for the PR. We will do a code review as soon as possible.

@xzyu02 (Contributor, Author) commented Apr 12, 2023

> Thanks for the PR. We will do a code review as soon as possible.

Thank you for considering!

@xzyu02 (Contributor, Author) commented Apr 21, 2023

> Thanks for the PR. We will do a code review as soon as possible.

Dear POT Team,

I hope you're well. Just a quick reminder about our group's pull request (#454) from two weeks ago. I understand the team's been busy, but we'd appreciate your feedback when you have a moment. If you need clarification or have questions, feel free to reach out. We are eager to make any necessary adjustments!

Best regards,
Xizheng Yu

@rflamary rflamary changed the title [MRG] Efficient Discrete Multi Marginal Optimal Transport Regularization [WIP] Efficient Discrete Multi Marginal Optimal Transport Regularization Apr 24, 2023
@rflamary (Collaborator) left a comment:

Hello and thanks for the PR,

I did a quick code review. The PR needs major changes so that I can better understand what the new solvers are doing (you cannot ask people to read the paper to use a function; it should be described in the documentation). It is a bit unclear to me whether you provide an EMD, a multi-marginal solver, or a barycenter estimator (especially since your function takes only one array as input), so please clarify this in the code and maybe in the PR description.

What needs to be clarified and changed:

  • Move all solvers to a submodule with a clearer name, such as ot.lp.discrete_emd.
  • Please respect the POT API: use M for the ground loss, a/A for distributions (on the simplex), and the standard parameter names. Function names can be long but must describe precisely what the function does. For instance, if a function computes a barycenter then it should be named similarly to the other solvers and be called the same way.
  • Add to the documentation of all functions the optimization problems solved, in a math environment, using the same notation as other POT functions.
  • Take into account all the small comments below.

I know this is a lot of work, but we need to ensure that the code fits well into POT and is easy to use and understand by users/contributors.

@@ -19,6 +19,7 @@ API and modules
coot
da
datasets
demd
Collaborator:

I'm not OK with demd, which is too short and not clear enough for a module name. This is also the case for the main functions: not everyone has read the paper, and it should be clear from the function name what it is doing.

Due to historical reasons, ot.emd is the exact discrete OT solver, which is very general and not only for EMD, so we need to find another name for the new solvers in this PR.

Collaborator:

The new solvers should be in ot.lp.discrete_emd or something else more descriptive



def lp_1d_bary(data, M, n, d):
    A = np.vstack(data).T
Collaborator:

The example should not need to transpose the data; it means that the API of the implemented function is not good (it should return the same thing as ot.lp.barycenter).

print('')
print('D-EMD Algorithm:')
ot.tic()
demd_obj = ot.demd(np.vstack(data), n, d)
Collaborator:

Where is the barycenter? The objective is nice, but the barycenter should be returned.

return ns, lp_times, demd_times


ns, lp_times, demd_times = increasing_bins()
Collaborator:

Why only plot the time? Please also plot the barycenter.


# data, M = getData(n, d, 'uniform')
data, M = getData(n, d, 'skewedGauss')
data = np.vstack(data)
Collaborator:

Plot the data

ot/demd.py Outdated

dualobj = sum([_.dot(_d) for _, _d in zip(aa, dual)])

return {'x': xx, 'primal objective': obj,
Collaborator:

The output of the function should be of the same type as the input of the function.

ot/demd.py Outdated
except Exception:
    pass

dualobj = sum([_.dot(_d) for _, _d in zip(aa, dual)])
Collaborator:

You should use nx.sum and nx.dot to ensure that it will work across backends.

ot/demd.py Outdated

def demd(x, d, n, return_dual_vars=False):
    r"""
    Solver of our proposed method: d-Dimensional Earth Mover's Distance (DEMD).
Collaborator:

Describe more precisely what you are solving. Is it a barycenter?

ot/demd.py Outdated
'dual': dual, 'dual objective': dualobj}


def demd(x, d, n, return_dual_vars=False):
Collaborator:

Bad naming: use at least discrete_emd, or discrete_emd2 if the function returns the EMD without the plan. If you give the function an empirical distribution, then you should also put that in the name.

Finally, EMD is computed between two distributions, so why is there only one numpy array here?

ot/demd.py Outdated
`f(x, d, n, return_dual_vars=True) -> (float, ndarray, ...)`
x : ndarray, shape (d, n)
    The initial point for the optimization algorithm.
d : int
Collaborator:

Those shapes can be inferred from x; they should not be passed as parameters.

@xzyu02 (Contributor, Author) commented Apr 24, 2023

Hello and thanks for the PR,

I did a quick code review. The PR needs major changes so that I can better understand what the new solvers are doing (you cannot ask people to read the paper to use a function, it should be described in the documentation). It is a bit unclear to me if you provide an EMD, multimarginal solver, a barycenter estimator (especially since you function takes only one array as input) so please clarify this in the code and maybe the PR description.

What needs to be clarified and changed:

  • Move all solvers to a submodule better named more clear such as ot.lp.discrete_emd .
  • Please respect POT API: use M for ground loss, a,A for distributions (on the simplex) and all parameters names. All function names can be long but must describe precisely what the function does. For instance if a function computes a barycenter then it should be similarly named to other solvers and be called the same way.
  • Add to the documentation of all function, the optimization problems solved in math environment using same names as other POT functions.
  • Take into account all the small comments below.

I know this is a lot of work but we need to ensure that the code fits well in POT and is easy to use and understand by users/conributors.

Dear POT team,

Thank you for the thoughtful and detailed code review. We apologize for the current conflicts and problems. We will resolve them one by one and make sure the code fits well into POT for both users and contributors. We appreciate your time and effort. Thank you.

Best

@xzyu02 xzyu02 changed the title [WIP] Efficient Discrete Multi Marginal Optimal Transport Regularization [MRG] Efficient Discrete Multi Marginal Optimal Transport Regularization Jun 1, 2023
@xzyu02 xzyu02 requested a review from rflamary June 1, 2023 17:32
@xzyu02 (Contributor, Author) commented Jun 16, 2023

Dear POT team,

I hope this message finds you well. I have recently completed the bug fix for the workflow checks, and I believe the PR is now ready for review. I understand everyone has a busy schedule, but I would appreciate it if you could review this pull request and approve a workflow run at your convenience. Please feel free to raise any questions or concerns. Thank you in advance for your time and patience.

Best regards,
Xizheng Yu

@rflamary (Collaborator) commented:

We will do that; thanks for your work, it is appreciated.

@rflamary (Collaborator) left a comment:

Hello,

Thanks for all the work and the major modifications in your code; it is now, in my opinion, in much better shape and we are approaching a merge.

Still, I have a few important modification requests now that I better understand what your proposed solvers actually do. I know it can get tiresome, but POT is becoming very large and we need to be very careful when positioning new APIs and solvers.

My understanding is that these are MMOT solvers for marginal distributions supported on a regular 1D grid, using a specific ground metric (max-min, which has the Monge property) that corresponds to the absolute-value loss/EMD ground metric in the two-marginal case.

This is very interesting and I want it in POT, but it also means that we cannot use the very general names of the method that you chose in your published paper, because this would lead to much confusion for POT users. So I propose more precise names below that, I think, better describe what is solved and done. I also provide more specific comments below (for instance, having mathematical objects follow the names of the function parameters).

On a more scientific note, I am surprised by the shape of your "barycenter", which is actually, from my understanding, a convergence point of the minimization of your MMOT formulation. It seems to be very similar to the L2 (average) barycenter, especially compared to the LP solution. Do you have an intuition why?
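To make this description concrete, here is a minimal sketch (not from the PR) of the max-min ground cost on integer bin indexes of a regular 1D grid, which for two marginals reduces to the absolute-value ground metric:

```python
import itertools

def max_min_ground_cost(idx):
    # Ground cost with the (generalized) Monge property on a regular 1D grid:
    # the spread of a tuple of integer bin indexes, max(i) - min(i).
    return max(idx) - min(idx)

# With d = 2 marginals this is exactly the absolute-value (EMD) ground metric.
n = 5
for i, j in itertools.product(range(n), repeat=2):
    assert max_min_ground_cost((i, j)) == abs(i - j)
```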

# Compare Barycenters in both methods
# ---------
pl.figure(1, figsize=(6.4, 3))
for i in range(len(barys)):
Collaborator:

You should compare it to the L2 (np.mean) barycenter, because your barycenter looks very similar.
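A minimal sketch of that comparison, assuming histograms built with ot.datasets.make_1D_gauss; the d-MMOT result (barys from the solver shown below) would be overlaid on the same axes:

```python
import numpy as np
import matplotlib.pylab as pl
import ot

# A few 1D histograms on a shared regular grid of n bins, stacked as rows.
n = 50
A = np.vstack([ot.datasets.make_1D_gauss(n, m=m, s=5) for m in (10, 25, 40)])

# L2 (average) barycenter, to compare against the d-MMOT minimizer.
l2_bary = A.mean(axis=0)

pl.figure(figsize=(6.4, 3))
pl.plot(np.arange(n), l2_bary, 'k--', label='L2 (np.mean) barycenter')
# pl.plot(np.arange(n), barys[0], 'g-*', label='d-MMOT minimization')  # from the solver below
pl.legend()
pl.show()
```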

# dmmot_obj, log = ot.lp.discrete_mmot(A.T, n, d)
barys, log = ot.lp.discrete_mmot_converge(
    A, niters=3000, lr=0.000002, log=True)
dmmot_obj = log['primal objective']
Collaborator:

Both the objective value and the norm of the gradient increase at the end, which is very surprising since it is supposed to be a gradient descent, no?

# values cannot be compared.

# Perform gradient descent optimization using the d-MMOT method.
barys = ot.lp.discrete_mmot_converge(A, niters=9000, lr=0.00001)
Collaborator:

same here

ot/lp/dmmot.py Outdated
Parameters
----------
i : list
    The list for which the generalized EMD cost is to be computed.
Collaborator:

list of integer indexes...

ot/lp/dmmot.py Outdated
from ..backend import get_backend


def dist_monge(i):
Collaborator:

dist_monge is a very generic name for a very specific ground cost that does indeed have the Monge property. This is one MMOT ground cost with the Monge property, not the only one. dist_monge_max_min, for instance, is better.

ot/lp/dmmot.py Outdated
return obj


def discrete_mmot_converge(
Collaborator:

Same here: I would use monge_mmot_1dgrid_optimize to state clearly what the function does. I also need more discussion about why one would optimize all distributions together. You use it as some kind of "barycenter" since they all converge to a given distribution, but be clear that it is not a barycenter in the traditional OT sense.

return A.T, x


def test_discrete_mmot():
Collaborator:

Since you implemented the function with backends, you should run the tests on arrays from the other backends. You can do that by adding a parameter nx to the test function; the test will then automatically be run with all available backends.
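A minimal sketch of such a backend-parametrized test, assuming the renamed ot.lp.dmmot_monge_1dgrid_loss discussed later in this thread and a row-wise stacking of the d histograms (POT's test suite provides nx as a fixture iterating over the available backends; the merged signature may differ slightly):

```python
import numpy as np
import ot

def test_dmmot_loss_backends(nx):
    # Two histograms on a regular 1D grid of n bins.
    n = 10
    a1 = ot.datasets.make_1D_gauss(n, m=3, s=1)
    a2 = ot.datasets.make_1D_gauss(n, m=7, s=1)
    A = np.vstack((a1, a2))

    # Run the solver on numpy arrays and on the current backend's arrays.
    loss_np = ot.lp.dmmot_monge_1dgrid_loss(A)
    loss_nx = ot.lp.dmmot_monge_1dgrid_loss(nx.from_numpy(A))

    np.testing.assert_allclose(loss_np, nx.to_numpy(loss_nx), rtol=1e-5)
```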

pl.figure(1, figsize=(6.4, 3))
for i in range(len(barys)):
    if i == 0:
        pl.plot(x, barys[i], 'g-*', label='Discrete MMOT')
Collaborator:

It would be nice to see visually whether you converged, by plotting all the individual distributions (it seems like you did, given your "barycenter"). Maybe you could call it "Monge MMOT minimization" instead of "discrete MMOT"?


def discrete_mmot(A, verbose=False, log=False):
    r"""
    Compute the discrete multi-marginal optimal transport of distributions A.
Collaborator:

You should explain clearly here that the supports of the distributions are assumed to be integers on the real line. This will be suggested by the new function name, but it needs to be stated clearly.

return A.T, x


def test_discrete_mmot():
Collaborator:

It would be nice to have a test comparing the loss returned by your solver with two marginals to the exact OT solver with the absolute-value ground metric, since they should be equivalent, no?
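A minimal sketch of that equivalence check, again assuming the renamed ot.lp.dmmot_monge_1dgrid_loss and a row-wise stacking of the two histograms (exact conventions may differ in the merged code):

```python
import numpy as np
import ot

n = 20
a1 = ot.datasets.make_1D_gauss(n, m=5, s=2)
a2 = ot.datasets.make_1D_gauss(n, m=12, s=3)

# Exact OT loss with the absolute-value ground metric on the integer grid.
x = np.arange(n, dtype=np.float64)
M = np.abs(x[:, None] - x[None, :])
emd_loss = ot.emd2(a1, a2, M)

# The d-MMOT loss with two marginals should match up to numerical tolerance.
dmmot_loss = ot.lp.dmmot_monge_1dgrid_loss(np.vstack((a1, a2)))
np.testing.assert_allclose(emd_loss, dmmot_loss, rtol=1e-5)
```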

@rflamary rflamary changed the title [MRG] Efficient Discrete Multi Marginal Optimal Transport Regularization [MRG] Efficient Discrete Multi Marginal Optimal Transport Jul 28, 2023
@xzyu02 (Contributor, Author) commented Aug 1, 2023

Dear POT team,

I hope you are doing well! This pull request update contains changes and improvements based on the previous review. We appreciate your previous feedback, which guided these revisions. We look forward to your feedback on these updates.

Best Regards,
Xizheng Yu

In summary, here are the key points of the update:

  • Changed function names and answered the comments based on the suggestions.
  • Regarding the similarity of the shape of our "barycenter" to the L2 (average) barycenter: this is likely due to the Monge cost, for simple 1D examples. I do not believe the paper develops theory on the relationship between this Monge-cost-based barycenter and those obtained using the L2 cost. Our result has been verified against cvx's gradients and gives the same result (the cvx comparison is available in the paper's repo).

Comments resolved below:
Q) you should compare it to the l2 (np.mean) barycenter because your barycenter looks very similar
A) Changed the barycenter comparison from LP to L2 (not sure if we should keep both, but the L2 result is similar to ours).

Q) Both the objective value and the norm of the gradient increase at the end, which is very surprising since it is supposed to be a gradient descent, no? (Same remark on the second example.)
A) We added learning-rate decay in the dmmot_monge_1dgrid_optimize method to control the step size as we approach the minimized objective (see the sketch after these answers).

Q) list of integer indexes...
A) Fixed.
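A generic sketch of the learning-rate decay pattern mentioned above (illustrative only; the actual schedule and parameter names inside dmmot_monge_1dgrid_optimize may differ):

```python
import numpy as np

def gradient_descent_with_decay(grad_fn, x0, niters=1000, lr=1e-2, decay=0.999):
    # Plain gradient descent with a multiplicative learning-rate decay,
    # shrinking the step size as the iterate approaches the minimizer.
    x = np.array(x0, dtype=float)
    for _ in range(niters):
        x = x - lr * grad_fn(x)
        lr = lr * decay
    return x

# Example: minimize f(x) = ||x||^2, whose gradient is 2x.
x_opt = gradient_descent_with_decay(lambda x: 2.0 * x, x0=np.ones(3))
print(x_opt)  # close to zero
```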

Q) dist_monge is a very generic name for a very specific ground cost that does indeed have the Monge property. This is one MMOT ground cost with the Monge property, not the only one. dist_monge_max_min, for instance, is better.
A) Fixed the name.

Q) Again, this function name is too general: from the name it looks like a general MMOT solver, when in practice it applies only to regular 1D grids (for the marginals) and with a very specific ground metric. I suggest monge_mmot_1dgrid_loss, which describes the loss for MMOT with a Monge ground cost on a regular 1D grid (and why we only give the distribution weights to the function). It is a mouthful, but we need precise descriptions when creating new functions in a general-purpose OT toolbox.
A) Thanks for the detailed explanation of the naming suggestion. Since the method is a discrete MMOT problem with Monge costs, we would tentatively suggest the similar name dmmot_monge_1dgrid_loss. Since we have d distributions, the grid is d-dimensional, and our n is the discretization/number of bins.

Q) State the ground cost instead of using "generalized Monge", which requires reading your paper. I understand the OT plan is independent of which Monge cost is used, but you return the loss for a fixed one.
A) Fixed.

Q) Use alpha instead of p to make the link with the input A of the function.
A) Fixed in all mathematical descriptions.

Q) Do not use x for the OT plan; we use either \gamma or a bold matrix T for the plan in POT. x is already used for the support positions (which are the integers i in your case), and we need unified API/notation.
A) Fixed, now using \gamma.

Q) Here it would be nice to return a loss with the gradients defined properly, so that it can be used in PyTorch with standard gradient descent algorithms. To do that you can use the backend function set_gradients, which defines the forward/backward relations. An example of its use can be found here:
A) Added inside dmmot_monge_1dgrid_loss, following https://github.com/PythonOT/POT/blob/release0.9/ot/backend.py#L1689

Q) Same here: I would use monge_mmot_1dgrid_optimize to state clearly what the function does. I also need more discussion about why one would optimize all distributions together. You use it as some kind of "barycenter" since they all converge to a given distribution, but be clear that it is not a barycenter in the traditional OT sense.
A) We would like to suggest the similar name dmmot_monge_1dgrid_optimize for this method. Discussion: the main advantage here is that the computation cost grows exactly linearly, and can even be lower if some distributions are "interior" to the others. This means that in the worst case we compute roughly the same amount of work as barycenter approaches, but in the average and best cases we "skip" distributions that are not important to the computation of the d-dimensional cost. At each iteration, only the distributions on the "boundary" are moved. A figure illustrating the boundary moves is attached for reference (ref-mmot).

Q) Since you implemented the function with backends, you should run the tests on arrays from the other backends. You can do that by adding a parameter nx to the test function; the test will then automatically be run with all available backends.
A) Added the nx test. Since our algorithm involves multiple in-place tensor modifications, TensorFlow's immutable tensors conflict with our usage; meanwhile, several PyTorch methods require conversion from list to tensor. We decided to use a simple conversion between nx and np at the start and end of each method, and to use np for the algorithm's computation (see the sketch after this answer).
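A minimal sketch of that conversion pattern with a hypothetical solver body (the actual dmmot functions structure this differently):

```python
import numpy as np
from ot.backend import get_backend

def dmmot_like_solver(A):
    # Detect the backend of the input (numpy, torch, jax, ...).
    nx = get_backend(A)
    A0 = A  # keep a handle on the original backend array

    # Convert to numpy at the start; the core algorithm mutates arrays in place.
    A_np = nx.to_numpy(A0)
    result_np = A_np / A_np.sum(axis=-1, keepdims=True)  # placeholder computation

    # Convert back to the caller's backend (and dtype/device) at the end.
    return nx.from_numpy(result_np, type_as=A0)
```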

Q) It would be nice to see visually whether you converged, by plotting all the individual distributions (it seems like you did, given your "barycenter"). Maybe you could call it "Monge MMOT minimization" instead of "discrete MMOT"?
A) A plot comparing all the individual distributions has been added, but they are really close due to the method (as we stated, every distribution can be viewed as a "barycenter"). The tentative name is dmmot_monge_1dgrid_optimize.

Q) You should explain clearly here that the supports of the distributions are assumed to be integers on the real line. This will be suggested by the new function name, but it needs to be stated clearly.
A) Added to the documentation.

Q) It would be nice to have a test comparing the loss returned by your solver with two marginals to the exact OT solver with the absolute-value ground metric, since they should be equivalent, no?
A) Added to the tests.
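For reference, a hedged usage sketch of the two renamed functions as discussed in this thread; parameter names such as niters and lr follow the example snippets above and may differ slightly in the merged release:

```python
import numpy as np
import ot

# A few 1D histograms on a shared regular grid of n bins, stacked as rows.
n = 50
A = np.vstack([ot.datasets.make_1D_gauss(n, m=m, s=5) for m in (15, 25, 35)])

# d-MMOT loss with the max-min Monge ground cost on the 1D grid.
loss = ot.lp.dmmot_monge_1dgrid_loss(A)

# Jointly minimize the d-MMOT objective over all distributions;
# each row of barys converges towards a common distribution.
barys = ot.lp.dmmot_monge_1dgrid_optimize(A, niters=3000, lr=2e-6)
```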

@xzyu02 xzyu02 requested a review from rflamary August 1, 2023 20:39
@rflamary (Collaborator) left a comment:

Thank you @x12hengyu for all those changes. I know it was a lot of work, but it looks good and will be more maintainable in the future.

I think the contribution is nearly there but there remains a small problem with the backend line where the gradient is set (see below).

Once this is done we can merge. Good work!

I will wait for this and do a new release so this should be shortly available in the stable version.

ot/lp/dmmot.py Outdated
'dual objective': dualobj}

# define forward/backward relations for pytorch
obj = nx.set_gradients(obj, (nx.from_numpy(A)), (dual))
Collaborator:

Here you need to use the A given as input to the function (not a conversion from numpy), so that torch makes the link between this A and the objective. For instance, store A0 = A at the beginning of the function and then use A0 here in set_gradients.

This will be very nice because your loss will be usable and differentiable with .backward() in torch, opening the door to stochastic optimization and deep learning applications.
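A minimal sketch of that pattern with a hypothetical loss (the actual dmmot code computes the objective and dual variables differently):

```python
import numpy as np
from ot.backend import get_backend

def loss_with_gradients(A):
    nx = get_backend(A)
    A0 = A  # keep the original backend array so torch can track it

    # Core computation in numpy (placeholder: sum of squares, gradient 2*A).
    A_np = nx.to_numpy(A0)
    obj_np = np.asarray((A_np ** 2).sum())  # 0-d array so from_numpy handles it
    grad_np = 2.0 * A_np

    # Attach the forward/backward relation w.r.t. the original input A0, so that
    # obj.backward() works when A is a torch tensor with requires_grad=True.
    obj = nx.from_numpy(obj_np, type_as=A0)
    grad = nx.from_numpy(grad_np, type_as=A0)
    return nx.set_gradients(obj, (A0,), (grad,))
```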

@xzyu02 xzyu02 requested a review from rflamary August 2, 2023 20:40
@xzyu02 (Contributor, Author) commented Aug 2, 2023

Dear POT team,

I hope you are doing well! I have fixed the gradient problem. Thanks for your time and patience in reviewing our work over the past months. We really appreciate it!

Best Regards,
Xizheng Yu

ot/lp/dmmot.py Outdated (resolved)
Store input variable instead of copying it
@rflamary rflamary merged commit 5ead79b into PythonOT:master Aug 3, 2023