[MRG] Implement Sinkhorn in log-domain for WDA #336

jakubzadrozny · 2022-01-17T21:41:23Z

Types of changes

Docs change / refactoring / dependency upgrade
Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)

Motivation and context / Related issue

The current implementation of WDA dimensionality reduction runs into numerical issues (NaNs and infs) for small values of the regularization parameter (reg). This PR re-implements the Sinkhorn algorithm in the log domain, which allows WDA to be used with much smaller reg values.
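To make the motivation concrete, here is a minimal standard-library sketch (not the code from this PR) of why a small reg breaks the usual kernel and how a log-domain iteration avoids it. The helper names logsumexp and sinkhorn_log, the toy cost matrix, and the iteration count are assumptions chosen for illustration; a real implementation would be vectorized.

```python
import math


def logsumexp(xs):
    """Stable log(sum(exp(x))) obtained by factoring out the maximum."""
    m = max(xs)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(x - m) for x in xs))


def sinkhorn_log(a, b, M, reg, n_iter=100):
    """Log-domain Sinkhorn on dual potentials f, g (illustrative sketch).

    Assumes strictly positive weights a and b. Returns the transport plan.
    """
    n, m = len(a), len(b)
    log_a = [math.log(x) for x in a]
    log_b = [math.log(x) for x in b]
    f = [0.0] * n
    g = [0.0] * m
    for _ in range(n_iter):
        # g_j = reg * (log b_j - logsumexp_i((f_i - M_ij) / reg))
        g = [
            reg * (log_b[j] - logsumexp([(f[i] - M[i][j]) / reg for i in range(n)]))
            for j in range(m)
        ]
        # f_i = reg * (log a_i - logsumexp_j((g_j - M_ij) / reg))
        f = [
            reg * (log_a[i] - logsumexp([(g[j] - M[i][j]) / reg for j in range(m)]))
            for i in range(n)
        ]
    # P_ij = exp((f_i + g_j - M_ij) / reg)
    return [
        [math.exp((f[i] + g[j] - M[i][j]) / reg) for j in range(m)]
        for i in range(n)
    ]


a = [0.5, 0.5]
b = [0.5, 0.5]
M = [[1.0, 2.0], [2.0, 1.0]]  # toy cost matrix (assumption)
reg = 1e-3

# The standard kernel exp(-M / reg) underflows to exactly 0.0 everywhere,
# so the usual update b / (K^T u) would divide by zero.
K = [[math.exp(-c / reg) for c in row] for row in M]

# The log-domain iteration never forms K and still yields a valid plan.
P = sinkhorn_log(a, b, M, reg)
```

With array libraries the division by the all-zero kernel typically shows up as inf or nan values rather than an exception, which matches the symptoms described above.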

How has this been tested (if it applies)

A new regression test was added that runs WDA with a small reg; the current master version fails it and this PR passes.

PR checklist

I have read the CONTRIBUTING document.
The documentation is up-to-date with the changes I made (check build artifacts).
All tests passed, and additional code has been covered with new tests.
I have added the PR and Issue fix to the RELEASES.md file.

* for small values of the regularization parameter (reg) the current implementation runs into numerical issues (nans and infs)
* this can be resolved by using log-domain implementation of the sinkhorn algorithm

ot/dr.py

rflamary · 2022-01-18T07:17:42Z

Hello @jakubzadrozny and thank you for the PR.

I just have one question in my code review; as soon as you change the code or explain why it is written that way (and the tests pass), we can merge.

codecov · 2022-01-18T07:19:56Z

Codecov Report

Merging #336 (1c72560) into master (263c584) will increase coverage by 0.00%.
The diff coverage is 95.23%.

@@           Coverage Diff           @@
##           master     #336   +/-   ##
=======================================
  Coverage   93.03%   93.04%           
=======================================
  Files          21       21           
  Lines        5198     5217   +19     
=======================================
+ Hits         4836     4854   +18     
- Misses        362      363    +1

rflamary · 2022-01-18T07:29:37Z

Also, please remember to add your contribution to the RELEASES.md file as a New feature and your name/email as co-contributor at the top of the dr.py file.

jakubzadrozny · 2022-01-18T12:32:27Z

Hi @rflamary. Thanks for your comments, I've answered your question and pushed another commit.

rflamary · 2022-01-18T16:06:30Z

Hello again @jakubzadrozny and thank you for your changes.

I nearly merged and then noticed something: log sinkhorn is much slower.

Your implementation
https://1255-71472695-gh.circle-artifacts.com/0/dev/auto_examples/others/plot_WDA.html#compute-wasserstein-discriminant-analysis
takes 30 sec on the example

while the release version
https://pythonot.github.io/auto_examples/others/plot_WDA.html#compute-wasserstein-discriminant-analysis
takes 9 sec to run

I think it would be nice to add your log implementation alongside the classical Sinkhorn instead of replacing it, and to add a parameter to the WDA function to select classical or log-space Sinkhorn (with a note in the doc saying which to choose).

Could you please do that?

* use the standard Sinkhorn solver by default (faster)
* use log-domain Sinkhorn if asked by the user

jakubzadrozny · 2022-01-20T20:52:58Z

Sorry, I honestly did not expect it to be that much slower. I've added the sinkhorn_method parameter to wda with the previous implementation as default. I tried to follow ot.bregman.sinkhorn with the naming convention and also the doc message, let me know if you'd like that changed.

rflamary · 2022-01-21T07:50:08Z

Well, logsumexp is exactly 3 times more complex than np.dot ;). That is the reason I keep vanilla Sinkhorn as the default, even if for deep learning and 32 bits you always need to go toward log stabilization.

The PR looks good, I'm merging it. Thank you for your contribution.
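The cost remark above can be illustrated with a small standard-library sketch comparing a plain matrix-vector product with its log-domain counterpart. The helpers matvec and log_matvec and the toy values are assumptions for illustration; the sketch only shows where the extra per-entry work comes from and does not attempt to reproduce the "3 times" figure.

```python
import math


def matvec(K, v):
    """Plain product: one multiply-add per matrix entry."""
    return [sum(k * x for k, x in zip(row, v)) for row in K]


def log_matvec(logK, logv):
    """Log-domain product: log(K @ v) computed from log K and log v.

    Each entry needs an addition, a subtraction of the row maximum and an
    exponential, and each row needs a final logarithm.
    """
    out = []
    for row in logK:
        terms = [lk + lv for lk, lv in zip(row, logv)]
        m = max(terms)
        out.append(m + math.log(sum(math.exp(t - m) for t in terms)))
    return out


K = [[0.5, 0.25], [0.125, 1.0]]
v = [2.0, 4.0]

plain = matvec(K, v)
logged = log_matvec(
    [[math.log(k) for k in row] for row in K],
    [math.log(x) for x in v],
)
# math.exp(logged[i]) agrees with plain[i] up to rounding error.
```

Both routines compute the same quantity; the log-domain one trades extra transcendental function calls for robustness against underflow.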

[MRG] Implement Sinkhorn in log-domain for WDA

65ba51a

rflamary reviewed Jan 18, 2022

Add feature to RELEASES and contributor name

76bdbfd

rflamary and others added 2 commits January 20, 2022 08:25

Merge branch 'master' into log-sinkhorn-wda

71b0ef6

Add 'sinkhorn_method' parameter to WDA

1c72560

rflamary merged commit d7c709e into PythonOT:master Jan 21, 2022
