Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG] Implement Sinkhorn in log-domain for WDA #336

Merged
merged 4 commits into from
Jan 21, 2022
Merged

[MRG] Implement Sinkhorn in log-domain for WDA #336

merged 4 commits into from
Jan 21, 2022

Conversation

jakubzadrozny
Copy link
Contributor

@jakubzadrozny jakubzadrozny commented Jan 17, 2022

Types of changes

  • Docs change / refactoring / dependency upgrade
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Motivation and context / Related issue

The current implementation of the WDA dimensionality reduction runs into numerical issues (nans and infs) with small values of the regularization parameter (reg). This PR re-implements the sinkhorn algorithm in log-domain, which allows to use WDA with much smaller reg parameter values.

How has this been tested (if it applies)

A new regression test was added that runs WDA with small reg, the current master version fails it and this PR passes.

PR checklist

  • I have read the CONTRIBUTING document.
  • The documentation is up-to-date with the changes I made (check build artifacts).
  • All tests passed, and additional code has been covered with new tests.
  • I have added the PR and Issue fix to the RELEASES.md file.

* for small values of the regularization parameter (reg) the current implementation runs into numerical issues (nans and infs)

* this can be resolved by using log-domain implementation of the sinkhorn algorithm
ot/dr.py Show resolved Hide resolved
@rflamary
Copy link
Collaborator

Hello @jakubzadrozny and thank you for the PR.

I just have a question in my code review but as soon as you change the code or discuss why (and the tests pass) we can merge.

@codecov
Copy link

codecov bot commented Jan 18, 2022

Codecov Report

Merging #336 (1c72560) into master (263c584) will increase coverage by 0.00%.
The diff coverage is 95.23%.

@@           Coverage Diff           @@
##           master     #336   +/-   ##
=======================================
  Coverage   93.03%   93.04%           
=======================================
  Files          21       21           
  Lines        5198     5217   +19     
=======================================
+ Hits         4836     4854   +18     
- Misses        362      363    +1     

@rflamary
Copy link
Collaborator

Also pleas remember to add your contribution to the REALEASES.md file as a New feature and your name/email as co-contributor on top of the dr.py file

@jakubzadrozny
Copy link
Contributor Author

Hi @rflamary. Thanks for your comments, I've answered your question and pushed another commit.

@rflamary
Copy link
Collaborator

rflamary commented Jan 18, 2022

Hello again @jakubzadrozny and thank you for your changes.

I nearly merged and then noticed something: log sinkhorn is much slower.

Your implementation
https://1255-71472695-gh.circle-artifacts.com/0/dev/auto_examples/others/plot_WDA.html#compute-wasserstein-discriminant-analysis
takes 30 sec on the example

while the release version
https://pythonot.github.io/auto_examples/others/plot_WDA.html#compute-wasserstein-discriminant-analysis
takes 9sec to run

I think it would be nice to add your log implementation to the classical sinkhorn instead of replacing it and add a parameter to WDA function to select classical or log space (with a message in the doc to say which to choose).

Could you please do that?

rflamary and others added 2 commits January 20, 2022 08:25
* use the standard Sinkhorn solver by default (faster)

* use log-domain Sinkhorn if asked by the user
@jakubzadrozny
Copy link
Contributor Author

Sorry, I honestly did not expect it to be that much slower. I've added the sinkhorn_method parameter to wda with the previous implementation as default. I tried to follow ot.bregman.sinkhorn with the naming convention and also the doc message, let me know if you'd like that changed.

@rflamary
Copy link
Collaborator

Well losumexp is exactly 3 times more complex than np.dot ;). That is the reason I keep vanilla sinkhorn as default even if for deep leraning and 32 bits you always need to go toward log stabilization.

The PR looks good I'm merging it thank you for your contribution.

@rflamary rflamary merged commit d7c709e into PythonOT:master Jan 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants