[MRG] Implement Sinkhorn in log-domain for WDA #336
Conversation
* for small values of the regularization parameter (reg) the current implementation runs into numerical issues (nans and infs)
* this can be resolved by using log-domain implementation of the sinkhorn algorithm
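To illustrate the two points above, here is a minimal NumPy sketch, not the code from this PR. The function names `sinkhorn_standard`, `sinkhorn_log` and the helper `logsumexp`, as well as the toy data, are assumptions made for illustration. It shows the classical scaling iterations breaking down once `exp(-M / reg)` underflows, while the same fixed point iterated on log-domain potentials stays finite.

```python
import numpy as np

def logsumexp(Z, axis):
    """Stable log(sum(exp(Z))) along `axis` (shift by the max before exp)."""
    m = np.max(Z, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(Z - m), axis=axis))

def sinkhorn_standard(a, b, M, reg, n_iter=200):
    """Classical Sinkhorn: alternate scalings of the kernel K = exp(-M / reg)."""
    K = np.exp(-M / reg)  # entries underflow to 0 once M / reg is large
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]

def sinkhorn_log(a, b, M, reg, n_iter=200):
    """Same fixed point, iterated on dual potentials f, g in the log domain."""
    log_a, log_b = np.log(a), np.log(b)
    f = np.zeros_like(a)
    for _ in range(n_iter):
        g = reg * (log_b - logsumexp((f[:, None] - M) / reg, axis=0))
        f = reg * (log_a - logsumexp((g[None, :] - M) / reg, axis=1))
    return np.exp((f[:, None] + g[None, :] - M) / reg)

# Toy problem (hypothetical data): two well-separated 1-D point clouds.
x = np.linspace(0.0, 1.0, 5)
y = np.linspace(2.0, 3.0, 5)
M = (x[:, None] - y[None, :]) ** 2   # squared distances, all >= 1
a = np.full(5, 0.2)
b = np.full(5, 0.2)

with np.errstate(divide="ignore", invalid="ignore"):
    P_std = sinkhorn_standard(a, b, M, reg=1e-3)   # K is all zeros here
P_log = sinkhorn_log(a, b, M, reg=1e-3)

print("standard finite:", np.isfinite(P_std).all())
print("log-domain finite:", np.isfinite(P_log).all())
```

In the standard version every kernel entry is `exp(-1000)` or smaller, so the first division is by zero and the nans propagate. The log-domain version never forms the kernel; it only exponentiates differences that have been shifted by their maximum.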
Hello @jakubzadrozny and thank you for the PR. I just have a question in my code review, but as soon as you change the code or discuss why (and the tests pass) we can merge.
Codecov Report
@@           Coverage Diff           @@
##           master     #336   +/-   ##
=======================================
  Coverage   93.03%   93.04%
=======================================
  Files          21       21
  Lines        5198     5217   +19
=======================================
+ Hits         4836     4854   +18
- Misses        362      363    +1
Also, please remember to add your contribution to the RELEASES.md file as a new feature, and your name/email as co-contributor at the top of the dr.py file.
Hi @rflamary. Thanks for your comments. I've answered your question and pushed another commit.
Hello again @jakubzadrozny and thank you for your changes. I nearly merged and then noticed something: log Sinkhorn is much slower. Your implementation while the release version ... I think it would be nice to keep the classical Sinkhorn and add your log implementation alongside it instead of replacing it, with a parameter on the WDA function to select classical or log space (and a note in the doc saying which to choose). Could you please do that?
* use the standard Sinkhorn solver by default (faster)
* use log-domain Sinkhorn if asked by the user
Sorry, I honestly did not expect it to be that much slower. I've added the
Well, logsumexp is exactly 3 times more complex than np.dot ;). That is the reason I keep vanilla Sinkhorn as the default, even if for deep learning and 32 bits you always need to go toward log stabilization. The PR looks good, I'm merging it. Thank you for your contribution.
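The cost difference mentioned here can be seen by writing both updates side by side. The sketch below is an illustration with hypothetical data and function names (`kv_dot`, `kv_logsumexp`), not a benchmark of the library. The classical update is a single matrix-vector product, while the log-domain update computes the same quantity with a max, an exponential and a log over the full matrix.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
n = 200
M = rng.random((n, n))        # hypothetical cost matrix
reg = 0.5                     # moderate reg, so both forms are numerically safe
K = np.exp(-M / reg)
v = rng.random(n) + 0.1       # positive scaling vector
log_v = np.log(v)

def kv_dot():
    # Classical update: one BLAS matrix-vector product.
    return K @ v

def kv_logsumexp():
    # Log-domain update: log(K @ v) computed without ever forming K.
    Z = -M / reg + log_v[None, :]
    m = Z.max(axis=1, keepdims=True)                     # pass 1: row maxima
    return m[:, 0] + np.log(np.exp(Z - m).sum(axis=1))   # pass 2: exp, sum, log

t_dot = timeit.timeit(kv_dot, number=200)
t_lse = timeit.timeit(kv_logsumexp, number=200)
print(f"dot: {t_dot:.4f}s  logsumexp: {t_lse:.4f}s")
```

The two functions agree up to rounding (`np.log(kv_dot())` matches `kv_logsumexp()`); the measured ratio will depend on the machine, the matrix size and the BLAS build.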
Types of changes
Motivation and context / Related issue
The current implementation of the WDA dimensionality reduction runs into numerical issues (nans and infs) for small values of the regularization parameter (reg). This PR re-implements the Sinkhorn algorithm in the log domain, which allows WDA to be used with much smaller reg values.
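As a rough guide to what "small" means for the kernel form, the following sketch estimates where `exp(-M / reg)` starts to underflow for a given floating-point type. The helper `min_safe_reg` and the value of `max_cost` are hypothetical; the point is only that the threshold depends on the dtype, which is why 32-bit computations hit the problem much earlier.

```python
import numpy as np

def min_safe_reg(max_cost, dtype):
    """Smallest reg for which exp(-max_cost / reg) is still a normal number in `dtype`."""
    tiny = np.finfo(dtype).tiny          # smallest positive normal value
    return float(max_cost / -np.log(tiny))

max_cost = 10.0   # hypothetical largest entry of the cost matrix
for dtype in (np.float32, np.float64):
    print(dtype.__name__, "needs reg >=", min_safe_reg(max_cost, dtype))

# Beyond the subnormal range the exponential is exactly zero:
z32 = np.exp(np.float32(-110.0))      # underflows in float32
z64 = np.exp(np.float64(-750.0))      # underflows in float64
still_ok = np.exp(np.float64(-110.0)) # same exponent is fine in float64
```

Below these thresholds some kernel entries are zero or subnormal, and the classical scaling updates can divide by zero; the log-domain iterations avoid this because they never evaluate the kernel directly.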
How has this been tested (if it applies)
A new regression test was added that runs WDA with a small reg value. The current master version fails it, and this PR passes.
PR checklist