fix the sign of gradient for kl gromov #610

Merged

Conversation

KrzakalaPaul
Contributor

@KrzakalaPaul commented Mar 1, 2024

Types of changes

Corrects a sign error in the gradient set through `nx.set_gradients` for Gromov-Wasserstein (and fused Gromov-Wasserstein) when `loss_fun = 'kl_loss'`.
The correct formula is:

gC2 = - nx.dot(T.T, nx.dot(C1, T)) / (C2 + 1e-15) + nx.outer(q, q)

instead of

gC2 = nx.dot(T.T, nx.dot(C1, T)) / (C2 + 1e-15) + nx.outer(q, q)
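As a quick sanity check of this expression (a minimal sketch, not part of the PR diff; the tensor names A, B, T, p, q are illustrative), the closed form can be compared against autograd applied to the explicit KL Gromov-Wasserstein energy E(A, B, T) = sum_ijkl KL(A_ij, B_kl) T_ik T_jl for a fixed coupling T:

import torch

torch.manual_seed(0)
n, m = 3, 4
A = torch.rand(n, n) + 0.1                     # plays the role of C1 (strictly positive)
B = (torch.rand(m, m) + 0.1).requires_grad_()  # plays the role of C2
p = torch.full((n,), 1.0 / n)                  # uniform source weights
q = torch.full((m,), 1.0 / m)                  # uniform target weights
T = torch.outer(p, q)                          # a valid coupling with marginals p and q

# KL ground loss: KL(a, b) = a*log(a) - a*log(b) - a + b
kl = (A[:, :, None, None] * torch.log(A[:, :, None, None])
      - A[:, :, None, None] * torch.log(B[None, None, :, :])
      - A[:, :, None, None] + B[None, None, :, :])

# E(A, B, T) = sum_{ijkl} KL(A_ij, B_kl) T_ik T_jl
E = torch.einsum('ijkl,ik,jl->', kl, T, T)
autograd_grad = torch.autograd.grad(E, B)[0]

# corrected closed form: gC2 = - T.T @ C1 @ T / C2 + outer(q, q)
closed_form = -(T.T @ A @ T) / B.detach() + torch.outer(q, q)
print(torch.allclose(autograd_grad, closed_form, atol=1e-6))  # expected: True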

How has this been tested (if it applies)

You can run this code:

from ot import gromov_wasserstein2, unif
import torch

C1 = torch.rand((2, 2), requires_grad=False)
C2 = 1 - C1
C2.requires_grad = True

eta = 1e-1

for step in range(100):

    loss = gromov_wasserstein2(C1=C1, C2=C2, p=unif(2, type_as=C2),
                               q=unif(2, type_as=C2), loss_fun='square_loss')
    grad = torch.autograd.grad(loss, C2)[0]
    # plain projected gradient step on C2
    C2 = C2 - eta * grad
    C2 = torch.clip(C2, 0, 1)

    print(loss)

You will see that the gradient descent diverges. It converges when we fix the sign error.
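Since the fix targets loss_fun = 'kl_loss' while the snippet above passes 'square_loss' (see the comment below), here is a hedged variant of the same gradient-descent check for the KL case. It is a sketch rather than part of the PR; the cost matrices are kept strictly positive because the KL loss takes logarithms of their entries.

from ot import gromov_wasserstein2, unif
import torch

torch.manual_seed(0)
C1 = torch.rand((2, 2)) + 0.1      # strictly positive source cost matrix
C2 = (1.2 - C1).requires_grad_()   # strictly positive target cost matrix

eta = 1e-1

for step in range(100):
    loss = gromov_wasserstein2(C1=C1, C2=C2,
                               p=unif(2, type_as=C2), q=unif(2, type_as=C2),
                               loss_fun='kl_loss')
    grad = torch.autograd.grad(loss, C2)[0]
    with torch.no_grad():
        C2 = torch.clip(C2 - eta * grad, 0.1, 1.2)  # projected gradient step
    C2.requires_grad_()

    print(float(loss))

# With the corrected sign the printed loss decreases; with the old sign it grows.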

PR checklist

  • I have read the CONTRIBUTING document.
  • The documentation is up-to-date with the changes I made (check build artifacts).
  • All tests passed, and additional code has been covered with new tests.
  • I have added the PR and Issue fix to the RELEASES.md file.


codecov bot commented Mar 1, 2024

Codecov Report

Merging #610 (7cecd40) into master (0573eba) will not change coverage.
The diff coverage is 100.00%.

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #610   +/-   ##
=======================================
  Coverage   96.78%   96.78%           
=======================================
  Files          77       77           
  Lines       16027    16027           
=======================================
  Hits        15511    15511           
  Misses        516      516           

@cedricvincentcuaz
Collaborator

Hello @KrzakalaPaul, indeed good catch.

In your example you do not use `loss_fun = 'kl_loss'`. So, doing a quick check to be sure: with `loss_fun = 'kl_loss'` the GW loss reads as
$E(\mathbf{A}, \mathbf{B}, \mathbf{T}) = \sum_{ijkl} KL(A_{ij}, B_{kl})T_{ik}T_{jl}$
with $KL(A_{ij}, B_{kl}) = A_{ij} \log(A_{ij}) - A_{ij}\log(B_{kl}) - A_{ij} + B_{kl}$.

So $\frac{\partial E}{\partial B_{pq}} = \sum_{ij} \left(- \frac{A_{ij}}{B_{pq}} + 1\right) T_{ip}T_{jq}$, and we indeed forgot a minus sign in the current POT implementation.
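For reference, writing this derivative in matrix form with the marginal constraints $\sum_i T_{ip} = q_p$ and $\sum_j T_{jq} = q_q$ recovers exactly the corrected line of the diff:

$\frac{\partial E}{\partial B_{pq}} = \sum_{ij} \left(1 - \frac{A_{ij}}{B_{pq}}\right) T_{ip} T_{jq} = q_p q_q - \frac{(\mathbf{T}^\top \mathbf{A} \mathbf{T})_{pq}}{B_{pq}},$

i.e. `gC2 = - nx.dot(T.T, nx.dot(C1, T)) / (C2 + 1e-15) + nx.outer(q, q)`.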

@cedricvincentcuaz merged commit 3e05385 into PythonOT:master Mar 4, 2024
15 checks passed