Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tests on GPU failing #287

Closed
ncassereau opened this issue Sep 29, 2021 · 0 comments · Fixed by #288
Closed

Tests on GPU failing #287

ncassereau opened this issue Sep 29, 2021 · 0 comments · Fixed by #288

Comments

@ncassereau
Copy link
Contributor

Describe the bug

Tests in test/test_gpu.py do not all pass (test_gpu_sinkhorn).
There is a discrepancy between the matrix computed by ot.bregman and the one computed by ot.gpu.bregman.

To Reproduce

Steps to reproduce the behavior:

  1. Run test/test_gpu.py
=================================== FAILURES ===================================
______________________________ test_gpu_sinkhorn _______________________________

    @pytest.mark.skipif(nogpu, reason="No GPU available")
    def test_gpu_sinkhorn():
    
        rng = np.random.RandomState(0)
    
        for n_samples in [50, 100, 500, 1000]:
            a = rng.rand(n_samples // 4, 100)
            b = rng.rand(n_samples, 100)
    
            wa = ot.unif(n_samples // 4)
            wb = ot.unif(n_samples)
    
            wb2 = np.random.rand(n_samples, 20)
            wb2 /= wb2.sum(0, keepdims=True)
    
            M = ot.dist(a.copy(), b.copy())
            M2 = ot.gpu.dist(a.copy(), b.copy(), to_numpy=False)
    
            reg = 1
    
            G = ot.sinkhorn(wa, wb, M, reg)
            G1 = ot.gpu.sinkhorn(wa, wb, M, reg)

>           np.testing.assert_allclose(G1, G, rtol=1e-10)
E           AssertionError: 
E           Not equal to tolerance rtol=1e-10, atol=0
E           
E           Mismatched elements: 600 / 600 (100%)
E           Max absolute difference: 1.37138433e-07
E           Max relative difference: 2.00548806e-05
E            x: array([[5.888717e-04, 8.415569e-05, 8.951892e-05, 2.190684e-03,
E                   1.977490e-05, 9.029307e-03, 2.036359e-04, 2.300168e-03,
E                   2.933039e-04, 1.124758e-04, 3.394681e-04, 1.416449e-03,...
E            y: array([[5.888707e-04, 8.415610e-05, 8.951908e-05, 2.190691e-03,
E                   1.977479e-05, 9.029328e-03, 2.036355e-04, 2.300179e-03,
E                   2.933086e-04, 1.124764e-04, 3.394675e-04, 1.416444e-03,...

test/test_gpu.py:73: AssertionError

Screenshots

Code sample

from test import test_gpu
test_gpu.test_gpu_sinkhorn()

Expected behavior

GPU unit tests should pass.

Environment (please complete the following information):

  • OS (e.g. MacOS, Windows, Linux): Linux
  • Python version: 3.8
  • How was POT installed (source, pip, conda): source
  • Build command you used (if compiling from source): make buildext ; make install
  • Only for GPU related bugs:
    • CUDA version: 10.2
    • GPU models and configuration: Happens on NVIDIA V100 and NVIDIA A100
    • Any other relevant information: CuPY version is 9.0.0

Output of the following code snippet:

import platform; print(platform.platform())
import sys; print("Python", sys.version)
import numpy; print("NumPy", numpy.__version__)
import scipy; print("SciPy", scipy.__version__)
import ot; print("POT", ot.__version__)
Linux-4.18.0-147.48.1.el8_1.x86_64-x86_64-with-redhat-8.1-Ootpa
Python 3.7.11 (default, Jul 27 2021, 14:32:16) 
[GCC 7.5.0]
NumPy 1.21.2
SciPy 1.7.1
POT 0.8.0dev

Additional context

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant