CUDA version is installed in cpu only envs #37

Closed
1 task done
RaulPPelaez opened this issue Aug 16, 2023 · 10 comments
Labels
bug Something isn't working

Comments

@RaulPPelaez
Contributor

Solution to issue cannot be found in the documentation.

  • I checked the documentation.

Issue

I cannot manage to install the CPU-only build of openmm-torch 1.1. On a machine with neither a GPU nor a CUDA installation:

$ mamba install openmm-torch pytorch-cpu=2

Looking for: ['openmm-torch', 'pytorch-cpu=2']

conda-forge/linux-64                                        Using cache
conda-forge/noarch                                          Using cache
Transaction

  Prefix: /home/raul/mambaforge/envs/test

  Updating specs:

   - openmm-torch
   - pytorch-cpu=2


  Package                Version  Build                      Channel           Size
─────────────────────────────────────────────────────────────────────────────────────
  Install:
─────────────────────────────────────────────────────────────────────────────────────

  + _libgcc_mutex            0.1  conda_forge                conda-forge        3kB
  + libstdcxx-ng          13.1.0  hfd8a6a1_0                 conda-forge     Cached
  + python_abi              3.11  3_cp311                    conda-forge        6kB
  + ld_impl_linux-64        2.40  h41732ed_0                 conda-forge     Cached
  + ca-certificates    2023.7.22  hbcca054_0                 conda-forge     Cached
  + libgfortran5          13.1.0  h15d22d2_0                 conda-forge        1MB
  + libgcc-ng             13.1.0  he5830b7_0                 conda-forge     Cached
  + libzlib               1.2.13  hd590300_5                 conda-forge     Cached
  + zstd                   1.5.2  hfc55251_7                 conda-forge     Cached
  + llvm-openmp           16.0.6  h4dfa4b3_0                 conda-forge       42MB
  + _openmp_mutex            4.5  2_kmp_llvm                 conda-forge        6kB
  + libgfortran-ng        13.1.0  h69a702a_0                 conda-forge       23kB
  + sleef                  3.5.1  h9b69904_2                 conda-forge     Cached
  + rocm-smi               5.6.0  h59595ed_1                 conda-forge        4MB
  + icu                     72.1  hcb278e6_0                 conda-forge     Cached
  + libiconv                1.17  h166bdaf_0                 conda-forge     Cached
  + cudatoolkit           11.8.0  h4ba93d1_12                conda-forge      716MB
  + openssl                3.1.2  hd590300_0                 conda-forge     Cached
  + libffi                 3.4.2  h7f98852_5                 conda-forge     Cached
  + bzip2                  1.0.8  h7f98852_4                 conda-forge     Cached
  + gmp                    6.2.1  h58526e2_0                 conda-forge     Cached
  + ncurses                  6.4  hcb278e6_0                 conda-forge     Cached
  + libuuid               2.38.1  h0b41bf4_0                 conda-forge     Cached
  + libsqlite             3.42.0  h2797004_0                 conda-forge     Cached
  + libexpat               2.5.0  hcb278e6_1                 conda-forge     Cached
  + xz                     5.2.6  h166bdaf_0                 conda-forge     Cached
  + tk                    8.6.12  h27826a3_0                 conda-forge     Cached
  + libnsl                 2.0.0  h7f98852_0                 conda-forge     Cached
  + libuv                 1.44.2  hd590300_1                 conda-forge      824kB
  + libprotobuf          3.21.12  h3eb15da_0                 conda-forge     Cached
  + ocl-icd                2.3.1  h7f98852_0                 conda-forge     Cached
  + libopenblas           0.3.23  pthreads_h80387f5_0        conda-forge        5MB
  + mpfr                   4.2.0  hb012696_0                 conda-forge     Cached
  + readline                 8.2  h8228510_1                 conda-forge     Cached
  + libxml2               2.11.5  h0d562d8_0                 conda-forge     Cached
  + ocl-icd-system         1.0.0  1                          conda-forge     Cached
  + libblas                3.9.0  17_linux64_openblas        conda-forge       14kB
  + mpc                    1.3.1  hfe3b2da_0                 conda-forge     Cached
  + libhwloc               2.9.2  nocuda_h7313eea_1008       conda-forge        3MB
  + liblapack              3.9.0  17_linux64_openblas        conda-forge       14kB
  + libcblas               3.9.0  17_linux64_openblas        conda-forge       14kB
  + tbb                2021.10.0  h00ab1b0_0                 conda-forge      186kB
  + mkl                 2022.2.1  h84fe81f_16997             conda-forge     Cached
  + tzdata                 2023c  h71feb2d_0                 conda-forge     Cached
  + python                3.11.4  hab00c5b_0_cpython         conda-forge     Cached
  + wheel                 0.41.1  pyhd8ed1ab_0               conda-forge     Cached
  + setuptools            68.0.0  pyhd8ed1ab_0               conda-forge     Cached
  + pip                   23.2.1  pyhd8ed1ab_0               conda-forge     Cached
  + mpmath                 1.3.0  pyhd8ed1ab_0               conda-forge     Cached
  + typing_extensions      4.7.1  pyha770c72_0               conda-forge     Cached
  + networkx                 3.1  pyhd8ed1ab_0               conda-forge     Cached
  + filelock              3.12.2  pyhd8ed1ab_0               conda-forge     Cached
  + numpy                 1.25.2  py311h64a7726_0            conda-forge        8MB
  + markupsafe             2.1.3  py311h459d7ec_0            conda-forge     Cached
  + gmpy2                  2.1.2  py311h6a5fa03_1            conda-forge      220kB
  + openmm                 8.0.0  py311h59c6c42_1            conda-forge     Cached
  + jinja2                 3.1.2  pyhd8ed1ab_1               conda-forge     Cached
  + sympy                   1.12  pypyh9d50eac_103           conda-forge     Cached
  + pytorch                2.0.0  cpu_mkl_py311hd1ebf82_101  conda-forge       66MB
  + pytorch-cpu            2.0.0  cpu_mkl_py311ha33ad28_101  conda-forge       19kB
  + openmm-torch             1.1  cuda112py311h20aef98_0     conda-forge      253kB

  Summary:

  Install: 61 packages

  Total download: 848MB

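This results in an unusable library: the solver picks the CUDA build of openmm-torch (cuda112py311h20aef98_0) and pulls in cudatoolkit 11.8.0, even though pytorch is the CPU build.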

Installed packages

N/A

Environment info

N/A
@RaulPPelaez RaulPPelaez added the bug Something isn't working label Aug 16, 2023
@RaulPPelaez
Contributor Author

OK, this works:

$ mamba install "openmm-torch=1.1=cpu*" pytorch-cpu=2

So I guess the issue is that the CPU build is not being chosen automatically.
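
For anyone scripting around this, here is a minimal sketch of the workaround above. It assumes the build-string naming seen in this thread (CPU builds start with "cpu", CUDA builds start with "cuda"); the helper names build_variant and cpu_pin are hypothetical and not part of any conda or mamba API.

```python
def build_variant(build: str) -> str:
    """Classify a conda build string as 'cpu', 'cuda', or 'unknown'.

    Assumption: the prefix convention visible in this thread's solver
    output, e.g. 'cpu_py311h376c5b7_0' vs 'cuda112py311h20aef98_0'.
    """
    lowered = build.lower()
    if lowered.startswith("cpu"):
        return "cpu"
    if lowered.startswith("cuda"):
        return "cuda"
    return "unknown"


def cpu_pin(name: str, version: str) -> str:
    """Build a spec of the form used above, e.g. 'openmm-torch=1.1=cpu*'."""
    return f"{name}={version}=cpu*"


if __name__ == "__main__":
    # Build strings copied from the transaction in the first comment.
    print(build_variant("cuda112py311h20aef98_0"))
    print(build_variant("cpu_mkl_py311hd1ebf82_101"))
    print(cpu_pin("openmm-torch", "1.1"))
```

The resulting spec string can be passed to mamba install exactly as in the command above (quoted, so the shell does not expand the asterisk).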

@mikemhenry
Contributor

Can you run mamba info -a on a system that has this issue? I wonder if something weird is going on with the virtual packages.

@RaulPPelaez
Contributor Author

$ mamba info -a

          mamba version : 1.4.9
     active environment : temp
    active env location : /home/raul/mambaforge/envs/temp
            shell level : 2
       user config file : /home/raul/.condarc
 populated config files : /home/raul/mambaforge/.condarc
                          /home/raul/.condarc
          conda version : 22.9.0
    conda-build version : not installed
         python version : 3.10.8.final.0
       virtual packages : __linux=6.4.12=0
                          __glibc=2.37=0
                          __unix=0=0
                          __archspec=1=x86_64
       base environment : /home/raul/mambaforge  (writable)
      conda av data dir : /home/raul/mambaforge/etc/conda
  conda av metadata url : None
           channel URLs : https://conda.anaconda.org/conda-forge/linux-64
                          https://conda.anaconda.org/conda-forge/noarch
          package cache : /home/raul/mambaforge/pkgs
                          /home/raul/.conda/pkgs
       envs directories : /home/raul/mambaforge/envs
                          /home/raul/.conda/envs
               platform : linux-64
             user-agent : conda/22.9.0 requests/2.28.1 CPython/3.10.8 Linux/6.4.12-200.fc38.x86_64 fedora/38 glibc/2.37
                UID:GID : 1000:1000
             netrc file : None
           offline mode : False

# conda environments:
#
base                     /home/raul/mambaforge
temp                  *  /home/raul/mambaforge/envs/temp

sys.version: 3.10.8 | packaged by conda-forge | (main...
sys.prefix: /home/raul/mambaforge
sys.executable: /home/raul/mambaforge/bin/python
conda location: /home/raul/mambaforge/lib/python3.10/site-packages/conda
conda-build: None
conda-env: /home/raul/mambaforge/bin/conda-env
user site dirs: ~/.local/lib/python3.11

CIO_TEST: <not set>
CONDA_DEFAULT_ENV: temp
CONDA_EXE: /home/raul/mambaforge/bin/conda
CONDA_PREFIX: /home/raul/mambaforge/envs/temp
CONDA_PREFIX_1: /home/raul/mambaforge
CONDA_PROMPT_MODIFIER: (temp) 
CONDA_PYTHON_EXE: /home/raul/mambaforge/bin/python
CONDA_ROOT: /home/raul/mambaforge
CONDA_SHLVL: 2
CPATH: <not set>
CURL_CA_BUNDLE: <not set>
LD_LIBRARY_PATH: /usr/local/cuda/lib64:
LIBRARY_PATH: /usr/local/cuda/lib64:
MANPATH: /usr/share/lmod/lmod/share/man:
MODULEPATH: /etc/modulefiles:/usr/share/modulefiles:/usr/share/modulefiles/Linux:/usr/share/modulefiles/Core:/usr/share/lmod/lmod/modulefiles/Core
MOZ_GMP_PATH: /usr/lib64/mozilla/plugins/gmp-gmpopenh264/system-installed
PATH: /home/raul/mambaforge/envs/temp/bin:/home/raul/mambaforge/condabin:/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin
REQUESTS_CA_BUNDLE: <not set>
SSL_CERT_FILE: <not set>
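
The output above lists no __cuda entry under "virtual packages". A small sketch for checking that programmatically follows; it assumes the text layout printed by mamba info in this thread (a "virtual packages :" key followed by indented continuation lines), and the function name virtual_packages is hypothetical.

```python
def virtual_packages(info_text: str) -> list:
    """Extract virtual package names from `mamba info`-style text.

    Assumption: the layout shown in this thread, where the first entry
    follows 'virtual packages :' and further entries sit on their own
    lines, each starting with a double underscore.
    """
    found = []
    in_block = False
    for line in info_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("virtual packages"):
            in_block = True
            # Keep only the value after the first colon.
            stripped = stripped.partition(":")[2].strip()
        elif in_block and not stripped.startswith("__"):
            break  # reached the next key, e.g. 'base environment'
        if in_block and stripped.startswith("__"):
            found.append(stripped.split("=")[0])
    return found


# Excerpt copied from the output above.
SAMPLE_INFO = """\
       virtual packages : __linux=6.4.12=0
                          __glibc=2.37=0
                          __unix=0=0
                          __archspec=1=x86_64
       base environment : /home/raul/mambaforge  (writable)
"""

if __name__ == "__main__":
    names = virtual_packages(SAMPLE_INFO)
    print(names)
    print("__cuda present:", "__cuda" in names)
```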

@mikemhenry
Contributor

Hmmm so this CONDA_OVERRIDE_CUDA="" micromamba create -n foo --dry-run openmm-torch gives me

openmm-torch                      1.1  cpu_py311h376c5b7_0        conda-forge      199kB
pytorch                         2.0.0  cpu_mkl_py311hefb3434_101  conda-forge       61MB

That looks right. Both CONDA_OVERRIDE_CUDA="" micromamba create -n foo --dry-run openmm-torch pytorch=1 and CONDA_OVERRIDE_CUDA="" micromamba create -n foo --dry-run openmm-torch pytorch=2 also do what I expect.

Looking at mamba now, CONDA_OVERRIDE_CUDA="" mamba install --dry-run openmm-torch pytorch gives me

openmm-torch            1.1  cuda112py311h20aef98_0
pytorch               2.0.0  cpu_mkl_py311had667d7_101

I am not sure why it is pulling in the GPU build.
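
To compare the two solvers without eyeballing the tables, one could pull the selected build for a package out of each dry-run output. This is only a sketch: it assumes whitespace-separated "name version build" rows like the ones quoted above (with an optional leading "+"), and selected_build is a hypothetical helper, not a mamba feature.

```python
from typing import Optional, Tuple


def selected_build(solver_output: str, package: str) -> Optional[Tuple[str, str]]:
    """Return (version, build) for `package`, or None if it is not listed."""
    for line in solver_output.splitlines():
        fields = line.split()
        if fields and fields[0] == "+":
            fields = fields[1:]  # drop the install marker used in full tables
        if len(fields) >= 3 and fields[0] == package:
            return fields[1], fields[2]
    return None


# Rows copied from the two dry runs quoted above.
MICROMAMBA_ROWS = """\
openmm-torch                      1.1  cpu_py311h376c5b7_0        conda-forge      199kB
pytorch                         2.0.0  cpu_mkl_py311hefb3434_101  conda-forge       61MB
"""
MAMBA_ROWS = """\
openmm-torch            1.1  cuda112py311h20aef98_0
pytorch               2.0.0  cpu_mkl_py311had667d7_101
"""

if __name__ == "__main__":
    for label, rows in (("micromamba", MICROMAMBA_ROWS), ("mamba", MAMBA_ROWS)):
        print(label, selected_build(rows, "openmm-torch"))
```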

@mikemhenry
Contributor

tl;dr
CONDA_OVERRIDE_CUDA="" mamba create -n foo --dry-run openmm-torch pulls in a GPU build (incorrect), whereas CONDA_OVERRIDE_CUDA="" micromamba create -n foo --dry-run openmm-torch pulls in a CPU build (correct).

@RaulPPelaez
Contributor Author

Might be a mamba bug then?
Even with the override set to empty, is CUDA installed on the system you are testing?

@mikemhenry
Contributor

In which case? And yes, I do think this is a mamba bug. We may be doing something slightly wrong, though: if you just install pytorch, I think mamba works correctly (as in, it pulls in a CPU build), but I am not sure what we are doing differently packaging-wise.

@sef43

sef43 commented Nov 2, 2023

I think this would be fixed if the CUDA build had a __cuda virtual package dependency (this is one of the key differences between the pytorch=*=cpu and pytorch=*=gpu packages). However, this previous issue explains why we don't have a non-CUDA build: #8 (comment)

The reason mamba installs the CUDA version over the cpu version (on my machine) is simply because the timestamp of the CUDA build is newer:

CONDA_OVERRIDE_CUDA="" mamba create -n foo --dry-run python=3.11 openmm-torch -c conda-forge -vvv
info     libsolv  prune_to_best_version_conda 10
info     libsolv  - openmm-torch-1.0-cuda112py311hfd30f1a_0 [367551]
info     libsolv  - openmm-torch-1.0-cuda112py311hfd30f1a_1 [367552]
info     libsolv  - openmm-torch-1.1-cpu_py311hca44478_0 [367566]
info     libsolv  - openmm-torch-1.1-cuda112py311h20aef98_0 [367570]
info     libsolv  - openmm-torch-1.2-cpu_py311hca44478_0 [367574]
info     libsolv  - openmm-torch-1.2-cuda112py311h20aef98_0 [367578]
info     libsolv  - openmm-torch-1.3-cpu_py311hca44478_0 [367582]
info     libsolv  - openmm-torch-1.3-cuda112py311h20aef98_0 [367586]
info     libsolv  - openmm-torch-1.4-cpu_py311hca44478_0 [367590]
info     libsolv  - openmm-torch-1.4-cuda112py311hf760662_0 [367594]
info     libsolv  Fallback to timestamp comparison: 1696956665 vs 1696956917: [1]
info     libsolv  Selecting variant [b] of (a) openmm-torch-1.4-cpu_py311hca44478_0 vs (b) openmm-torch-1.4-cuda112py311hf760662_0 (score: 1)
info     libsolv  creating a branch [data=4582244]:
info     libsolv    - openmm-torch-1.4-cuda112py311hf760662_0
info     libsolv    - openmm-torch-1.4-cpu_py311hca44478_0
info     libsolv  installing openmm-torch-1.4-cuda112py311hf760662_0
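
A toy model of the behaviour described in this comment, under loud assumptions: it is not libsolv's actual algorithm, only the last step the log shows (newest timestamp wins among otherwise equal candidates) plus the proposed __cuda gate. Candidate and pick are hypothetical names, and the pairing of each timestamp with a build follows the (a)/(b) order in the log, which is itself an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    build: str
    timestamp: int
    requires: frozenset = frozenset()  # virtual packages this build depends on


def pick(candidates, virtual_packages):
    """Drop candidates whose virtual-package requirements are unmet,
    then fall back to the newest timestamp (toy model only)."""
    available = frozenset(virtual_packages)
    installable = [c for c in candidates if c.requires <= available]
    if not installable:
        raise ValueError("no installable candidate")
    return max(installable, key=lambda c: c.timestamp)


cpu = Candidate("cpu_py311hca44478_0", 1696956665)
cuda_ungated = Candidate("cuda112py311hf760662_0", 1696956917)
cuda_gated = Candidate("cuda112py311hf760662_0", 1696956917, frozenset({"__cuda"}))

if __name__ == "__main__":
    no_gpu = {"__linux", "__glibc", "__unix", "__archspec"}
    # Without a __cuda dependency the newer CUDA build wins, as in the log.
    print(pick([cpu, cuda_ungated], no_gpu).build)
    # With a __cuda dependency the CUDA build is filtered out on this machine.
    print(pick([cpu, cuda_gated], no_gpu).build)
```

In this model the machine without __cuda only ever sees the CPU build once the CUDA build declares the dependency, so the timestamp comparison never gets a chance to choose wrongly.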

@RaulPPelaez
Contributor Author

I believe this is the same issue detected in openmm/openmm-ml#67.
Hopefully it will be fixed by #49.

@RaulPPelaez
Contributor Author

The original issue has been solved by #49. Running the following now installs the CPU versions of everything:

$ mamba create -n test openmm-torch pytorch-cpu

Looking for: ['openmm-torch', 'pytorch-cpu']

warning  libmamba Cache file "/shared/raul/mambaforge/pkgs/cache/497deca9.json" was modified by another program
warning  libmamba Cache file "/shared/raul/mambaforge/pkgs/cache/09cdf8bf.json" was modified by another program
conda-forge/noarch                                  13.4MB @  32.3MB/s  1.2s
conda-forge/linux-64                                32.2MB @  22.2MB/s  7.0s
Transaction

  Prefix: /shared/raul/mambaforge/envs/test

  Updating specs:

   - openmm-torch
   - pytorch-cpu


  Package                 Version  Build                        Channel           Size
────────────────────────────────────────────────────────────────────────────────────────
  Install:
────────────────────────────────────────────────────────────────────────────────────────
...
  + pytorch                 2.1.2  cpu_generic_py310h5d8fa8e_1  conda-forge       26MB
  + pytorch-cpu             2.1.2  cpu_generic_py310h9d11763_1  conda-forge       22kB
  + openmm-torch              1.4  cpu_py310h9717ab3_3          conda-forge      216kB

3 participants