
Shim package to interact with an **existing** NVCC install #8229

Merged (20 commits), Aug 16, 2019

Conversation

Member

@jakirkham jakirkham commented Apr 29, 2019

I have been using this to build some GPU packages for work. I'm posting it here as a draft PR to solicit feedback and start the discussion toward building a community standard. Please share your thoughts.

Summary: This creates a small shim package designed to present an existing NVCC compiler, installed by the user, to conda-build. Additionally, it attaches the cudatoolkit package from defaults, with an appropriate version constraint, as a runtime requirement to anything that uses this compiler. The goal is to make it easy to build Conda packages that use GPUs. It works well in conjunction with the Docker image proposed in PR ( conda-forge/docker-images#93 ).

cc @kkraus14 @mike-wendt @mrocklin @datametrician @msarahan @seibert @jjhelmus @soumith (feel free to include others I may have missed)

Checklist

  • License file is packaged (see here for an example)
  • Source is from official source
  • Package does not vendor other packages
  • Build number is 0
  • GitHub users listed in the maintainer section have posted a comment confirming they are willing to be listed there

This adds a shim package to help conda-build interact with an
**existing** nvcc install. Through this shim compiler package,
users are able to ensure an appropriate cudatoolkit version is
attached to the package and thus make sure that requirement is
met at install time.
@conda-forge-linter

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipes/nvcc) and found it was in an excellent condition.

CUDA_HOME="$(dirname $(dirname $(which nvcc)))"
fi

# Set `CUDA_HOME` in an activation script.
Contributor

If CUDA_HOME is already set should we store the value into BACKUP_CUDA_HOME or similar and restore it in the deactivate script?

Member Author

We could do that. Though this will probably be a little gross to write/read (we need to escape some stuff). Also it probably won't matter that much if users are already using this with our Docker image (the intent here).
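For reference, the suggested backup/restore could be sketched roughly as follows (a minimal, illustrative sketch, not the recipe's actual code; the `_CONDA_BACKUP_CUDA_HOME` name and the `/opt/mycuda` and `/usr/local/cuda` paths are made up for the example):

```shell
# Illustrative demo: preserve a pre-existing CUDA_HOME on activate
# and restore it on deactivate. All names and paths are hypothetical.
export CUDA_HOME="/opt/mycuda"   # pretend the user already had this set

# --- activate.d script ---
if [ -n "${CUDA_HOME:-}" ]; then
    # Stash the user's value so deactivation can restore it.
    export _CONDA_BACKUP_CUDA_HOME="${CUDA_HOME}"
fi
export CUDA_HOME="/usr/local/cuda"

# --- deactivate.d script ---
if [ -n "${_CONDA_BACKUP_CUDA_HOME:-}" ]; then
    # Restore whatever was set before activation.
    export CUDA_HOME="${_CONDA_BACKUP_CUDA_HOME}"
    unset _CONDA_BACKUP_CUDA_HOME
else
    unset CUDA_HOME
fi
```

After the deactivate portion runs, `CUDA_HOME` is back to the user's original value (or unset if it was never set).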

mkdir -p "${PREFIX}/etc/conda/activate.d"
cat > "${PREFIX}/etc/conda/activate.d/${PKG_NAME}_activate.sh" <<EOF
#!/bin/bash
export CUDA_HOME="${CUDA_HOME}"
Member

This hardcodes the CUDA install location, right? How about making the user supply it? (This is what we do with the macOS SDK and CONDA_BUILD_SYSROOT.)

Member Author

It does. Though the intent is for this to come from the docker image where this location is fixed. So it probably doesn't matter much that it is hardcoded.

I suppose we could do that. Though I'm not clear on what the intent would be. Do we want users to use this package outside of the Docker image? It's not clear to me how we can do that reasonably yet. Would prefer getting this out in a limited form and seeing what we can accomplish with it already before thinking about other ways to use it.

Member

Do we want users to use this package outside of the Docker image?

Yes. CONDA_BUILD_SYSROOT is a way for users to use the macOS compilers outside of conda-build.

I think we want to get this correct the first time to ensure that we don't need to keep on supporting older conventions.

Member Author

Users are still able to build packages with this convention outside of conda-forge (I have been doing that for a while, in fact). This works the same as other Linux builds: users run a build script that starts one of our Docker images and does the build within it.

No objections to getting things right. Am just not seeing how this is causing issues. Could you please clarify on the problems you are seeing here?

Member

What happens to users who don't have access to docker?

For example, I was building qt on a machine without access to docker, and it worked, because you really don't need the docker image. With this package, you are tying the building of conda packages to the docker image.

Member Author

@jakirkham jakirkham Aug 15, 2019

This opens us up to figuring out how people have installed the NVIDIA toolchain, whether it was done correctly, and trying to debug the various issues they may encounter by doing this incorrectly. I'd prefer not to have that exposure by limiting our supported case to one that we know behaves well.

Member

Okay, then please add some code to the activate script that errors out if ${CUDA_HOME} or ${CUDA_HOME}/lib64/stubs/libcuda.so is not found.

Member Author

Sounds reasonable. Have pushed such a test. Please take a look.
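For the record, such a check might look roughly like this (a hedged sketch, not the exact code pushed to the recipe; the function name is made up):

```shell
# Illustrative sanity check: fail activation loudly if the expected
# NVCC install or the driver stub library is missing.
check_cuda_home() {
    cuda_home="$1"
    if [ ! -d "${cuda_home}" ]; then
        echo "ERROR: CUDA_HOME=${cuda_home} does not exist." >&2
        return 1
    fi
    if [ ! -f "${cuda_home}/lib64/stubs/libcuda.so" ]; then
        echo "ERROR: ${cuda_home}/lib64/stubs/libcuda.so not found." >&2
        return 1
    fi
}
```

The activation script would call something like `check_cuda_home "${CUDA_HOME}" || return 1` before exporting anything.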

Member

Can you also check that the nvcc version matches the package version?

Member Author

Done.
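A version check of this sort might be sketched as follows (illustrative only; `nvcc --version` does print a "release X.Y" line, but the helper name and the sample output string here are assumptions, and the real check would compare against the shim package's own version, e.g. `${PKG_VERSION}`):

```shell
# Extract "X.Y" from the "release X.Y" line of `nvcc --version` output.
parse_nvcc_release() {
    sed -n 's/^.*release \([0-9][0-9]*\.[0-9][0-9]*\).*$/\1/p'
}

# Demo with a captured sample line rather than a live nvcc call.
sample="Cuda compilation tools, release 9.2, V9.2.148"
version="$(printf '%s\n' "${sample}" | parse_nvcc_release)"
echo "${version}"
```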

@jakirkham
Member Author

As there were a few questions about the CUDA_HOME environment variable, I think we can probably just drop that. It's mostly a holdover from an earlier iteration of this recipe. Hopefully that addresses a few concerns. 🙂

@soumith
Contributor

soumith commented May 28, 2019

looks reasonable to me, thanks for the draft PR!

We need to find the compiler's sysroot (which the compiler itself can report) and symlink `libcuda` into it so that the library is found during a build.
We need a way of forwarding the compiler preference to `nvcc`; namely, we want to ensure that the Anaconda C++ compiler is used. This script accomplishes that by adding a flag specifying the right compiler to use.
Provides a simple test for compiling some CUDA code.
@jakirkham
Member Author

After playing with this a bit more, I found it needed a few more changes. In particular, this now contains a small shell script that calls nvcc, as opposed to the symlink it had previously. The shell script is there merely to pass -ccbin with the Anaconda C++ compiler specified (along with any other user-provided arguments), thus making sure the Anaconda compiler is used. This seems to handle a few more use cases.

In addition, I have added a small compilation test that builds some trivial CUDA code.
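The wrapper's job can be pictured roughly like this (a sketch of the idea only, not the PR's script: the paths are made up, and the real wrapper would end in an `exec` of nvcc rather than printing the command line):

```shell
# Illustrative wrapper logic: forward every user argument to the real
# nvcc while forcing the Anaconda C++ compiler via -ccbin. Here we just
# build and print the command line so the behavior is visible; a real
# wrapper would exec it instead.
CUDA_HOME="/usr/local/cuda"                      # hypothetical path
CXX="/opt/conda/bin/x86_64-conda-linux-gnu-c++"  # hypothetical path

nvcc_command() {
    echo "${CUDA_HOME}/bin/nvcc -ccbin ${CXX} $*"
}

nvcc_command -o hello hello.cu
```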

Ensure that CUDA includes are added to the C and C++ compiler flags so
they can be found when building.
@djsutherland djsutherland mentioned this pull request Jul 31, 2019
6 tasks
As the Docker image will know what version it is, just copy that version
into the `nvcc` package when creating it. After all, the point of this
package is to act as a compiler shim for `nvcc` in a Docker container
with that version.
@conda-forge-linter

Hi! This is the friendly automated conda-forge-linting service.

I wanted to let you know that I linted all conda-recipes in your PR (recipes/nvcc) and found some lint.

Here's what I've got...

For recipes/nvcc:

  • Package version doesn't match conda spec

This should ensure re-rendering goes smoothly, as re-rendering mocks the
architecture anyway, which could have ill effects when creating the
feedstock.
Should make the linter happier.
@conda-forge-linter

Hi! This is the friendly automated conda-forge-linting service.

I wanted to let you know that I linted all conda-recipes in your PR (recipes/nvcc) and found some lint.

Here's what I've got...

For recipes/nvcc:

  • The home item is expected in the about section.
  • The license item is expected in the about section.
  • The summary item is expected in the about section.
  • The recipe must have a build/number section.

@conda-forge-linter

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipes/nvcc) and found it was in an excellent condition.

@jakirkham jakirkham changed the title WIP: Shim package to interact with an **existing** NVCC install Shim package to interact with an **existing** NVCC install Aug 15, 2019
@jakirkham jakirkham marked this pull request as ready for review August 15, 2019 17:32
@jakirkham
Member Author

@conda-forge/core, this is ready for a final review. 😄

Builds will fail here as we need to use the CUDA Docker images. However we can't easily use them without this package in place first. So I propose we get this through (as long as we are happy with it). Then we can deal with the rest of the configuration issues on the feedstock.

@@ -0,0 +1,64 @@
{% set name = "nvcc" %}
{% set version = environ.get("CUDA_VER", "9.2") %}
Member

Can you add a recipe/conda_build_config.yaml with:

CUDA_VER:
  - 9.2
  - 10.0

and use the value like set version = CUDA_VER?

Member Author

@jakirkham jakirkham Aug 15, 2019

The 9.2 is really only here to make the linter happy. Personally I'd rather not have it here at all if it could be avoided.

Basically what this is doing is picking up CUDA_VER from the docker image.

@jakirkham
Member Author

Any other thoughts @conda-forge/core? If not, am planning on merging soon 🙂

- libgcc-ng
run_exports:
strong:
- cudatoolkit {{ version }}|{{ version }}.*
Member

@isuruf isuruf Aug 16, 2019

You need the run_exports of the C++ compiler runtime here as well.

Member Author

Why? If the user is using the C++ compiler, that will be added by this anyways.

Member

But the runtimes like libgcc-ng and libstdcxx-ng will not be added. run_exports is not transitive.

Member Author

I don't think users should be relying on nvcc to add these things. If they are using a C or C++ compiler, they should list those requirements explicitly in their recipe.

Member

Then the C and C++ compilers should not be in the run requirements. The way the compiler packages are set up, if the user tries to use a C++ compiler in the docker image, it will not be found unless it is an explicit requirement. Having the explicit requirement means that the run_exports for libgcc-ng is added.
But now, if the user adds nvcc_linux-64, they get the C++ compiler but no run_exports for libgcc-ng.

Member Author

That makes sense. I have dropped the C and C++ compilers from run, so they won't be installed now. The test has been updated to list the C and C++ compilers as requirements of the test.

- test -f "${PREFIX}/etc/conda/{{ state }}.d/{{ PKG_NAME }}_{{ state }}.sh"
{% endfor %}
# Try using the activation scripts.
- source ${PREFIX}/etc/conda/activate.d/{{ PKG_NAME }}_activate.sh
Member

Can you remove this line? It should have been run already

Member Author

We could, but this makes it clearer to the reader what we are testing. Explicit is better than implicit.

Member

But then, we are not testing that CUDA_HOME is automatically set. We are checking that CUDA_HOME is set when that script is run.

Member Author

I see your point. That said, do these test modifications need to be handled in staged-recipes or can they be handled in the feedstock? Note that more work will need to be done in the feedstock anyways.

Member

Okay. If you fix the other issue, then I'll merge. Also, please wait for my review on a PR in the feedstock before this package is released.

Member Author

As a higher-level comment: making these suggestions is good. I just want to make sure we are also progressing on the larger objectives they sit within.

I switched things around so that we now test activation works without running the activation script first. I also added an explicit test of the activation script after running/testing the deactivation script. This allowed a little cleanup around the build step, so hopefully it is a bit nicer. We can discuss more in the feedstock.

Do not explicitly include the C and C++ compilers as users should
explicitly add these as requirements of their package.
This shows users how they should be using `nvcc` with other compilers
and what this looks like. Also tests the expected use case to make sure
it works correctly.

extra:
recipe-maintainers:
- jakirkham
Member

Please add me as well.

Member Author

Sure, I would be happy to have your help here. 🙂 Added you in the last commit.

Make sure that activation works without running the activation script.
Also test the activation script works as expected when run explicitly.
These should no longer be needed as the compiler's environment is
properly activated now after running the activation tests.
@jakirkham
Member Author

I think I've addressed all of your concerns, @isuruf. Could you please take another look? 🙂

@isuruf isuruf merged commit acf336b into conda-forge:master Aug 16, 2019
@jakirkham jakirkham deleted the add_nvcc_wrapper branch August 16, 2019 19:17
@jakirkham
Member Author

Thanks @isuruf! 🎉
