Skip to content

Conversation

@pearu
Copy link
Contributor

@pearu pearu commented Mar 15, 2020

Checklist

  • Used a fork of the feedstock to propose changes
  • Bumped the build number (if the version is unchanged)
  • Re-rendered with the latest conda-smithy (Use the phrase @conda-forge-admin, please rerender in a comment in this PR for automated rerendering)
  • Ensured the license file is being packaged.

@conda-forge-linter
Copy link
Contributor

Hi! This is the friendly automated conda-forge-linting service.

I wanted to let you know that I linted all conda-recipes in your PR (recipe) and found some lint.

Here's what I've got...

For recipe:

  • Failed to even lint the recipe, probably because of a conda-smithy bug 😢. This likely indicates a problem in your meta.yaml, though. To get a traceback to help figure out what's going on, install conda-smithy and run conda smithy recipe-lint . from the recipe directory.

@conda-forge-linter
Copy link
Contributor

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

@pearu
Copy link
Contributor Author

pearu commented Mar 15, 2020

@conda-forge-admin, please rerender

@github-actions
Copy link
Contributor

Hi! This is the friendly automated conda-forge-webservice.
I tried to rerender for you, but it looks like I wasn't able to push to the pearu/enable-cuda
branch of Quansight/arrow-cpp-feedstock. Did you check the "Allow edits from maintainers" box?

@pearu
Copy link
Contributor Author

pearu commented Mar 15, 2020

There are CI failures:

2020-03-15T20:31:40.8273787Z Traceback (most recent call last):
2020-03-15T20:31:40.8277197Z   File "/opt/conda/lib/python3.7/site-packages/conda_build/environ.py", line 764, in get_install_actions
<snip>
2020-03-15T20:31:40.8292864Z     raise ResolvePackageNotFound(bad_deps)
2020-03-15T20:31:40.8293255Z conda.exceptions.ResolvePackageNotFound: 
2020-03-15T20:31:40.8293952Z   - nvcc_linux-64=None

What is the proper way to fix it?

@pearu
Copy link
Contributor Author

pearu commented Mar 17, 2020

@isuruf do you have ideas how to fix the nvcc_linux-64=None issue reported above?

@isuruf
Copy link
Member

isuruf commented Mar 17, 2020

That's there to enable a build without cuda. Do you want a build without cuda? Or do you want only builds with cuda? If the former, use - {{ compiler('cuda') }} # [linux64 and cuda_compiler_version != 'None']. If the latter, use skip: True # [linux64 and cuda_compiler_version=='None']

@pearu
Copy link
Contributor Author

pearu commented Mar 17, 2020

Thanks @isuruf for the hints! The answer to your question actually depends on whether a single feedstock can support both cuda and non-cuda versions of the packages for the same platform (linux64 here) and how. The aim is that arrow-cpp would be installable to environments without cuda (the current master behavior) and to environments with cuda (the extension from this PR, the package needs to include libarrow_cuda library).

Btw, in this particular case of arrow-cpp package, any cuda version of the package would likely work in environments with no cuda installed, libarrow_cuda library would just not be unused.

@jakirkham
Copy link
Member

jakirkham commented Mar 17, 2020

Would take a look at ucx-split as a simple example of supporting CUDA and non-CUDA builds in the same feedstock.

@pearu
Copy link
Contributor Author

pearu commented Mar 17, 2020

Thanks @jakirkham for the link! The current PR is using the same approach as ucx-split.

@pearu
Copy link
Contributor Author

pearu commented Mar 17, 2020

Please review.
@xhochy

Copy link
Contributor

@kkraus14 kkraus14 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, thanks @pearu!

@jakirkham
Copy link
Member

Will we need similar changes for pyarrow?

@isuruf
Copy link
Member

isuruf commented Mar 17, 2020

Looks like this only links to libcuda.so. Then you can use 9.2 to build against and have one variant right?

@jakirkham
Copy link
Member

jakirkham commented Mar 18, 2020

Yeah sorry Isuru. I may not be able to answer your question as I don't know enough about Arrow's code base. Hopefully someone more knowledgeable there can provide us some idea of what CUDA features they are using 🙂

@xhochy xhochy merged commit 27d886a into conda-forge:master Mar 18, 2020
@xhochy
Copy link
Member

xhochy commented Mar 18, 2020

Will we need similar changes for pyarrow?

Yes, there is a pyarrow._cuda module which gets built only when CUDA is activated.

@pearu pearu deleted the pearu/enable-cuda branch March 18, 2020 16:54
@isuruf
Copy link
Member

isuruf commented Mar 18, 2020

@xhochy, this PR should not have been merged. As it is right now, some users who don't even have a GPU will get a proprietary binary blob (cuda-toolkit).

@xhochy
Copy link
Member

xhochy commented Mar 18, 2020

@xhochy, this PR should not have been merged. As it is right now, some users who don't even have a GPU will get a proprietary binary blob (cuda-toolkit).

Oh, then I didn't understand the how selection mechanism for cuda vs non-cuda works.

@xhochy
Copy link
Member

xhochy commented Mar 18, 2020

@isuruf Is there something easy to fix or should we revert and mark as broken?

@isuruf
Copy link
Member

isuruf commented Mar 18, 2020

It's better to mark as broken for now. There are some issues that needs to be discussed before fixing.

xhochy added a commit that referenced this pull request Mar 18, 2020
@xhochy
Copy link
Member

xhochy commented Mar 18, 2020

@pearu I revert the PR on master, please reopen and then we can discuss the issues @isuruf is talking about.

@isuruf Please request changes in future, so I don't merge this accidentially again ;). From the discussion I thought everything was resolved.

@jakirkham
Copy link
Member

Sorry I'm confused @isuruf, what is the issue here?

@isuruf
Copy link
Member

isuruf commented Mar 18, 2020

Question to @xhochy. If arrow is built with CUDA enabled, do downstream packages need to enable CUDA too? (I know that downstream package can enable CUDA, but do they need to?)
If not, then we can have one single package with CUDA enabled and people who don't want CUDA will only have a small amount of code added.

@jakirkham, there's no features added here, so the conda solver is free to choose a CUDA build which brings in cudatoolkit (unless ignore_run_exports is used) which is a huge proprietary binary blob.

@xhochy
Copy link
Member

xhochy commented Mar 18, 2020

Question to @xhochy. If arrow is built with CUDA enabled, do downstream packages need to enable CUDA too? (I know that downstream package can enable CUDA, but do they need to?)
If not, then we can have one single package with CUDA enabled and people who don't want CUDA will only have a small amount of code added.

No, downstream packages without cuda will just work fine. The CUDA support/usage is contained in libarrow_cuda${SHLIB_EXT}. All other shared objects don't deal link/operate with CUDA.

@jakirkham
Copy link
Member

@jakirkham, there's no features added here, so the conda solver is free to choose a CUDA build which brings in cudatoolkit (unless ignore_run_exports is used) which is a huge proprietary binary blob.

Got it. Yeah this seems like a feature we would want for other reasons. Missed that wasn't included here.

For context @pearu @xhochy, we handle this in the ucx case by adding this ucx-proc output. This allows user selection of CUDA support or not.

Another way to solve this is to add some dlopen logic around loading the CUDA libraries you have built here with fallback handling if libcuda is not available. This is how openmpi solves this problem (as Isuru alluded to earlier).

@jakirkham
Copy link
Member

Sorry I may have misunderstood your comment before, @xhochy. Are you already using dlopen? If so, maybe just adding cudatoolkit to ignore_run_exports and run_constrained would solve this issue.

@xhochy
Copy link
Member

xhochy commented Mar 19, 2020

No, we're not using dlopen. I looked again a bit through the Arrow source code and the following happens:

  • We build libarrow_cuda which links against "cuda", all other libarrow_* libraries are exactly the same, no changed functionality, no link to "cuda"
  • We build the plasma store and libplasma differently based on whether we have CUDA support. Thus we need to have different Arrow builds for with-cuda and without-cuda.

@xhochy
Copy link
Member

xhochy commented Mar 19, 2020

Marking the builds as broken: conda-forge/admin-requests#22

@jakirkham
Copy link
Member

Ok then borrowing ucx-proc should work here (arrow-proc?).

@pearu pearu restored the pearu/enable-cuda branch March 19, 2020 13:21
@pearu
Copy link
Contributor Author

pearu commented Mar 19, 2020

I am not sure how the splitting of arrow-cpp should work.

ucx-split seems to produce two conda packages:

  • ucx - main package
  • ucx-proc with cpu and gpu build labels - I am not sure what it does ?

I can think of the following solution:

  • arrow-cpp - contains everything that the current master provides.
  • arrow-cpp-cuda - contains arrow-cpp plus libarrow_cuda library (and cuda specific libplasma).
    Can this be achieved within one feedstock?

Btw, I have created #125 but it might be nonsense.

@isuruf
Copy link
Member

isuruf commented Mar 19, 2020

Here's an option. Have 2 variants.

  1. With CUDA but with __cuda >=9.2 as a run requirement. This package will be installed only if conda>=4.8 and the host has CUDA driver built
  2. Without CUDA

@jakirkham
Copy link
Member

Though users may want to use the CPU-only package on machines that have a GPU.

@isuruf
Copy link
Member

isuruf commented Mar 19, 2020

@jakirkham, then they can use the build string to get the package that doesn't use the GPU.

@jakirkham
Copy link
Member

jakirkham commented Mar 19, 2020

Build strings of non-mutex package tend to have other things like build numbers, which can make this a bit hard to use in practice.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants