Xgboost GPU support #26

Closed
RAMitchell opened this issue Apr 30, 2019 · 22 comments · Fixed by #84

Comments

@RAMitchell

Hi @aldanor, @beckermr, I am an xgboost dev mostly responsible for the GPU algorithms. We have had a few requests to get a GPU-enabled package up on Anaconda for Linux. What would it take to make this happen? I am fairly new to Anaconda, so any guidance would be appreciated.

Thanks,
Rory

@jakirkham
Member

Hi Rory, thanks for stopping by and filing this request.

Currently conda-forge does not have GPU support, but it is something we are hoping to change.

To get things started, I have submitted a PR (conda-forge/docker-images#93) to create a new Docker image (similar to the one currently used for Linux builds), which is based on existing NVIDIA CUDA Docker images that include NVCC. I have also submitted a PR (conda-forge/staged-recipes#8229), which creates a shim compiler package for conda-build so it can easily use the existing NVCC install in the Docker image. There are a few more things that we will want to do, but that is a good starting point.

Right now I am waiting for community feedback on these two proposals. If you have any thoughts on them, please feel free to chime in.
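
As a rough sketch for readers following along: once both PRs land, a recipe could request the proposed shim the same way it requests the C/C++ compilers. The `{{ compiler('cuda') }}` spelling below mirrors conda-build's existing compiler machinery and is an assumption here, not a finalized interface.

```yaml
# Hypothetical meta.yaml fragment: request the CUDA compiler shim alongside
# the usual compilers, so conda-build picks up the NVCC install baked into
# the CUDA Docker image.
requirements:
  build:
    - {{ compiler('c') }}
    - {{ compiler('cxx') }}
    - {{ compiler('cuda') }}    # the proposed shim package from staged-recipes#8229
```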

@jakirkham
Member

Do you have a Conda recipe you have been using for this currently?

Barring that, what would be the best instructions to follow for building a GPU-enabled package? These, or something else?

Also, what are your version requirements for GPU libraries (e.g. CUDA, NCCL, etc.)? Are there any other library dependencies we should be aware of?

Finally, is there a good test we could run to ensure the build worked correctly?

@RAMitchell
Author

We have a Jenkins-based CI system that builds Python wheels for PyPI.

Here is the Dockerfile: https://github.com/dmlc/xgboost/blob/master/tests/ci_build/Dockerfile.gpu_build

This contains the appropriate NCCL versions. We have been releasing builds with CUDA 8.0 for maximum compatibility. We then test the CPU algorithms in a minimal container (https://github.com/dmlc/xgboost/blob/master/tests/ci_build/Dockerfile.release) to ensure the package still works on a system without a GPU, as well as testing the GPU algorithms in a GPU-enabled container.

When I say testing, I would run the Python tests here (tests/python-gpu), disabling or enabling the multi-GPU tests as appropriate. Additionally, run the Google Test executable testxgboost.exe; you will have to enable Google Tests in the CMake options to generate it.
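
As an editorial aside, the build-and-test flow described above maps roughly onto a conda recipe like the sketch below. The CMake flags (`USE_CUDA`, `GOOGLE_TEST`), the `testxgboost` binary, and the `tests/python-gpu` directory come from the comment and xgboost's build system; the recipe layout itself is only an assumption, not the eventual feedstock.

```yaml
# Hypothetical meta.yaml fragment mirroring the steps described above.
build:
  script:
    - cmake -S . -B build -DUSE_CUDA=ON -DGOOGLE_TEST=ON   # enable GPU algorithms and Google Tests
    - cmake --build build --parallel
    - ./build/testxgboost                                   # run the C++ (Google Test) suite
    - cd python-package && python setup.py install          # install the Python package

test:
  source_files:
    - tests/python-gpu
  requires:
    - pytest
  commands:
    - pytest -v tests/python-gpu    # GPU Python tests; needs a GPU-enabled machine
```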

@jakirkham
Member

Thanks for the info. Have a few more follow-up questions.

Do you require those exact versions in the container? Or are they lower bounds? Are there any versions that you have found to be problematic?

As for CUDA 8.0, PR (conda-forge/docker-images#93) currently proposes building for both CUDA 9.2 and 10.0. Should we consider adding 8.0 as well? How long are you holding onto older CUDA versions? When do you start picking up newer CUDA versions? Do you value building for multiple versions of CUDA? In particular, JIT compilation time has come up before.

Are there particular tests that you have found useful for catching common build or user issues?

@RAMitchell
Author

I haven't really given transitioning between CUDA versions much thought; we started with 8.0 and haven't upgraded since. 9.2 should be fine as well; maybe we should upgrade for PyPI too. Jenkins is currently building 8.0, 9.2, and 10.0 without issues, but there is a known compilation bug with 10.1.

I do not want to add any additional complexity to the user install experience, hence the choice to use the most stable version. If we can release multiple versions without impacting the user experience, then that is beneficial, in particular to avoid JIT compilation for Volta.

I can't recommend any specific tests; I would normally just run all the Python tests and Google Tests.

@jakirkham
Member

Thanks for the additional info.

With CUDA 8.0 it looks like we need an older compiler than what we typically use; I can look into that. I will focus on CUDA 9.2 and CUDA 10.0 in the near term, as that aligns well with what we have. If you could give me an idea of the relative importance of CUDA 8.0 to users, that would be very helpful.

Sure, that makes sense. In the wheel context, picking something old that works broadly sounds like a good choice. Conda is designed to be language-agnostic, so it is very comfortable expressing library dependencies that are not Python-specific (e.g. cudatoolkit versions). That said, I would appreciate your feedback on the packages we build to ensure they are working as expected.

@RAMitchell
Author

RAMitchell commented May 2, 2019

I would say just look at CUDA 9.2 or greater. We may remove CUDA 8.0 support in the short term anyway.

@jakirkham
Member

Sounds good. Thanks for letting me know. 🙂

@hcho3
Contributor

hcho3 commented May 15, 2019

@jakirkham Have you also considered building GPU XGBoost for Windows? I have recently spent time setting up Jenkins CI for Windows (dmlc/xgboost#4463 and dmlc/xgboost#4469), and I found out that you don't actually need a GPU to compile CUDA code. (Running CUDA code will require one, however.) You just need to install the CUDA toolkit. So in principle it should be possible to use AppVeyor or Azure Pipelines to compile CUDA code.

EDIT: Just found an example of installing the CUDA toolkit inside AppVeyor: https://ci.appveyor.com/project/tmcdonell/cuda/builds/24181692/job/7qum8ca8g7l8qfqn#L6
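
For anyone wanting to try this, a minimal sketch of an appveyor.yml install step along the lines of the linked build is shown below. The installer URL placeholder, silent-install flags, and component names are assumptions drawn from that example, not a verified configuration.

```yaml
# Hypothetical appveyor.yml fragment: install the CUDA toolkit so nvcc is
# available for compilation on a GPU-less Windows worker.
install:
  # download the CUDA network installer (set CUDA_INSTALLER_URL yourself)
  - appveyor DownloadFile %CUDA_INSTALLER_URL% -FileName cuda_setup.exe
  # silent install of only the compiler and runtime components (names illustrative)
  - cuda_setup.exe -s nvcc_10.0 cudart_10.0
  - set PATH=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\bin;%PATH%
  - nvcc --version
```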

@jakirkham
Member

Thanks for raising this @hcho3.

I agree it would be good to think about GPUs on other platforms (including Windows). I have not thought too much about this yet.

Would you be comfortable starting an issue on the webpage repo? That is normally where we discuss things that may affect the org more generally. I would also be happy to raise that issue if you prefer (though I may ask you to fill in the Windows specifics, as you have looked more closely at this 🙂).

@jakirkham
Member

I've seen discussion of a new release coming out. Given this, what is the best way to build a GPU-enabled xgboost package now? @RAMitchell? 🙂

@RAMitchell
Author

RAMitchell commented Oct 17, 2019

I don't think much has really changed on our end regarding our build system. We no longer support CUDA 8.0. We are currently considering an intermediate release before 1.0; I don't know how soon this will happen.

Where did we get to with this last time? Is the infrastructure in place from the conda-forge perspective to build GPU code? I see there is still some work going on with NCCL in #9694. We depend on NCCL for the distributed version (e.g. with Dask) but could still release a single-GPU version to start with.

@jakirkham
Member

Thanks for the follow-up Rory! 😄

An intermediate release would be very welcome.

conda-forge now has the infrastructure to perform the builds and we have tried this out on a few feedstocks so far. We support CUDA 9.2, 10.0, and 10.1. Docs still need to be written, but that shouldn't be a blocker to getting this started here.

The NCCL package is ready to go. I am just waiting on feedback from some other people before going ahead. Though I have no objections to starting more simply here if that makes sense.
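
For context, opting a feedstock into that CUDA matrix would presumably go through the build configuration; a sketch follows, assuming key names along the lines of the pinning used for the CUDA docker images (the exact names conda-forge settles on may differ).

```yaml
# Hypothetical conda_build_config.yaml fragment: one build per CUDA version,
# each compiled with the nvcc shim inside the matching CUDA docker image.
cuda_compiler:
  - nvcc
cuda_compiler_version:
  - "9.2"
  - "10.0"
  - "10.1"
```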

@jakirkham
Member

Does xgboost still have compilation issues with CUDA 10.1, or was that fixed?

@RAMitchell
Author

Fixed afaik

@jakirkham
Member

Is the fix in 0.90 or master?

@RAMitchell
Author

Looks like 0.90 should include the fix: dmlc/xgboost#4475

@RAMitchell
Author

Any updates on this? XGBoost 1.0 has now been released.

@twsl

twsl commented Apr 8, 2020

I am waiting for it as well. What exactly is currently blocking this?

@ksangeek
Contributor

I see that the only other thing missing for updating the conda recipe with GPU support is the cudatoolkit conda package in conda-forge. It will be required at the test stage and at run time.
I see that cudatoolkit-dev and nccl are already available.

Otherwise, we already have a reference for what needs to be done in this recipe branch from the Anaconda defaults channel: https://github.com/AnacondaRecipes/xgboost-feedstock/blob/py-xgboost-gpu/recipe/meta.yaml
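
Putting those pieces together, the GPU-specific part of the recipe would presumably end up looking roughly like the sketch below. It is loosely modelled on the AnacondaRecipes branch linked above; the exact package names, pins, and section placement are assumptions.

```yaml
# Hypothetical meta.yaml fragment for a GPU-enabled xgboost output.
requirements:
  build:
    - {{ compiler('cuda') }}    # nvcc shim
  host:
    - nccl                      # already available on conda-forge
  run:
    - nccl
    - cudatoolkit >=9.2         # the missing runtime piece discussed here
```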

@jakirkham
Member

We are discussing offline how best to proceed at the moment. I will update once we have figured out a plan of action.

cc @JohnZed @quasiben

@anders-wind

Looks like cudatoolkit is available on conda-forge now: https://anaconda.org/conda-forge/cudatoolkit
Just an FYI :)
