[BLOCKING] Latest XGBoost with CUDA fails to compile on Windows #4462

Closed
hcho3 opened this issue May 12, 2019 · 6 comments · Fixed by #4463

Comments

hcho3 commented May 12, 2019

  • OS: Windows Server 2012 R2
  • Compiler: Visual Studio 2017 (15.0.0+26228.76)
  • CUDA: 9.0
  • CMake version: 3.12.0

I checked out the latest code from the master branch and tried to build and got this error:

13>common.cu.obj : error LNK2038: mismatch detected for 'RuntimeLibrary': value 'MD_DynamicRelease' doesn't match value 'MT_StaticRelease' in cli_main.obj [C:\Users\Administrator\Desktop\xgboost\build\runxgboost.vcxproj]
13>hist_util.cu.obj : error LNK2038: mismatch detected for 'RuntimeLibrary': value 'MD_DynamicRelease' doesn't match value 'MT_StaticRelease' in cli_main.obj [C:\Users\Administrator\Desktop\xgboost\build\runxgboost.vcxproj]
13>host_device_vector.cu.obj : error LNK2038: mismatch detected for 'RuntimeLibrary': value 'MD_DynamicRelease' doesn't match value 'MT_StaticRelease' in cli_main.obj [C:\Users\Administrator\Desktop\xgboost\build\runxgboost.vcxproj]
...
...
13>regression_obj.cu.obj : error LNK2001: unresolved external symbol __imp_acoshf
13>gpu_predictor.obj : error LNK2001: unresolved external symbol __imp_acoshf
13>updater_gpu.obj : error LNK2001: unresolved external symbol __imp_acoshf
13>updater_gpu_hist.obj : error LNK2001: unresolved external symbol __imp_acoshf
...

Get the full build log from MSBuild

I suspect that some /MD flags are not being replaced with /MT.

To reproduce:

mkdir build
cd build
cmake .. -G"Visual Studio 15 2017 Win64" -DUSE_CUDA=ON -DCMAKE_VERBOSE_MAKEFILE=ON
"C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Auxiliary\Build\vcvarsall.bat" amd64
msbuild xgboost.sln /m /p:Configuration=Release -fl -flp:logfile=MyProjectOutput.log;verbosity=diagnostic

hcho3 commented May 12, 2019

This time, I'm really hoping to set up CI for Windows for good.

The good news is that I finally managed to get an SSH connection working with Windows EC2 workers. (Public key auth was quite tricky to get right.) Once this issue is fixed, I will submit a PR to add Windows GPU tests.


hcho3 commented May 12, 2019

Indeed, hist_util.cu is getting the wrong compilation flag (/MD instead of /MT):

"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\bin\nvcc.exe" -gencode=arch=compute_35,code="sm_35,compute_35" -gencode=arch=compute_50,code="sm_50,compute_50" -gencode=arch=compute_52,code="sm_52,compute_52" -gencode=arch=compute_60,code="sm_60,compute_60" -gencode=arch=compute_61,code="sm_61,compute_61" -gencode=arch=compute_70,code="sm_70,compute_70" -gencode=arch=compute_70,code="compute_70,compute_70" --use-local-env --cl-version 2017 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.10.25017\bin\HostX86\x64" -x cu -IC:\Users\Administrator\Desktop\xgboost\cub -IC:\Users\Administrator\Desktop\xgboost\include -I"C:\Users\Administrator\Desktop\xgboost\dmlc-core\include" -IC:\Users\Administrator\Desktop\xgboost\rabit\include -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\include" --keep-dir x64\Release -maxrregcount=0 --machine 64 --compile -cudart static --expt-extended-lambda --expt-relaxed-constexpr -lineinfo -Xcompiler="/EHsc -Ob2 -openmp" -D_WINDOWS -DNDEBUG -DXGBOOST_USE_CUDA=1 -DDMLC_LOG_CUSTOMIZE=1 -DXGBOOST_MM_PREFETCH_PRESENT=1 -D"CMAKE_INTDIR="Release"" -DWIN32 -D_WINDOWS -DNDEBUG -DXGBOOST_USE_CUDA=1 -DDMLC_LOG_CUSTOMIZE=1 -DXGBOOST_MM_PREFETCH_PRESENT=1 -D"CMAKE_INTDIR="Release"" -D_MBCS -Xcompiler "/EHsc /W3 /nologo /O2 /FS /Zi /MD /GR" -o objxgboost.dir\Release/common/hist_util.cu.obj "C:\Users\Administrator\Desktop\xgboost\src\common\hist_util.cu" (TaskId:151)


hcho3 commented May 13, 2019

Update: The build is fixed if I manually replace

<Runtime>MD</Runtime>

with

<Runtime>MT</Runtime>

in all of the generated *.vcxproj files.

Now I'd like to find a way to force CMake to produce *.vcxproj files with the MT runtime property.

trivialfis (Member) commented

CUDA source code is split into host and device code during compilation, so there are three kinds of sources: C++ from .cc files, host C++ from .cu files, and device C++ from .cu files. My guess is the second one went wrong because its flags are controlled by CMake CUDA language instead of CXX language.
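
One way to check this guess is to print the per-language flag variables at configure time and compare them. The following is a minimal diagnostic sketch, assuming it is placed in the top-level CMakeLists.txt after the CUDA language has been enabled; the loop variable name `var` is illustrative and not taken from the XGBoost build scripts.

```cmake
# Diagnostic sketch (assumption: runs after project()/enable_language(CUDA),
# so that the flag variables are already initialized).
# Prints the flags CMake will pass for .cc files (CXX) and for the host
# side of .cu files (CUDA) so the two can be compared.
foreach(var
    CMAKE_CXX_FLAGS
    CMAKE_CXX_FLAGS_RELEASE
    CMAKE_CUDA_FLAGS
    CMAKE_CUDA_FLAGS_RELEASE)
  message(STATUS "${var} = ${${var}}")
endforeach()
```

If the CXX variables show /MT while the CUDA variables still show /MD (or -MD), that would be consistent with the runtime mismatch reported by the linker.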

trivialfis (Member) commented

We just need to copy the configuration for CMake CXX to CUDA lang.
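
As an illustration of that idea, the sketch below applies the same /MD to /MT substitution to the flag variables of both languages. It is a hedged example only and not necessarily what #4463 does; it assumes the code runs at the top level of CMakeLists.txt after the CUDA language is enabled, and that the runtime flag appears in the per-configuration flag variables with either a `/` or a `-` prefix (the exact spelling may differ between CMake versions).

```cmake
# Sketch only: use the static MSVC runtime for both CXX and CUDA.
# Assumption: executed at directory scope after the languages are enabled.
if(MSVC)
  foreach(lang CXX CUDA)
    foreach(config DEBUG RELEASE MINSIZEREL RELWITHDEBINFO)
      set(var CMAKE_${lang}_FLAGS_${config})
      if(DEFINED ${var})
        # Rewrites /MD, /MDd, -MD and -MDd; a trailing "d" is preserved,
        # so debug configurations end up with /MTd or -MTd.
        string(REGEX REPLACE "([/-])MD" "\\1MT" ${var} "${${var}}")
      endif()
    endforeach()
  endforeach()
endif()
```

After reconfiguring in a clean build directory, the generated *.vcxproj files can be inspected to confirm that the runtime property reads MT. On CMake 3.15 or newer, the `CMAKE_MSVC_RUNTIME_LIBRARY` variable offers a dedicated way to select the runtime, but the CMake 3.12.0 used in this report predates it.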


hcho3 commented May 13, 2019

> My guess is the second one went wrong because its flags are controlled by CMake CUDA language instead of CXX language.

Interesting.

> We just need to copy the configuration for CMake CXX to CUDA lang.

@trivialfis Do you have a suggestion? So far I came up with #4463 and I'm wondering if you have a more elegant fix.

hcho3 added a commit that referenced this issue May 14, 2019
* Fix #4462: Use /MT flag consistently for MSVC target

* First attempt at Windows CI

* Distinguish stages in Linux and Windows pipelines

* Try running CMake in Windows pipeline

* Add build step

lock bot locked as resolved and limited conversation to collaborators Aug 12, 2019