Install error: An ERROR occured while running the Makefile for the ps-lite library. Exit code: 2 #4

Closed
SCismycat opened this issue Jun 27, 2019 · 24 comments
@SCismycat

When I run python setup.py install, it reports this error:
An ERROR occured while running the Makefile for the ps-lite library. Exit code: 2

@byronyi
Member

byronyi commented Jun 27, 2019

Would you mind sharing a bit more of the error log, along with your environment setup (OS version, compiler version, CUDA, etc.)?

@SCismycat
Author

SCismycat commented Jun 27, 2019

CUDA Version 9.0.176
Linux version 3.10.0-862.el7.x86_64 (gcc version 4.8.5 20150623 (Red Hat 4.8.5-28) (GCC)
CentOS 7
Python 3.5 / TF 1.9 / Keras

The error log is as follows:

 warnings.warn(msg)
running install
running bdist_egg
running egg_info
writing byteps.egg-info/PKG-INFO
writing dependency_links to byteps.egg-info/dependency_links.txt
writing top-level names to byteps.egg-info/top_level.txt
reading manifest file 'byteps.egg-info/SOURCES.txt'
writing manifest file 'byteps.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
running build_ext
g++ -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include -std=c++0x -MM -MT build/customer.o src/customer.cc >build/customer.d
g++ -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include -std=c++0x -MM -MT build/postoffice.o src/postoffice.cc >build/postoffice.d
g++ -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include -std=c++0x -MM -MT build/van.o src/van.cc >build/van.d
g++ -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include -std=c++0x -MM -MT build/meta.pb.o src/meta.pb.cc >build/meta.pb.d
g++ -std=c++14 -msse2 -fPIC -O3 -ggdb -Wall -finline-functions -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include  -pthread -c src/postoffice.cc -o build/postoffice.o
g++ -std=c++14 -msse2 -fPIC -O3 -ggdb -Wall -finline-functions -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include  -pthread -c src/customer.cc -o build/customer.o
g++: error: unrecognized command line option ‘-std=c++14’
g++: error: unrecognized command line option ‘-std=c++14’
make: *** [build/postoffice.o] Error 1
make: *** Waiting for unfinished jobs....
make: *** [build/customer.o] Error 1
g++ -std=c++14 -msse2 -fPIC -O3 -ggdb -Wall -finline-functions -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include  -pthread -c src/meta.pb.cc -o build/meta.pb.o
g++: error: unrecognized command line option ‘-std=c++14’
make: *** [build/meta.pb.o] Error 1
g++ -std=c++14 -msse2 -fPIC -O3 -ggdb -Wall -finline-functions -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include  -pthread -c src/van.cc -o build/van.o
g++: error: unrecognized command line option ‘-std=c++14’
make: *** [build/van.o] Error 1
error: An ERROR occured while running the Makefile for the ps-lite library. Exit code: 2

===
Could this be caused by the g++ version?

@ymjiang
Member

ymjiang commented Jun 27, 2019

@SCismycat Could you provide the commands that you use?

Besides, if you do the following:

cd byteps/3rdparty/ps-lite
make clean && make -j 

Does it report the same error?

@SCismycat
Author

My commands were as follows:

git clone --recurse-submodules https://github.com/bytedance/byteps
ls
cd byteps/
ls
python3 setup.py install

I ran the make command; the output is as follows:

 make clean && make -j
rm -rf build  tests/test_connection  tests/test_kv_app_multi_servers  tests/test_simple_app  tests/test_kv_app_multi_workers  tests/test_kv_app_benchmark  tests/test_kv_app tests/*.d tests/*.dSYM
find src -name "*.pb.[ch]*" -delete
/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/bin/protoc --cpp_out=./src --proto_path=./src src/meta.proto
g++ -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include -std=c++0x -MM -MT build/customer.o src/customer.cc >build/customer.d
g++ -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include -std=c++0x -MM -MT build/postoffice.o src/postoffice.cc >build/postoffice.d
g++ -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include -std=c++0x -MM -MT build/van.o src/van.cc >build/van.d
g++ -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include -std=c++0x -MM -MT build/meta.pb.o src/meta.pb.cc >build/meta.pb.d
g++ -std=c++14 -msse2 -fPIC -O3 -ggdb -Wall -finline-functions -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include  -pthread -c src/postoffice.cc -o build/postoffice.o
g++ -std=c++14 -msse2 -fPIC -O3 -ggdb -Wall -finline-functions -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include  -pthread -c src/customer.cc -o build/customer.o
g++: error: unrecognized command line option ‘-std=c++14’
make: *** [build/postoffice.o] Error 1
make: *** Waiting for unfinished jobs....
g++: error: unrecognized command line option ‘-std=c++14’
make: *** [build/customer.o] Error 1
g++ -std=c++14 -msse2 -fPIC -O3 -ggdb -Wall -finline-functions -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include  -pthread -c src/meta.pb.cc -o build/meta.pb.o
g++: error: unrecognized command line option ‘-std=c++14’
make: *** [build/meta.pb.o] Error 1
g++ -std=c++14 -msse2 -fPIC -O3 -ggdb -Wall -finline-functions -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include  -pthread -c src/van.cc -o build/van.o
g++: error: unrecognized command line option ‘-std=c++14’
make: *** [build/van.o] Error 1
rm src/meta.pb.h

@byronyi
Member

byronyi commented Jun 27, 2019

You could try yum install devtoolset-7.

@bobzhuyb
Member

Yes, this is a gcc version problem. BytePS currently requires gcc 4.9 or above.

You can try the suggestion from @byronyi, or use our Dockerfile.

@bobzhuyb bobzhuyb added the good first issue Good for newcomers label Jun 27, 2019
@changlan changlan added the enhancement New feature or request label Jun 28, 2019
@changlan
Contributor

changlan commented Jun 28, 2019

Marking this as "enhancement". Perhaps we could check gcc version explicitly during installation, until we support gcc 4.8.
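
As a rough illustration of the explicit check proposed above, here is a minimal Python sketch that an installer could run before invoking make. It is not the BytePS implementation. The function names and the MIN_GCC constant are hypothetical, and the 4.9 minimum is taken from the statement earlier in this thread. It relies only on the standard GCC option -dumpversion, which prints something like 4.8.5 on older compilers and may print only the major version (for example 7) on newer ones.

```python
import re
import subprocess

# Assumption: minimum taken from this thread ("gcc 4.9 or above").
MIN_GCC = (4, 9)


def parse_version(text):
    """Turn '4.8.5' or '7' into a tuple of ints such as (4, 8, 5) or (7,)."""
    match = re.search(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError("no version number in %r" % text)
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(version_text, minimum=MIN_GCC):
    """Return True when the reported compiler version is at least `minimum`."""
    version = parse_version(version_text)
    # Pad short versions so that (7,) is compared as (7, 0) against (4, 9).
    padded = version + (0,) * (len(minimum) - len(version))
    return padded >= minimum


def detect_gxx_version(compiler="g++"):
    """Ask the compiler for its version string via -dumpversion."""
    out = subprocess.check_output([compiler, "-dumpversion"])
    return out.decode().strip()
```

A caller could then stop early with a readable message, for example by raising an error when meets_minimum(detect_gxx_version()) is false, instead of failing later inside the ps-lite Makefile.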

@Kylin9511

Kylin9511 commented Jun 29, 2019

@byronyi @bobzhuyb @changlan I have run into a problem. I don't have sudo rights on the server, so I have to use an Anaconda virtual environment.

I installed the following gcc 7 packages in Anaconda:

conda install -n torch1.1 -c omgarcia isl
conda install -n torch1.1 -c quantstack gcc-7

But I still get exactly the same error message.

Is there a solution for a non-root user with Anaconda3?

@bobzhuyb
Member

bobzhuyb commented Jun 30, 2019

@luzhilin19951120 When you say exactly the same error message, do you mean this line?

g++: error: unrecognized command line option ‘-std=c++14’

If so, can you try again with the latest master branch? Make sure 3rdparty/ps-lite is updated as well. We recently removed the dependency on C++14.

@Kylin9511

@bobzhuyb Sorry, but could you explain how to make sure 3rdparty/ps-lite is updated?

I tried pulling again. With the latest master, the following error message occurred:

make: *** [/home/luzhilin/software/byteps/3rdparty/ps-lite/deps/include/google/protobuf/message.h] Error 2
error: An ERROR occured while running the Makefile for the ps-lite library. Exit code: 2

@ymjiang
Member

ymjiang commented Jul 1, 2019

To make sure your ps-lite is the latest, cd into your byteps/3rdparty/ps-lite and run git log to see whether the latest commit is 52f042b.

If not, then you are not using the latest ps-lite.
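
The same check can be scripted. Below is a small Python sketch, not part of BytePS, with hypothetical function names. It assumes git is on the PATH and uses the standard git rev-parse HEAD command. The hash 52f042b is the one quoted above and was only the latest commit at the time of this comment.

```python
import subprocess


def submodule_head(path):
    """Return the full commit hash checked out in the git work tree at `path`."""
    # cwd= is used instead of `git -C` so that older git releases also work.
    out = subprocess.check_output(["git", "rev-parse", "HEAD"], cwd=path)
    return out.decode().strip()


def is_expected_commit(head, expected_prefix):
    """Compare a full hash against an abbreviated one, ignoring case."""
    if not expected_prefix:
        return False
    return head.lower().startswith(expected_prefix.lower())
```

For example, is_expected_commit(submodule_head("byteps/3rdparty/ps-lite"), "52f042b") would report whether the submodule sits on the commit named in this comment.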

@Kylin9511

@ymjiang Well, then I am on the latest commit of ps-lite.
image

But I still get the aforementioned error when building the ps-lite library.

@ymjiang
Member

ymjiang commented Jul 1, 2019

@luzhilin19951120 Can you please show more of the error log? The information so far is rather limited.

Also, would you mind trying gcc-4.9?

@Kylin9511

Kylin9511 commented Jul 1, 2019

@ymjiang This is all of the terminal's standard output: make.log

All of the error messages have already been given above.

@ymjiang
Member

ymjiang commented Jul 1, 2019

@luzhilin19951120 Then I would suggest compiling with gcc-4.9, as recommended in the README.

@bobzhuyb
Member

bobzhuyb commented Jul 1, 2019

@luzhilin19951120 I don't see an error in your make.log. Am I missing something? Can you also redirect stderr to the file?

I suggest you do a fresh clone if you don't know how to start over:

git clone --recurse-submodules https://github.com/bytedance/byteps

Also, your problem is that you can't build protobuf. It's different from the first post, and I am not sure whether this is really BytePS's own problem.
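
For capturing both output streams in one file, as asked above, a Python sketch follows. It has the same effect as appending > make.log 2>&1 in the shell. The helper name run_and_log and its parameters are hypothetical, and only standard-library calls are used.

```python
import os
import subprocess


def run_and_log(cmd, log_path, extra_env=None):
    """Run `cmd`, write stdout and stderr to `log_path`, return the exit code."""
    env = dict(os.environ)
    env.update(extra_env or {})
    with open(log_path, "wb") as log:
        # stderr=subprocess.STDOUT merges both streams into the same file.
        return subprocess.call(cmd, stdout=log, stderr=subprocess.STDOUT, env=env)
```

A call such as run_and_log(["python", "setup.py", "install"], "make.log") would then leave compiler errors in the log next to the normal build output.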

@Kylin9511

Kylin9511 commented Jul 1, 2019

@ymjiang @bobzhuyb
Well, I cloned again with the submodules and changed gcc to 4.9.
image

Then I reinstalled BytePS with the following command:
BYTEPS_NCCL_HOME=/usr/local/nccl_2.3.7 BYTEPS_CUDA_HOME=/usr/local/cuda-9.0/ BYTEPS_USE_RDMA=1 python setup.py install > make.log 2>&1

I still got the following error output (redirected, including both stdout and stderr):
make.log

Yes, the problem seems to be protobuf. But it is a dependency of BytePS, and there must be a reason why the build failed.

@bobzhuyb
Member

bobzhuyb commented Jul 1, 2019

tensorflow/tensorflow#5017 (comment)
Have a look at this thread and its upvoted answers.

If you search for the error messages in your make.log, you will find many related issues:

/usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found
/usr/lib64/libstdc++.so.6: version `CXXABI_1.3.8' not found
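
These messages mean that the libstdc++.so.6 found at run time is older than the one the compiler built against. GLIBCXX_3.4.20 and CXXABI_1.3.8 were introduced with GCC 4.9, whereas the libstdc++ shipped with GCC 4.8.5 stops at GLIBCXX_3.4.19. The Python sketch below lists the GLIBCXX version tags a library file contains, much like running strings on it and filtering for GLIBCXX. The function names are hypothetical, and the default path is simply the one from the error message.

```python
import re


def glibcxx_versions(data):
    """Return the sorted GLIBCXX_x.y[.z] version tuples named in a library's bytes."""
    found = set()
    for match in re.finditer(rb"GLIBCXX_(\d+(?:\.\d+)+)", data):
        parts = match.group(1).decode().split(".")
        found.add(tuple(int(part) for part in parts))
    return sorted(found)


def provides(data, required):
    """Return True when the exact version tag, e.g. (3, 4, 20), is present."""
    return required in glibcxx_versions(data)


def read_library(path="/usr/lib64/libstdc++.so.6"):
    """Read the raw bytes of the shared library at `path`."""
    with open(path, "rb") as handle:
        return handle.read()
```

If provides(read_library(), (3, 4, 20)) is false, the system library is too old for binaries built with the newer compiler, which matches the errors above.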

@Kylin9511

@bobzhuyb Yes, it turned out to be a gcc library problem. The system's gcc dynamic library was not updated, because I had only installed gcc 4.9 inside Anaconda.

I managed to set up a gcc 5.4 environment, and the protobuf error is gone.

@Kylin9511

Kylin9511 commented Jul 1, 2019

@bobzhuyb However, another problem appears when building the PyTorch plugin. I could not locate the bug, which seems to be inside build_ext.build_extension(pytorch_lib).

The detailed log is as follows.
make.log

P.S. I installed PyTorch 1.1.0 using:

conda install pytorch torchvision cudatoolkit=9.0 -c pytorch

@bobzhuyb
Member

bobzhuyb commented Jul 1, 2019

byteps/common/global.cc:19:18: fatal error: numa.h: No such file or directory

Install libnuma-dev

Read your own log, check the error message, and install libraries based on the error message...
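
To make that advice concrete, here is a Python sketch that pulls the names of missing headers out of a build log by matching GCC's "fatal error: <header>: ..." lines. The function name and the hint table are hypothetical. libnuma-dev is the package named in this comment and is the Debian/Ubuntu name; on RHEL/CentOS the header is normally provided by numactl-devel, which is an assumption about the poster's system rather than something stated in the thread.

```python
import re

# Matches the header name in lines like
# "global.cc:19:18: fatal error: numa.h: No such file or directory".
_MISSING = re.compile(r"fatal error: ([^\s:]+): ")

# Hypothetical hint table mapping a header to the package that ships it.
PACKAGE_HINTS = {
    "numa.h": "libnuma-dev (Debian/Ubuntu) or numactl-devel (RHEL/CentOS)",
}


def missing_headers(log_text):
    """List header names from 'fatal error' lines, in order, without duplicates."""
    seen = []
    for name in _MISSING.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen
```

Because the pattern only keys on the text before the final message, it works whether the compiler prints that message in English or in another locale.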

@Kylin9511

@bobzhuyb Sorry for the trouble; I am not very familiar with C++-based packages and libraries.

The environment setup seems a little inconvenient for non-root users. I may try Docker later on. I think it would be better if you could release a PyPI package, like Horovod 😄.

@ymjiang
Member

ymjiang commented Jul 1, 2019

@luzhilin19951120 We have already released some pip packages. See https://github.com/bytedance/byteps/blob/master/docs/pip-list.md

@bobzhuyb
Member

bobzhuyb commented Jul 3, 2019

I believe we have addressed all the issues here, including falling back from C++14 to C++11 and providing pip packages. Closing this. Feel free to reopen.

pleasantrabbit pushed a commit that referenced this issue Nov 3, 2020
* initial support for server profiling

* simulate the customer

* add optional profile granularity

* output to json file

* improve format