"set_mempolicy: Operation not permitted" and performance degradation in 8GPU with single machine #17
Comments
Looks like Docker prevented numactl from setting mempolicy. We will need to confirm if this persists on our testbeds @ymjiang. As a quick workaround, perhaps try |
@changlan I try |
It may have something to do with NUMA. Can you try As far as we have tested, on our testbed there is no such problem. @changlan |
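To check whether NUMA is even in play on the host or inside the container, one option is to count the NUMA nodes the kernel exposes. The following is only a minimal sketch, not something taken from this thread: the helper name `count_numa_nodes` is hypothetical, and it assumes a Linux system that lists nodes as `node0`, `node1`, ... under `/sys/devices/system/node`.

```python
import os
import re


def count_numa_nodes(sysfs_root="/sys/devices/system/node"):
    """Count directory entries named node<N> under sysfs_root.

    Hypothetical helper for illustration. Returns 0 if the path cannot
    be listed (for example on a non-Linux system or a restricted container).
    """
    try:
        entries = os.listdir(sysfs_root)
    except OSError:
        return 0
    # Only entries such as "node0" or "node1" are NUMA nodes; files like
    # "online" or "possible" in the same directory are ignored.
    return sum(1 for name in entries if re.fullmatch(r"node\d+", name))


if __name__ == "__main__":
    nodes = count_numa_nodes()
    if nodes > 1:
        print(f"{nodes} NUMA nodes visible; memory placement may matter here")
    else:
        print(f"{nodes} NUMA node(s) visible")
```

Running this both on the host and inside the container would show whether the two see the same topology. It does not tell you whether `set_mempolicy` is permitted, only whether there is more than one node to bind to.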
@ymjiang |
@ymjiang Let's change the default value to 8. Only very experienced BytePS users may know whether a <8 value is better for their environment. |
Default value changed to 8 now (7c4dd67). Closing this issue. |
@burness Can you clean the cached image and try again now? We just pushed a new image. |
@ymjiang @changlan I have switched to the new image, and when I run
It seems a port is causing the error |
@ymjiang Another question: I see the backtrace (bt) error info in the log. How can I get the core dump file? I set |
@ymjiang Did you miss "--net=host" in all your commands in the tutorial? |
@burness Can you add |
Fixed the tutorial: ec55073 |
@burness Thank you for the reminder. On our platform there seems to be no such problem for Anyway, we will fix this just in case. |
Complementary improvement of bytedance/ps-lite#16
Describe the bug
I use 4 GPUs (1080 Ti) in a single machine and it performs well, but when I use 8 GPUs, BytePS shows a performance degradation: from 161.8 img/sec per GPU to 17.4 img/sec per GPU, along with the warning "set_mempolicy: Operation not permitted".
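To put the reported numbers in perspective, here is a small arithmetic sketch. It uses only the two per-GPU figures quoted above; the function name `aggregate_throughput` is made up for illustration, and the aggregate values assume every GPU runs at the reported per-GPU rate.

```python
def aggregate_throughput(per_gpu_img_per_sec, num_gpus):
    """Total images/sec, assuming every GPU sustains the same per-GPU rate."""
    return per_gpu_img_per_sec * num_gpus


# Figures reported in this issue.
per_gpu_4 = 161.8  # img/sec per GPU with 4 GPUs
per_gpu_8 = 17.4   # img/sec per GPU with 8 GPUs

total_4 = aggregate_throughput(per_gpu_4, 4)
total_8 = aggregate_throughput(per_gpu_8, 8)

# How many times slower each GPU runs in the 8-GPU case.
per_gpu_slowdown = per_gpu_4 / per_gpu_8

# How many times slower the whole machine runs despite doubling the GPUs.
aggregate_slowdown = total_4 / total_8

print(f"4 GPUs: {total_4:.1f} img/sec total")
print(f"8 GPUs: {total_8:.1f} img/sec total")
print(f"per-GPU slowdown: {per_gpu_slowdown:.2f}x")
print(f"aggregate slowdown: {aggregate_slowdown:.2f}x")
```

Under that assumption, the 8-GPU run is slower in total than the 4-GPU run, not just less efficient per GPU, which suggests a bottleneck rather than ordinary scaling overhead.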
To Reproduce
Steps to reproduce the behavior:
Just change the GPU count in the step-by-step tutorial.