Multi-GPU training does not work #133

Open
480284856 opened this issue Jul 1, 2022 · 1 comment

@480284856

Synonyms on loading stopwords [/home/guxj/anaconda3/envs/NLP/lib/python3.8/site-packages/synonyms/data/stopwords.txt] ...
Synonyms on loading vectors [/home/guxj/anaconda3/envs/NLP/lib/python3.8/site-packages/synonyms/data/words.vector.gz] ...
Traceback (most recent call last):
File "", line 1, in
File "/home/guxj/anaconda3/envs/NLP/lib/python3.8/multiprocessing/spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "/home/guxj/anaconda3/envs/NLP/lib/python3.8/multiprocessing/spawn.py", line 125, in _main
prepare(preparation_data)
File "/home/guxj/anaconda3/envs/NLP/lib/python3.8/multiprocessing/spawn.py", line 236, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "/home/guxj/anaconda3/envs/NLP/lib/python3.8/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
File "/home/guxj/anaconda3/envs/NLP/lib/python3.8/runpy.py", line 265, in run_path
return _run_module_code(code, init_globals, run_name,
File "/home/guxj/anaconda3/envs/NLP/lib/python3.8/runpy.py", line 97, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "/home/guxj/anaconda3/envs/NLP/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/guxj/code/T5/test2.0.py", line 4, in
import synonyms
File "/home/guxj/anaconda3/envs/NLP/lib/python3.8/site-packages/synonyms/synonyms.py", line 50, in
from .word2vec import KeyedVectors
ImportError: attempted relative import with no known parent package
Traceback (most recent call last):
File "/home/guxj/code/T5/test2.0.py", line 349, in
main()
File "/home/guxj/code/T5/test2.0.py", line 346, in main
mp.spawn(train,nprocs=config.gpus,args=(config,))
File "/home/guxj/anaconda3/envs/NLP/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 240, in spawn
return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
File "/home/guxj/anaconda3/envs/NLP/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 198, in start_processes
while not context.join():
File "/home/guxj/anaconda3/envs/NLP/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 149, in join
raise ProcessExitedException(
torch.multiprocessing.spawn.ProcessExitedException: process 0 terminated with exit code 1

Why does this happen? Importing synonyms on the command line works, but running my multi-GPU training script raises this error.
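The first traceback passes through `multiprocessing/spawn.py` and `runpy.run_path` before reaching line 4 of `test2.0.py`, which suggests that each spawned worker re-runs the training script from the top, including its `import synonyms`. The sketch below shows that behaviour in isolation. The script `spawn_demo.py` and the function names are hypothetical stand-ins, not code from this issue, and the example uses the standard library's `multiprocessing` with the `spawn` start method rather than `torch.multiprocessing`.

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical stand-in for a training script: one top-level statement
# (where `import synonyms` would sit) plus a guarded spawn of one worker.
SCRIPT = textwrap.dedent(
    '''
    import multiprocessing as mp

    print(f"top-level code running as {__name__}", flush=True)

    def worker():
        print("worker running", flush=True)

    if __name__ == "__main__":
        ctx = mp.get_context("spawn")
        p = ctx.Process(target=worker)
        p.start()
        p.join()
    '''
)


def run_spawn_demo() -> str:
    """Write the demo script to a temp dir, run it, and return its stdout."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "spawn_demo.py"
        script.write_text(SCRIPT)
        result = subprocess.run(
            [sys.executable, str(script)],
            capture_output=True,
            text=True,
            check=True,
        )
    return result.stdout


if __name__ == "__main__":
    print(run_spawn_demo(), end="")
```

The top-level `print` runs twice: once in the parent as `__main__` and once in the spawned child as `__mp_main__`. An import placed at the top of the script is therefore executed again in every worker, so an import that succeeds in an interactive session can still fail there if the child resolves it differently.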

@hailiang-wang
Member

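The problem you reported is an important one.

For the cause of the problem, see:

https://blog.csdn.net/weixin_41699811/article/details/84965328

As for a fix, a PR is welcome. Once I merge it, I will release it in a new version of Synonyms.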
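For anyone preparing a PR, the ImportError itself can be reproduced without Synonyms or PyTorch. The traceback shows `synonyms/synonyms.py` executing `from .word2vec import KeyedVectors` in response to a plain `import synonyms`, which means the file was loaded as a top-level module instead of as `synonyms.synonyms`. One way this can happen, offered here as an assumption that this thread does not confirm, is that the package directory itself sits ahead of `site-packages` on `sys.path` in the spawned child. The sketch below uses a hypothetical package `demo_pkg` whose inner module shares the package's name.

```python
import importlib
import sys
import tempfile
from pathlib import Path
from typing import Tuple


def build_demo_package(root: Path) -> Path:
    """Create root/demo_pkg/ with __init__.py, demo_pkg.py and helper.py."""
    pkg = root / "demo_pkg"
    pkg.mkdir()
    (pkg / "helper.py").write_text("VALUE = 42\n")
    # The inner module shares the package's name and uses a relative import.
    (pkg / "demo_pkg.py").write_text("from .helper import VALUE\n")
    (pkg / "__init__.py").write_text("from .demo_pkg import VALUE\n")
    return pkg


def try_import(first_path_entry: Path) -> str:
    """Import demo_pkg with first_path_entry at the front of sys.path."""
    saved_path = list(sys.path)
    # Forget earlier imports so each attempt resolves from scratch.
    stale = [m for m in sys.modules if m == "demo_pkg" or m.startswith("demo_pkg.")]
    for name in stale:
        del sys.modules[name]
    importlib.invalidate_caches()
    sys.path.insert(0, str(first_path_entry))
    try:
        module = importlib.import_module("demo_pkg")
        return f"ok: VALUE={module.VALUE}"
    except ImportError as exc:
        return f"ImportError: {exc}"
    finally:
        sys.path[:] = saved_path


def run_demo() -> Tuple[str, str]:
    with tempfile.TemporaryDirectory() as tmp:
        pkg_dir = build_demo_package(Path(tmp))
        # Parent directory first on sys.path: demo_pkg is found as a package.
        as_package = try_import(pkg_dir.parent)
        # Package directory first on sys.path: demo_pkg.py is found as a
        # top-level module, so its relative import has no parent package.
        as_top_level = try_import(pkg_dir)
    return as_package, as_top_level


if __name__ == "__main__":
    for line in run_demo():
        print(line)
```

The first import succeeds. The second fails with `attempted relative import with no known parent package`, the same message as in the report. If the assumption holds for Synonyms, comparing `sys.path` in the parent process and in a spawned worker would be a reasonable first diagnostic step.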
