
MultiGPU support #415

Open
jialuli3 opened this issue Jun 9, 2018 · 1 comment
jialuli3 (Contributor) commented Jun 9, 2018

I was trying to use the las-tedlium model from the recipes folder. I'm wondering whether xnmt has multi-GPU support, since DyNet supports it. I also found the multi-GPU command-line options in the xnmt_run_experiments.py file, but when I specified two GPUs I received the following error:

Traceback (most recent call last):
  File "/home/jialu/miniconda3/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/jialu/miniconda3/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/jialu/miniconda3/lib/python3.6/site-packages/xnmt-0.0.1-py3.6.egg/xnmt/xnmt_run_experiments.py", line 122, in <module>
    sys.exit(main())
  File "/home/jialu/miniconda3/lib/python3.6/site-packages/xnmt-0.0.1-py3.6.egg/xnmt/xnmt_run_experiments.py", line 105, in main
    eval_scores = experiment(save_fct = lambda: save_to_file(model_file, experiment))
  File "/home/jialu/miniconda3/lib/python3.6/site-packages/xnmt-0.0.1-py3.6.egg/xnmt/experiment.py", line 56, in __call__
    self.train.run_training(save_fct = save_fct)
  File "/home/jialu/miniconda3/lib/python3.6/site-packages/xnmt-0.0.1-py3.6.egg/xnmt/training_regimen.py", line 124, in run_training
    loss_builder = self.training_step(src, trg)
  File "/home/jialu/miniconda3/lib/python3.6/site-packages/xnmt-0.0.1-py3.6.egg/xnmt/training_task.py", line 257, in training_step
    standard_loss = self.model.calc_loss(src, trg, self.loss_calculator)
  File "/home/jialu/miniconda3/lib/python3.6/site-packages/xnmt-0.0.1-py3.6.egg/xnmt/translator.py", line 134, in calc_loss
    encodings = self.encoder(embeddings)
  File "/home/jialu/miniconda3/lib/python3.6/site-packages/xnmt-0.0.1-py3.6.egg/xnmt/transducer.py", line 100, in __call__
    es = module(es)
  File "/home/jialu/miniconda3/lib/python3.6/site-packages/xnmt-0.0.1-py3.6.egg/xnmt/pyramidal.py", line 98, in __call__
    fs = fb(es_list)
  File "/home/jialu/miniconda3/lib/python3.6/site-packages/xnmt-0.0.1-py3.6.egg/xnmt/lstm.py", line 185, in __call__
    batch_size = expr_seq[0][0].dim()[1]
  File "/home/jialu/miniconda3/lib/python3.6/site-packages/xnmt-0.0.1-py3.6.egg/xnmt/expression_sequence.py", line 132, in __getitem__
    return dy.inputTensor([self.lazy_data[batch][:,key] for batch in range(len(self.lazy_data))], batched=True)
  File "_dynet.pyx", line 2434, in _dynet.inputTensor
  File "_dynet.pyx", line 2228, in _dynet._tensorInputExpression.__cinit__
RuntimeError: is_valid() not implemented for CUDA yet. You can use CPU implementation with to_device operation instead.
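For context, the failing line in expression_sequence.py builds one batched tensor per time step from lazily stored per-sentence numpy arrays. A numpy-only sketch of that slicing (the variable names are hypothetical, and np.stack stands in for dy.inputTensor(..., batched=True), which is where the CUDA error is raised) shows the shapes involved:

```python
import numpy as np

# Hypothetical stand-in for ExpressionSequence.lazy_data:
# one (features, time) array per sentence in the minibatch.
lazy_data = [np.arange(12, dtype=float).reshape(3, 4) for _ in range(2)]

key = 1  # time step requested via __getitem__

# Mirrors the list comprehension from the traceback:
#   dy.inputTensor([lazy_data[b][:, key] for b in ...], batched=True)
# Each element is one sentence's feature vector at this time step.
batched = np.stack([lazy_data[b][:, key] for b in range(len(lazy_data))])

print(batched.shape)  # (batch_size, features) -> (2, 3)
```

In the real code this tensor is created fresh from host-side numpy data on every access, which is presumably why the multi-device DyNet build trips over the CPU-side is_valid() check mentioned in the error.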
neubig (Contributor) commented Jun 22, 2018

This is not well supported yet, so we'll have to think of a way to do it properly.
