ONNX model does not save on GPU #3144

Closed
lezwon opened this issue Aug 25, 2020 · 2 comments · Fixed by #3145
Labels
bug Something isn't working help wanted Open to be worked on priority: 0 High priority task

lezwon commented Aug 25, 2020

🐛 Bug

Exporting to ONNX after training a model on GPU throws an error if input_sample or example_input_array is not a CUDA tensor.

To Reproduce

Steps to reproduce the behavior:

  1. Train a model on GPU
  2. Try to export to ONNX when self.example_input_array = torch.zeros(1, 1, 500, 500) or input_sample = torch.zeros(1, 1, 500, 500)
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-32-cd8009a0b6a3> in <module>
      1 filepath = 'model.onnx'
----> 2 model.to_onnx(filepath, export_params=True)

/opt/conda/lib/python3.7/site-packages/pytorch_lightning/core/lightning.py in to_onnx(self, file_path, input_sample, **kwargs)
   1721         if 'example_outputs' not in kwargs:
   1722             self.eval()
-> 1723             kwargs['example_outputs'] = self(input_data)
   1724 
   1725         torch.onnx.export(self, input_data, file_path, **kwargs)

/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    548             result = self._slow_forward(*input, **kwargs)
    549         else:
--> 550             result = self.forward(*input, **kwargs)
    551         for hook in self._forward_hooks.values():
    552             hook_result = hook(self, input, result)

<ipython-input-24-51cae3b5e57f> in forward(self, inputs)
     20 
     21     def forward(self, inputs):
---> 22         return self.model(inputs)
     23 
     24     def training_step(self, batch, batch_idx):

/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    548             result = self._slow_forward(*input, **kwargs)
    549         else:
--> 550             result = self.forward(*input, **kwargs)
    551         for hook in self._forward_hooks.values():
    552             hook_result = hook(self, input, result)

/opt/conda/lib/python3.7/site-packages/torch/nn/modules/container.py in forward(self, input)
     98     def forward(self, input):
     99         for module in self:
--> 100             input = module(input)
    101         return input
    102 

/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    548             result = self._slow_forward(*input, **kwargs)
    549         else:
--> 550             result = self.forward(*input, **kwargs)
    551         for hook in self._forward_hooks.values():
    552             hook_result = hook(self, input, result)

/opt/conda/lib/python3.7/site-packages/torch/nn/modules/conv.py in forward(self, input)
    351 
    352     def forward(self, input):
--> 353         return self._conv_forward(input, self.weight)
    354 
    355 class Conv3d(_ConvNd):

/opt/conda/lib/python3.7/site-packages/torch/nn/modules/conv.py in _conv_forward(self, input, weight)
    348                             _pair(0), self.dilation, self.groups)
    349         return F.conv2d(input, weight, self.bias, self.stride,
--> 350                         self.padding, self.dilation, self.groups)
    351 
    352     def forward(self, input):

RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same
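
The error above says the input tensor is on the CPU while the convolution weights are on the GPU. A minimal sketch for checking this before export, assuming plain PyTorch only; `device_mismatch` is a hypothetical helper name and the small `nn.Sequential` model is a stand-in for the trained model in the report:

```python
import torch
from torch import nn


def device_mismatch(module: nn.Module, sample: torch.Tensor) -> bool:
    """Return True when `sample` is on a different device than the module's weights.

    Hypothetical helper for illustration; not part of PyTorch or Lightning.
    """
    param = next(module.parameters(), None)
    if param is None:
        # A module without parameters has no weight device to compare against.
        return False
    return param.device != sample.device


# Stand-in model; the original report uses a trained LightningModule.
model = nn.Sequential(nn.Conv2d(1, 4, kernel_size=3))
if torch.cuda.is_available():
    model = model.cuda()  # mimics the state of the model after GPU training

# Same shape as the sample in the report, created on the CPU by default.
sample = torch.zeros(1, 1, 500, 500)

# Mismatch is expected exactly when the model was moved to a GPU.
print("device mismatch:", device_mismatch(model, sample))
```

On a machine without a GPU the model stays on the CPU, so no mismatch is reported and the forward pass would succeed.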

Code sample

filepath = 'model.onnx'
model.to_onnx(filepath, export_params=True)

Expected behavior

to_onnx should automatically move example_input_array or input_sample to the model's device and then save the model to ONNX.
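
A minimal sketch of the expected behavior, assuming plain PyTorch. `to_module_device` is a hypothetical helper, not Lightning's API, and it is not necessarily how the fix in #3145 is implemented. It moves the sample to whatever device the model's parameters are on before the forward pass that precedes export:

```python
import torch
from torch import nn


def to_module_device(module: nn.Module, sample: torch.Tensor) -> torch.Tensor:
    """Move `sample` to the device of the module's first parameter.

    Hypothetical helper for illustration; assumes all parameters share one device.
    """
    param = next(module.parameters(), None)
    if param is None:
        return sample  # nothing to align with
    return sample.to(param.device)


# Stand-in model; the original report uses a trained LightningModule.
model = nn.Sequential(nn.Conv2d(1, 4, kernel_size=3))
if torch.cuda.is_available():
    model = model.cuda()

sample = torch.zeros(1, 1, 500, 500)  # created on the CPU, as in the report
aligned = to_module_device(model, sample)

model.eval()
with torch.no_grad():
    # This is the forward pass that failed in the traceback; with the
    # sample on the model's device it runs on CPU and GPU alike.
    output = model(aligned)

print(aligned.device, tuple(output.shape))
```

Until the library does this itself, a likely workaround is to pass an already-moved tensor as `input_sample` when calling `to_onnx`.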

@lezwon lezwon added bug Something isn't working help wanted Open to be worked on labels Aug 25, 2020

Borda commented Aug 25, 2020

I suspect the problem is related to distributed training. Could you check whether it also happens when running on a single GPU?

@Borda Borda added the priority: 0 High priority task label Aug 25, 2020
lezwon commented Aug 25, 2020

I ran this in a Kaggle notebook. The error was thrown when I tried to save the model after training.

@mergify mergify bot closed this as completed in #3145 Aug 26, 2020