
run_ssd_live_demo.py: "RuntimeError: expected device cpu but got device cuda:0" #89

Open · Jaftem opened this issue Dec 6, 2019 · 7 comments


Jaftem commented Dec 6, 2019

Hi,

I trained a mb2-ssd-lite model with a subset (just 1 class) of Open Images on just 20 epochs. I'm now attempting to run the live demo with this model:

$ python run_ssd_live_demo.py mb2-ssd-lite models/mb2-ssd-lite-Epoch-19-Loss-3.6359732536622036.pth models/open-images-model-labels.txt 

And I get the following runtime error:

Traceback (most recent call last):
  File "run_ssd_live_demo.py", line 65, in <module>
    boxes, labels, probs = predictor.predict(image, 10, 0.4)
  File "/ml/playground/pytorch-ssd/vision/ssd/predictor.py", line 37, in predict
    scores, boxes = self.net.forward(images)
  File "/ml/playground/pytorch-ssd/vision/ssd/ssd.py", line 93, in forward
    locations, self.priors, self.config.center_variance, self.config.size_variance
  File "/ml/playground/pytorch-ssd/vision/utils/box_utils.py", line 104, in convert_locations_to_boxes
    locations[..., :2] * center_variance * priors[..., 2:] + priors[..., :2],
RuntimeError: expected device cpu but got device cuda:0

I can run the live demo on the pretrained model as per the README's instructions without error. Any ideas?


Jaftem commented Dec 6, 2019

So it looks like the issue is the locations tensor being a CPU tensor and priors being a CUDA tensor. On line 93 of vision/ssd/ssd.py I made the following change:

    locations.to(self.device), self.priors, self.config.center_variance, self.config.size_variance

This gets the live demo working. But because mb1-ssd works fine, I believe the issue originates at some point earlier, and the above fix is more of a workaround. I haven't reviewed the entire code base closely enough to know whether there is a better fix.
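
For context, here is a simplified sketch of the surrounding branch of SSD.forward in vision/ssd/ssd.py with the workaround applied (names taken from the traceback; the actual method has more context than shown here):

    # Sketch of the is_test branch of SSD.forward (simplified). The
    # workaround moves `locations` onto the model's device before it is
    # combined with `self.priors`, which already lives on cuda:0.
    if self.is_test:
        confidences = F.softmax(confidences, dim=2)
        boxes = box_utils.convert_locations_to_boxes(
            locations.to(self.device),  # workaround: align devices
            self.priors,
            self.config.center_variance,
            self.config.size_variance,
        )
        boxes = box_utils.center_form_to_corner_form(boxes)
        return confidences, boxes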

@hsahovic

I ran into the same issue as @Jaftem. His workaround solved my problem.
It seems that loading models trained on GPUs does not work with the current code base.

@vladserkoff

Since the demo runs inference on the CPU, you either want to pass map_location='cpu' where the checkpoint is loaded here:

net.load(model_path)

or explicitly move the model to the GPU somewhere in run_ssd_live_demo.py.
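
A minimal sketch of the first option, assuming net.load wraps torch.load plus load_state_dict (the actual method in the repo may differ):

    import torch

    # Force every tensor in the checkpoint onto the CPU at load time,
    # regardless of the device it was saved from.
    state_dict = torch.load(model_path, map_location='cpu')
    net.load_state_dict(state_dict)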

@TheCamilovisk

@Jaftem try changing line 50 of run_ssd_live_demo.py from this:

predictor = create_mobilenetv2_ssd_lite_predictor(net, candidate_size=200)

to this:

predictor = create_mobilenetv2_ssd_lite_predictor(net, candidate_size=200, device=torch.device('cuda'))

Initializing the Predictor class this way solves the issue without touching the SSD class.
I don't know why the CPU is the default device for the mb2-ssd-lite Predictor, but this line of the README may be a clue:

You may notice MobileNetV2 SSD/SSD-Lite is slower than MobileNetV1 SSD/Lite on PC. However, MobileNetV2 is faster on mobile devices.

So, this model may be intended to be used on mobile devices.
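
If the demo should run on both CPU-only and GPU machines, a small variant of the above is to pick the device at runtime (just a sketch; the device variable name is illustrative):

    import torch

    # Use CUDA when it is available, otherwise fall back to the CPU.
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    predictor = create_mobilenetv2_ssd_lite_predictor(net, candidate_size=200, device=device)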

@MADHAVAN001

As @Jaftem pointed out, this error is reproducible when running on a node with GPUs. PyTorch seems to be loading the model onto the GPU by default.

Running this on a CPU-only node works.

@YaYaB

YaYaB commented Apr 10, 2020

Yep, exactly. The issue comes from
https://github.com/qfgaohao/pytorch-ssd/blob/master/vision/ssd/ssd.py#L35
The SSD model is loaded onto the CUDA device if one is available: SSD's constructor has a device parameter that is None by default. When the mobilenet_v2_ssd_lite network is built, SSD's constructor is called without any device, so it falls back to CUDA if available:
https://github.com/qfgaohao/pytorch-ssd/blob/master/vision/ssd/mobilenet_v2_ssd_lite.py#L58

On my side I just forced the device to be CPU in ssd.py, because that is what I need. However, a better solution would be to add a device parameter to create_mobilenetv2_ssd_lite so that callers can specify a device, as sketched below.
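
A minimal sketch of the proposed usage, assuming a device keyword argument is added to the factory and threaded through to SSD's constructor (the device kwarg is the suggested change, not existing API):

    import torch
    from vision.ssd.mobilenet_v2_ssd_lite import create_mobilenetv2_ssd_lite

    # Proposed usage: choose the device explicitly instead of relying on
    # torch.cuda.is_available() inside SSD's constructor. The `device`
    # keyword argument is the suggested addition.
    device = torch.device('cpu')
    net = create_mobilenetv2_ssd_lite(num_classes=2, is_test=True, device=device)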

@AiueoABC

I got a similar error when I ran convert_to_caffe2_models.py to convert the mobilenet_v2_ssd_lite model to ONNX.
In this case, @Jaftem's solution helped me get the ONNX export, but I'm not sure this is okay.
