Fix module to device in AutoUnit (#398)
Summary:
Pull Request resolved: #398

In D46001765, the `self.device` reference was accidentally changed to `device`. As a result, the module is not moved to the correct device, and we see errors like:
```
ValueError: DistributedDataParallel device_ids and output_device arguments only work with single-device/multiple-device GPU modules or CPU modules, but got device_ids [1], output_device None, and module parameters {device(type='cpu')}
```
when running with DDP.
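
A minimal sketch of how this kind of mix-up can leave a module on the CPU, using plain-Python stand-ins rather than the real torchtnt or torch classes. It assumes the constructor's `device` argument is optional and that `self.device` holds the resolved value; `FakeModule`, `resolve_default_device`, `Unit`, and the `use_self_device` flag are all hypothetical names introduced for illustration only.

```python
from typing import Optional


class FakeModule:
    """Hypothetical stand-in for torch.nn.Module that only tracks its device."""

    def __init__(self) -> None:
        self.device = "cpu"

    def to(self, device: Optional[str]) -> "FakeModule":
        # Assumption: when no device is given, the module stays where it is.
        if device is not None:
            self.device = device
        return self


def resolve_default_device() -> str:
    # Hypothetical helper standing in for whatever picks the per-rank device.
    return "cuda:1"


class Unit:
    """Hypothetical, heavily simplified stand-in for AutoUnit.__init__."""

    def __init__(
        self,
        module: FakeModule,
        device: Optional[str] = None,
        use_self_device: bool = True,
    ) -> None:
        # The attribute is always resolved; the raw argument may still be None.
        self.device = device or resolve_default_device()
        target = self.device if use_self_device else device
        self.module = module.to(target)


buggy = Unit(FakeModule(), use_self_device=False)  # mirrors module.to(device)
fixed = Unit(FakeModule(), use_self_device=True)   # mirrors module.to(self.device)
print(buggy.module.device)  # cpu
print(fixed.module.device)  # cuda:1
```

In the buggy path the parameters stay on `cpu` while the wrapper is later told to use a GPU index, which matches the shape of the `ValueError` quoted above.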

Reviewed By: bobakfb

Differential Revision: D46056924

fbshipit-source-id: d980909fa745161a800c72d91d849ee04b27aea7
daniellepintz authored and facebook-github-bot committed May 21, 2023
1 parent 60331d0 commit ab79159
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions torchtnt/framework/auto_unit.py
@@ -259,7 +259,7 @@ def __init__(
# remove ddp comm hook variables from params dict
del params_dict["comm_state"]
del params_dict["comm_hook"]
-module = module.to(device)
+module = module.to(self.device)
module = DDP(module, device_ids=device_ids, **params_dict)
if torchdynamo_params:
# TODO: Add support for dynamo and DDP
@@ -295,7 +295,7 @@ def __init__(
**asdict(strategy),
)
else:
-module = module.to(device)
+module = module.to(self.device)

self.module: torch.nn.Module = module
