Restore the previous autograd setting #49

Open
wants to merge 1 commit into main
Conversation

martenlienen

Indiscriminately enabling autograd at the end of the loop also enables it when the user
had explicitly disabled it before. This is a common occurrence when the loss is computed
over a validation set.

In my application this stopped training with an out-of-memory error: running the validation pass with autograd enabled quickly exhausts memory, while the training loop is fine (sequential data, with validation on full sequences but training on subsequences).
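
For context, here is a minimal sketch of the pattern at issue (function names and structure are illustrative, not verbatim geomloss code): the loop disables autograd for its inner iterations and then re-enables it unconditionally, whereas the fix restores whatever setting the caller had.

```python
import torch

# Buggy pattern (illustrative): autograd is turned off for the cheap inner
# iterations and then unconditionally turned back on, clobbering whatever
# mode the caller had set (e.g. a surrounding `with torch.no_grad():`).
def run_loop_buggy(step, n_iters):
    torch.autograd.set_grad_enabled(False)
    for _ in range(n_iters):
        step()
    torch.autograd.set_grad_enabled(True)  # wrong if the caller disabled grad

# Pattern this PR proposes: remember the caller's setting and restore it.
def run_loop_fixed(step, n_iters):
    previous = torch.is_grad_enabled()
    torch.autograd.set_grad_enabled(False)
    for _ in range(n_iters):
        step()
    torch.autograd.set_grad_enabled(previous)  # restore the user's setting
```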

@caotians1

I just hit this bug: it caused validation to hang when used with DistributedDataParallel. DistributedDataParallel checks torch.is_grad_enabled() to decide whether to perform synchronization before the forward pass. Geomloss turns grad_enabled back on after the first iteration, causing the processes to hang waiting for synchronization.
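
Until the fix is merged, one possible user-side workaround (a sketch; the helper name is mine, not part of geomloss or PyTorch) is to snapshot the global grad mode around the loss call and restore it afterwards, so that the check of torch.is_grad_enabled() on the next DDP forward still reflects what the user set:

```python
import torch

def call_with_grad_mode_preserved(fn, *args, **kwargs):
    # Snapshot the global autograd mode, call the function, and restore the
    # mode afterwards, even if the callee flipped it or raised an exception.
    previous = torch.is_grad_enabled()
    try:
        return fn(*args, **kwargs)
    finally:
        torch.set_grad_enabled(previous)
```

For example, `val_loss = call_with_grad_mode_preserved(loss_fn, x, y)` keeps grad disabled for the rest of a `torch.no_grad()` validation loop.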

jeanfeydy added a commit that referenced this pull request Jun 18, 2022