
Linear scheduler in multi-gpu training #12

Closed
awaisjafar opened this issue Sep 4, 2020 · 4 comments

Comments

@awaisjafar

The LinearScheduler for linearly increasing distortion (similar to DropBlock) will not work in multi-GPU training, since it uses a plain Python variable i (not a tensor). So when we do the following:

def step(self):
    # Set the current distortion probability from the schedule, then advance the counter.
    if self.i < len(self.drop_values):
        self.disout.dist_prob = self.drop_values[self.i]
    self.i += 1

the value of i never gets updated. You can try it if you want. My question is: how did you run this code to train ImageNet and get those results?

@yehuitang
Collaborator

The code works well on our device. Could you provide more detailed information?

@awaisjafar
Author

Unless you're running it on a single GPU, the value of i will never be updated; it will stay at 0 (or oscillate between 0 and 1) because it is not a tensor. It's a known issue, and you can read more about it here:

kjunelee/MetaOptNet#41
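
To make the concern concrete, here is a minimal sketch (a hypothetical Counter module, not the disout code) of the DataParallel behavior I mean: attribute updates made inside forward() happen on per-GPU replicas and are discarded after the forward pass.

import torch
import torch.nn as nn

class Counter(nn.Module):
    """Hypothetical module that increments a plain Python attribute in forward()."""
    def __init__(self):
        super().__init__()
        self.i = 0  # plain int, not a registered buffer or parameter

    def forward(self, x):
        self.i += 1  # under DataParallel this runs on a per-GPU replica
        return x

model = nn.DataParallel(Counter()).cuda()
x = torch.randn(8, 4, device="cuda")
for _ in range(5):
    model(x)

# Prints 0 when more than one GPU is used: DataParallel replicates the module
# on every forward pass, so the increments happened on throwaway copies.
print(model.module.i)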

@yehuitang
Collaborator

In the disout code, the update of i is not done in forward(). It is updated in the train function of train_imagenet.py, so the problem described in the linked issue does not occur.
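
For illustration, a minimal sketch of that pattern (function and variable names here are assumptions, not the actual train_imagenet.py code): step() is called on the scheduler object itself from the training loop, outside forward(), so i is updated on the original object rather than on a DataParallel replica.

def train(model, scheduler, loader, criterion, optimizer, device):
    model.train()
    for images, targets in loader:
        # Stepping the scheduler here touches the real Python object,
        # so the update to i and disout.dist_prob persists across iterations.
        scheduler.step()
        images, targets = images.to(device), targets.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), targets)
        loss.backward()
        optimizer.step()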

@awaisjafar
Author

Got it. Thanks.
