
Linear scheduler in multi-gpu training #12

Closed
awaisjafar opened this issue Sep 4, 2020 · 4 comments

Comments

@awaisjafar

The LinearScheduler for linearly increasing distortion (similar to DropBlock) will not work in multi-GPU training, since it uses a plain Python variable i (not a tensor). So when we do the following:

def step(self):
    # Set the current distortion probability from the schedule, then advance the counter.
    if self.i < len(self.drop_values):
        self.disout.dist_prob = self.drop_values[self.i]
    self.i += 1

the value of i never gets updated. You can try it if you want. My question is: how did you run this code to train ImageNet and get those results?

@yehuitang
Collaborator

The code works well on our device. Could you provide more detailed information?

@awaisjafar
Author

Unless you're running it on a single GPU, the value of i will never be updated; it will stay at 0 (or oscillate between 0 and 1) because it is not a tensor. It's a known issue, and you can read more about it here:

kjunelee/MetaOptNet#41
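
To make the concern concrete, here is a minimal sketch (a hypothetical Counter module, not the disout code) of the DataParallel behavior I mean: attribute updates made inside forward() happen on per-GPU replicas and are discarded after the forward pass.

import torch
import torch.nn as nn

class Counter(nn.Module):
    """Hypothetical module that increments a plain Python attribute in forward()."""
    def __init__(self):
        super().__init__()
        self.i = 0  # plain int, not a registered buffer or parameter

    def forward(self, x):
        self.i += 1  # under DataParallel this runs on a per-GPU replica
        return x

model = nn.DataParallel(Counter()).cuda()
x = torch.randn(8, 4, device="cuda")
for _ in range(5):
    model(x)

# Prints 0 when more than one GPU is used: DataParallel replicates the module
# on every forward pass, so the increments happened on throwaway copies.
print(model.module.i)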

@yehuitang
Collaborator

In the disout code, the update of i is not done in forward(). It is updated in the train function of train_imagenet.py, so the problem described in the linked issue does not occur.
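
For illustration, a minimal sketch of that pattern (function and variable names here are assumptions, not the actual train_imagenet.py code): step() is called on the scheduler object itself from the training loop, outside forward(), so i is updated on the original object rather than on a DataParallel replica.

def train(model, scheduler, loader, criterion, optimizer, device):
    model.train()
    for images, targets in loader:
        # Stepping the scheduler here touches the real Python object,
        # so the update to i and disout.dist_prob persists across iterations.
        scheduler.step()
        images, targets = images.to(device), targets.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), targets)
        loss.backward()
        optimizer.step()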

@awaisjafar
Author

Got it. Thanks.
