
Training code in roma_indoor.py seems to use non-distributed sampler with DDP #32

Closed
justachetan opened this issue Apr 13, 2024 · 4 comments

Comments

@justachetan

Hi,

I was going through the training code in experiments/roma_indoor.py and it seems that you have used a non-distributed sampler (WeightedRandomSampler) instead of DistributedSampler. I believe this means the entire dataset is replicated and passed to each model replica instead of being sharded across devices. I just wanted to confirm this and ask whether it is intentional?

Thanks!

@Parskatt
Owner

Yes, all data goes to all replicas. It's not really intentional; a distributed sampler should be used, but I think the effect would be minor. I guess if you're sending the entire ScanNet to each GPU it will be annoyingly heavy. Btw, I just used every 10th frame of ScanNet IIRC (it's been a while since I used the code).
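
For reference, a minimal sketch of the standard rank-aware pattern with DDP (placeholder dataset, not the actual RoMa training code). Note that a plain DistributedSampler only illustrates the sharding; it does not reproduce the weighting that WeightedRandomSampler provides:

```python
# Minimal sketch (not the actual RoMa code): the usual DDP pattern with a
# rank-aware sampler, so each replica only sees its own shard of the data.
import torch
import torch.distributed as dist
from torch.utils.data import DataLoader, DistributedSampler, TensorDataset

dist.init_process_group("nccl")
rank = dist.get_rank()

dataset = TensorDataset(torch.arange(1000))  # placeholder dataset

sampler = DistributedSampler(dataset, shuffle=True)  # shards indices by rank/world_size
loader = DataLoader(dataset, batch_size=8, sampler=sampler)

for epoch in range(10):
    sampler.set_epoch(epoch)  # reshuffle shard assignment each epoch
    for batch in loader:
        ...  # forward/backward on this rank's shard only
```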

@justachetan
Author

Thanks! I am not sure what happens when a non-distributed sampler is used with distributed training. I am assuming each replica's dataloader gets a different seed, so the order of samples differs across devices?

@Parskatt
Owner

I assume so. I never set any seeds, and I haven't observed issues related to repeated samples.
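
For anyone curious, a minimal sketch (hypothetical check, not from the repo) of how one could inspect this per rank. With no explicit seed, each process initializes its own default RNG state, so the sampling order should differ across ranks, but every rank still draws from the full dataset, so overlap between ranks is possible:

```python
# Hypothetical check: print the first few indices drawn by
# WeightedRandomSampler on each rank to see whether the orders differ.
import torch
import torch.distributed as dist
from torch.utils.data import WeightedRandomSampler

dist.init_process_group("nccl")
rank = dist.get_rank()

num_samples = 1000
weights = torch.ones(num_samples)  # placeholder weights
sampler = WeightedRandomSampler(weights, num_samples, replacement=True)

first_indices = list(sampler)[:5]
print(f"rank {rank}: first indices {first_indices}")
```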

@justachetan
Author

Thank you!
