
Training code in roma_indoor.py seems to use non-distributed sampler with DDP #32

Closed
justachetan opened this issue Apr 13, 2024 · 4 comments

Comments

@justachetan

Hi,

I was going through the training code in experiments/roma_indoor.py and it seems that you have used a non-distributed sampler (WeightedRandomSampler) instead of DistributedSampler. I believe this means the entire dataset is replicated and passed to each model replica instead of being sharded across devices. I just wanted to confirm this and ask whether it is intentional?

Thanks!

@Parskatt
Owner

Yes, all data goes to all replicas. It's not really intentional; a distributed sampler should be used, but I think the effect would be minor. I guess if you're sending the entire ScanNet to each GPU it will be annoyingly heavy. Btw, I just used every 10th frame of ScanNet IIRC (it's been a while since I used the code).
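
For reference, a minimal sketch of the standard rank-aware pattern with DDP (placeholder dataset, not the actual RoMa training code). Note that a plain DistributedSampler only illustrates the sharding; it does not reproduce the weighting that WeightedRandomSampler provides:

```python
# Minimal sketch (not the actual RoMa code): the usual DDP pattern with a
# rank-aware sampler, so each replica only sees its own shard of the data.
import torch
import torch.distributed as dist
from torch.utils.data import DataLoader, DistributedSampler, TensorDataset

dist.init_process_group("nccl")
rank = dist.get_rank()

dataset = TensorDataset(torch.arange(1000))  # placeholder dataset

sampler = DistributedSampler(dataset, shuffle=True)  # shards indices by rank/world_size
loader = DataLoader(dataset, batch_size=8, sampler=sampler)

for epoch in range(10):
    sampler.set_epoch(epoch)  # reshuffle shard assignment each epoch
    for batch in loader:
        ...  # forward/backward on this rank's shard only
```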

@justachetan
Author

Thanks! I am not sure what happens when a non-distributed sampler is used with distributed training. I am assuming each replica's dataloader gets a different seed, so the order of samples differs across devices?

@Parskatt
Owner

I assume so. I never set any seeds, and I haven't observed issues related to repeated samples.
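
For anyone curious, a minimal sketch (hypothetical check, not from the repo) of how one could inspect this per rank. With no explicit seed, each process initializes its own default RNG state, so the sampling order should differ across ranks, but every rank still draws from the full dataset, so overlap between ranks is possible:

```python
# Hypothetical check: print the first few indices drawn by
# WeightedRandomSampler on each rank to see whether the orders differ.
import torch
import torch.distributed as dist
from torch.utils.data import WeightedRandomSampler

dist.init_process_group("nccl")
rank = dist.get_rank()

num_samples = 1000
weights = torch.ones(num_samples)  # placeholder weights
sampler = WeightedRandomSampler(weights, num_samples, replacement=True)

first_indices = list(sampler)[:5]
print(f"rank {rank}: first indices {first_indices}")
```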

@justachetan
Author

Thank you!
