
Modify DataLoader #1398

Closed · wants to merge 2 commits

Conversation

@myunghakLee commented Jul 20, 2020

The DataLoader in the original code required the number of samples in the dataset to be evenly divisible by the batch size, so an error often occurred, like this:

TypeError: forward() missing 1 required positional argument: 'x'

See #1355 and #1074.

Adding drop_last=True to the DataLoader fixes the problem.

I tested this change in a multi-GPU environment.
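For context, here is a minimal, self-contained sketch of what drop_last does in a PyTorch DataLoader. The toy dataset and sizes below are made up for illustration; this is not the repository's actual training loader:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy dataset whose length (10) is not divisible by the batch size (4).
dataset = TensorDataset(torch.randn(10, 3), torch.randn(10))

# Default behavior: the final batch is smaller (batches of 4, 4, 2).
loader = DataLoader(dataset, batch_size=4, drop_last=False)
print([len(x) for x, _ in loader])  # [4, 4, 2]

# With drop_last=True the trailing partial batch is discarded (batches of 4, 4),
# so every batch the model sees has a consistent size.
loader = DataLoader(dataset, batch_size=4, drop_last=True)
print([len(x) for x, _ in loader])  # [4, 4]
```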

🛠️ PR Summary

Made with ❤️ by Ultralytics Actions

🌟 Summary

Enhancements to DataLoader behavior in YOLOv3 training script.

📊 Key Changes

  • The DataLoader instances for training and testing now set drop_last=True.

🎯 Purpose & Impact

  • 🎯 Purpose: The modification ensures that any partial batches at the end of an epoch are discarded. This can be particularly useful when the neural network expects consistent batch sizes to maintain effective learning, as some layers or configurations might be sensitive to varying batch sizes.
  • 🔍 Impact: Users may experience more stable training, particularly when dealing with small datasets or datasets that do not divide evenly by the batch size. This change may slightly reduce the amount of data being used for training, as the last, potentially smaller batch from each epoch will be skipped.
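To put the data-loss impact in concrete terms, here is a quick back-of-the-envelope check using the COCO 2017 figures cited in the review comment below; the helper function itself is hypothetical:

```python
def dropped_per_epoch(num_samples: int, batch_size: int) -> int:
    """Number of samples discarded each epoch when drop_last=True."""
    return num_samples % batch_size

# COCO 2017 train (118287 images) at batch size 64: 15 images skipped per epoch.
print(dropped_per_epoch(118287, 64))  # 15

# COCO 2017 val (5000 images) at batch size 96: 8 images never evaluated.
print(dropped_per_epoch(5000, 96))    # 8
```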

Previously, the number of samples in the dataset had to be adjusted to be divisible by the batch size.
With drop_last=True in the DataLoader, that condition is no longer necessary.
@glenn-jocher (Member) commented
@myunghakLee thanks for the feedback. You should be able to train and test datasets of any size with any batch size without problems. On the train side, COCO 2017 train is 118287 images, for example, and we've trained it with no problems at batch sizes 8, 16, 64, 96, 30, etc. The test set there is 5000 images, and testing runs at the same batch size as training, so there are combinations that are clearly not evenly divisible, yet they still train and test correctly.

We also cannot accept this PR for the simple reason that we want to test all images in the validation set; it would be an extremely poor design decision to drop images from validation.
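A compromise that would address this concern is to enable drop_last only on the training loader and leave validation untouched, so every validation image is still evaluated. A sketch under that assumption (the placeholder datasets stand in for the real train/val image datasets, and are not the repository's actual loaders):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder datasets standing in for the real train/val image datasets.
train_dataset = TensorDataset(torch.randn(100, 3), torch.randn(100))
val_dataset = TensorDataset(torch.randn(50, 3), torch.randn(50))

# Drop the partial batch only during training, where batch-size consistency matters.
train_loader = DataLoader(train_dataset, batch_size=16, shuffle=True, drop_last=True)

# Validation keeps every sample: the final, smaller batch is still evaluated.
val_loader = DataLoader(val_dataset, batch_size=16, shuffle=False, drop_last=False)
```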
