Set ImageNet data augmentation by default #13757
Conversation
https://github.com/apache/incubator-mxnet/blob/a38278ddebfcc9459d64237086cd7977ec20c70e/example/image-classification/train_imagenet.py#L42 When I train ImageNet with this line commented out, the train accuracy reaches 99% while the validation accuracy stays below 50% (single machine, 8 GPUs, global batch size 2048, ResNet-50). This is clearly overfitting. I then uncommented the line and ran the same experiment again. This time both train and validation accuracy converge to about 70%. So this data augmentation is quite important for ImageNet training. It would be better to uncomment it by default, so that future developers aren't confused by the overfitting issue.
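For context, the commented-out line calls a helper that overrides the argument parser's augmentation defaults with standard ImageNet settings. A minimal sketch of that pattern, with illustrative values and stand-in arguments (the actual helper and defaults in train_imagenet.py may differ):

```python
import argparse

def set_imagenet_aug(parser):
    # Override augmentation defaults with typical ImageNet settings.
    # These values are illustrative, not the exact ones in the script.
    parser.set_defaults(
        rgb_mean='123.68,116.779,103.939',  # per-channel mean subtraction
        random_mirror=1,                    # random horizontal flips
        min_random_area=0.08,               # random-resized-crop area lower bound
        max_random_aspect_ratio=4.0 / 3.0,  # aspect-ratio jitter range
        min_random_aspect_ratio=3.0 / 4.0,
    )

parser = argparse.ArgumentParser()
# Stand-ins for data.add_data_aug_args(parser); defaults mean "no augmentation".
parser.add_argument('--rgb-mean', dest='rgb_mean', type=str, default='0,0,0')
parser.add_argument('--random-mirror', dest='random_mirror', type=int, default=0)
parser.add_argument('--min-random-area', dest='min_random_area', type=float, default=1.0)
parser.add_argument('--max-random-aspect-ratio', dest='max_random_aspect_ratio', type=float, default=0.0)
parser.add_argument('--min-random-aspect-ratio', dest='min_random_aspect_ratio', type=float, default=0.0)

set_imagenet_aug(parser)   # the call the PR proposes to enable by default
args = parser.parse_args([])
```

Because `set_defaults` only changes defaults, any augmentation flag the user passes explicitly still wins over the preset.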
@sandeep-krishnamurthy @eric-haibin-lin Can you take a look? @mxnet-label-bot Add [pr-awaiting-review]
I'm unsure why the ImageNet arguments are not the default for an ImageNet training script, and I would be curious to know why not. There are two better approaches, depending on what is determined. If the ImageNet arguments are to be the default, they should be merged into the stanza:
If they are not the default, it would be cleaner to add an argument such as --override-with-image-net-augmentations (or something more appropriately named) that overrides the parameters with those in the method. Vishaal
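The flag-based alternative could be sketched like this (the flag name and preset values are hypothetical; the PR ultimately chose its own spelling):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--random-mirror', dest='random_mirror', type=int, default=0)
# Hypothetical flag name for opting in to the ImageNet augmentation preset.
parser.add_argument('--use-imagenet-aug', action='store_true',
                    help='override augmentation settings with standard ImageNet values')

def set_imagenet_aug(p):
    # Illustrative preset; the real one would also cover crops, jitter, etc.
    p.set_defaults(random_mirror=1)

argv = ['--use-imagenet-aug']  # simulated command line for this sketch

# Peek at the flag first, apply the preset as new *defaults*, then parse
# for real so explicitly passed arguments still take precedence.
peek, _ = parser.parse_known_args(argv)
if peek.use_imagenet_aug:
    set_imagenet_aug(parser)
args = parser.parse_args(argv)
```

Applying the preset through `set_defaults` (rather than mutating the parsed namespace) keeps the usual argparse precedence: explicit flags beat the preset, which beats the built-in defaults.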
@rahul003 Could you take a look at this? Any idea why it was commented out? Thanks a lot!
@ymjiang - Thanks for your contributions. Did you get a chance to look at @vishaalkapoor's comment?
@mxnet-label-bot update [pr-awaiting-response]
Hi @sandeep-krishnamurthy, I agree with @vishaalkapoor that the augmentation argument should be set as the default. But I see the parameter is already provided in
@vishaalkapoor Could you suggest a way forward on this PR?
@@ -39,7 +39,7 @@ def set_imagenet_aug(aug):
data.add_data_args(parser)
data.add_data_aug_args(parser)
# uncomment to set standard augmentations for imagenet training
This comment should change accordingly.
@ymjiang Can you please add a command-line argument to either override or keep
@ymjiang Could you please address the review comments made by @anirudhacharya? There have been no updates in the last 2 weeks. Thanks!
@karan6181 @anirudhacharya Sorry for the delay. I committed two new changes to enable data augmentation via a command-line argument. Please review and see if they are appropriate.
@anirudhacharya Ping for review.
@vishaalkapoor Can you take a look at this PR again?
@ymjiang Hi, could you rebase to latest master? It should resolve the failing CI test.
lgtm
@ymjiang Gentle ping...
Rebased to master now: https://github.com/apache/incubator-mxnet/pull/15189. Will close this issue.
* Update .gitmodules
* Set ImageNet data augmentation by default
* Add argument for imagenet data augmentation
* Enable data-aug with argument
* Update .gitmodules
https://github.com/apache/incubator-mxnet/blob/a38278ddebfcc9459d64237086cd7977ec20c70e/example/image-classification/train_imagenet.py#L42
When I train ImageNet with this line commented out, the train accuracy reaches 99% while the validation accuracy stays below 50% (single machine, 8 GPUs, global batch size 2048, ResNet-50, fp32). This is clearly overfitting.
I then uncommented this line and ran the same experiment again. This time both train and validation accuracy converge to about 66%, which looks like a normal result.
Thus, this data augmentation appears to be quite important for ImageNet training. It would be better to uncomment it by default, so that future developers aren't confused by the overfitting issue.