Should I increase epochs when I use augmentation?

Pretty basic question, it's in the title. I'm practicing on the MNIST dataset on Kaggle.

I've read that augmentation is a way to artificially add "new" data to your dataset. However, that doesn't seem quite true to me, since the augmentations happen randomly and on the fly. This means that every epoch, my network might be seeing an (almost) totally new dataset. I feel like this would lead to underfitting.

That made me wonder about increasing the number of epochs to overcome the underfitting, but surprisingly I can't find a topic on it. Here is what I found when I tried:


  • At the same number of epochs (50), I lose 20% accuracy on the test set (Kaggle LB) with augmentation.

  • At 60 epochs, I jump back up to 97% accuracy, which is still 1.5% less accurate than without augmentation.

  • At 70 epochs, my accuracy drops to 71%, which is about 28% worse.

The augmentations I am using are:

transforms.RandomRotation(10, fill=(0,)),

I have also tried with and without RandomHorizontalFlip, but it does not seem to change much either way.

Do you test your model on augmented data?

‘Test’ data is unlabelled Kaggle competition data, so I’m not augmenting it… BUT my ‘validation’ set is augmented. I was just trying to figure out if that was bad in a different thread!

Based on the answer in that thread, I removed augmentation from the “validation set” and this is what happened:


This doesn’t look like overfitting… it looks like the augmentations are straight up BAD for the model.

What's the probability of the augmentations?
I.e. do you use torchvision.transforms.RandomChoice(transforms) or torchvision.transforms.RandomApply(transforms, p=0.5) to randomize the augmentations?