Is there a way to insert data augmentation into the training data based on the validation data?

AndreTeixeira · June 18, 2020, 12:31am

I’m training a CNN for plant disease recognition, and the training set consists of laboratory images. I am getting low precision on my validation and test sets because they consist of images taken in the field, which have an extremely varied dynamic range. Is there a way to introduce data augmentation on my training set based on the validation set in order to get better accuracy?

Nikronic · June 18, 2020, 12:40am

Hi,

I am not sure I have understood your question properly, but I think the problem is this: the train set has normal images, while the val set has, for instance, rotated images or other variations that differ from the train set, and you want to introduce augmentations such as rotation to make the training set more similar to val.

If that is the case, I have to mention a few points:

Introducing any statistical changes to the training set based on the validation set is a kind of cheating, because in the real world we test our algorithm against samples it has never seen. Here you have already seen the val set, run your model on it, and learned that the model does not work well. So I don't think changing the train set to resemble the val set is really acceptable.
Why not shuffle the lab and field images together? Create a new train set that also includes some of the field samples currently in the val set.
I am not sure about this, but how about adding all types of augmentation so that the network can learn all the possible transformations?
I think there was a paper called AutoAugment that tried to find the best augmentation policy based on the data.
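
To make the second suggestion concrete, here is a minimal sketch using only the Python standard library. It splits the lab and field images separately and then merges them, so every split contains both domains. The function name `mixed_domain_split`, the file names, and the 15% fractions are all made up for illustration and are not from the original setup.

```python
import random

def mixed_domain_split(lab_paths, field_paths, val_frac=0.15, test_frac=0.15, seed=0):
    """Split each domain on its own so every split holds lab AND field images."""
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    splits = {"train": [], "val": [], "test": []}
    for domain, paths in (("lab", lab_paths), ("field", field_paths)):
        shuffled = list(paths)
        rng.shuffle(shuffled)
        n_val = round(len(shuffled) * val_frac)
        n_test = round(len(shuffled) * test_frac)
        # Slices are disjoint, so no image ends up in two splits.
        splits["val"] += [(p, domain) for p in shuffled[:n_val]]
        splits["test"] += [(p, domain) for p in shuffled[n_val:n_val + n_test]]
        splits["train"] += [(p, domain) for p in shuffled[n_val + n_test:]]
    for part in splits.values():
        rng.shuffle(part)  # interleave lab and field samples within each split
    return splits

# Hypothetical file lists standing in for the real datasets.
lab = [f"lab_{i:03d}.jpg" for i in range(100)]
field = [f"field_{i:03d}.jpg" for i in range(40)]
splits = mixed_domain_split(lab, field)
for name, items in splits.items():
    print(name, len(items))
```

The val and test splits are fixed before any training happens, so the field images in them are still unseen by the model.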

Bests