Questions about data normalization

Manoel · July 16, 2020, 4:32pm

I’m training my model on a dataset that I’ve normalized. I thought I only had to normalize the training set, not the validation or test sets. However, my validation loss was very inconsistent and never converged until I changed the transforms and added the normalization transform to the validation set as well. Then the validation loss started to behave as expected.

My question is: when I normalize my training set, must I normalize my validation and test sets as well? If so, why? And when I have a single image to predict, will I have to normalize it too?

Nikronic · July 17, 2020, 7:13am

Hi,

Normalization is a transformation of the input. Say your data contains the weights of humans: if you train the model on normalized data, it learns to predict from inputs in a range such as [-1, 1], and it produces outputs in a similar range. If you then test the model on the original values, for instance in [1, 200], it has never seen inputs with such a different mean and variance, so it cannot predict the way it was trained to.
Test/validation is a step to check that the training stage was reliable, so everything needs to be consistent between training and test/validation. If it is not, how can you be sure the model is working properly? The same applies to a single image at prediction time: it needs the same normalization that was used during training.
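
To make the idea concrete, here is a minimal sketch in plain Python. The weight values and the helper names `fit_normalizer` and `normalize` are made up for illustration and are not from any library. It assumes standardization (subtract the mean, divide by the standard deviation), with the statistics computed on the training set only and then reused for every other input.

```python
from statistics import mean, pstdev


def fit_normalizer(train_values):
    """Compute mean and standard deviation from the training data only."""
    mu = mean(train_values)
    sigma = pstdev(train_values)
    return mu, sigma


def normalize(values, mu, sigma):
    """Apply the given (training-set) statistics to any list of values."""
    return [(v - mu) / sigma for v in values]


# Hypothetical body weights in kg, invented for this example.
train = [50.0, 60.0, 70.0, 80.0, 90.0]
val = [55.0, 85.0]
single_sample = [72.0]

# Statistics come from the training set only.
mu, sigma = fit_normalizer(train)

# The same mu and sigma are applied to train, validation, and a single input.
train_norm = normalize(train, mu, sigma)
val_norm = normalize(val, mu, sigma)
single_norm = normalize(single_sample, mu, sigma)
```

With image data the pattern is the same: whatever normalization transform is in the training pipeline would also go into the validation, test, and inference pipelines, using the same mean and standard deviation.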

Bests

Manoel · July 20, 2020, 1:16pm

Perfect. Thank you very much.