How to deal with the labels having drastically different values?


I am dealing with a regression problem where the inputs are 98x98 images and the outputs are vectors of 16 elements.

Some examples of the outputs are:

[12023, 0.1, 2.0, 11982, 0.8, 1.2, 0.3, 0.9, 1.9, 1.1, 0.4, 0.5, 1.0, 0.9, 0.9, 1.7]
[11975, 0.6, 2.1, 11145, 0.4, 1.1, 0.9, 0.2, 1.3, 1.6, 0.1, 0.4, 1.5, 0.4, 0.8, 1.0]

As you can see, the first and the fourth elements are several orders of magnitude larger than the rest of the elements in each vector.

The question is whether this is going to affect the learning process negatively, and whether the labels need to be preprocessed somehow (e.g. normalized, or something else)?


This will affect your training if you simply use MSE: the squared errors on the first and the fourth dimensions dominate the loss, so the model will effectively neglect all the other dimensions. The easy solution is to normalize your outputs separately for each dimension.
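
To make the imbalance concrete, here is a small sketch in plain Python. The target is the first example vector from the question; the prediction is purely hypothetical (I assume every element is off by 10%), just to show where the squared error comes from.

```python
# Target vector copied from the question.
target = [12023, 0.1, 2.0, 11982, 0.8, 1.2, 0.3, 0.9,
          1.9, 1.1, 0.4, 0.5, 1.0, 0.9, 0.9, 1.7]

# Hypothetical prediction (an assumption for illustration):
# every element is off by the same relative amount, 10%.
prediction = [0.9 * t for t in target]

# Per-dimension squared errors and the resulting MSE.
squared_errors = [(p - t) ** 2 for p, t in zip(prediction, target)]
mse = sum(squared_errors) / len(squared_errors)

# Fraction of the total squared error that comes from the two
# large dimensions (indices 0 and 3, i.e. the first and fourth).
large_share = (squared_errors[0] + squared_errors[3]) / sum(squared_errors)

print(f"MSE: {mse:.1f}")
print(f"Share of the error from dimensions 0 and 3: {large_share:.8f}")
```

Even though every dimension is equally wrong in relative terms, the two large dimensions account for practically all of the loss, so the gradient barely sees the other fourteen.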


Thanks! I will try that!

Hi @omarfoq,
Which would be better:

  • to normalize the labels only across those dimensions that have huge values, or
  • to normalize labels across all dimensions (separately), so that the labels across dimensions have mean 0 and standard deviation 1?


It’s better to standardize all dimensions, because:

  • You may have a dimension whose scale is smaller than all the rest, and then it will be neglected.
  • I believe that standardizing helps during optimization.
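
A minimal sketch of per-dimension standardization, using only the standard library so it stays dependency-free. The helper names (`fit_scaler`, `standardize`, `unstandardize`) and the `eps` guard are my own choices for illustration, not from any library. The two label vectors are the ones from the question; in practice you would compute the statistics on the training labels only and reuse them for validation and test data.

```python
from statistics import mean, pstdev

# The two example label vectors from the question.
labels = [
    [12023, 0.1, 2.0, 11982, 0.8, 1.2, 0.3, 0.9,
     1.9, 1.1, 0.4, 0.5, 1.0, 0.9, 0.9, 1.7],
    [11975, 0.6, 2.1, 11145, 0.4, 1.1, 0.9, 0.2,
     1.3, 1.6, 0.1, 0.4, 1.5, 0.4, 0.8, 1.0],
]

def fit_scaler(rows, eps=1e-8):
    """Compute the mean and standard deviation of each output dimension."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # eps guards against division by zero for a constant dimension.
    stds = [max(pstdev(col), eps) for col in columns]
    return means, stds

def standardize(row, means, stds):
    """Map raw labels to zero mean and unit standard deviation per dimension."""
    return [(v - m) / s for v, m, s in zip(row, means, stds)]

def unstandardize(row, means, stds):
    """Map standardized values (e.g. model outputs) back to the original scale."""
    return [v * s + m for v, m, s in zip(row, means, stds)]

means, stds = fit_scaler(labels)
scaled = [standardize(row, means, stds) for row in labels]      # train on these
restored = [unstandardize(row, means, stds) for row in scaled]  # undo at inference
```

The model is then trained against `scaled`, and its predictions are passed through `unstandardize` to get values back in the original units.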