Which activation function to use depending on your data?

Hi everyone,

This is a general question I have about the relationship between the activation function and the data.

Imagine our data (images), after normalization, is centered at 0 and takes values between -1 and 1. This means our network should output images whose values are also between -1 and 1.
So, for example, if we use ReLU as the activation function after each layer, we are zeroing out negative values during training, so the network will never be able to correctly predict our targets.

Is that right?
If I use ReLU, should I normalize my data between 0 and 1 instead of between -1 and 1?

P.S.: I ask this because I have seen cases where people normalize between -1 and 1 and use ReLU successfully.

Thanks in advance :smile:

Hi Deep!

No, this way of looking at it isn’t correct.

The reason is that the weights in your layers can be negative and
the biases are not constrained. So a negative input value can be
multiplied by a negative weight or have a positive bias added to it,
and therefore become positive before it reaches the ReLU activation.
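To make this concrete, here is a minimal numeric sketch (not from the original post, values chosen for illustration): a negative input passes through a linear step with a hand-picked negative weight and positive bias, and arrives at the ReLU as a positive value.

```python
def relu(z):
    """ReLU activation: zero out negative values."""
    return max(0.0, z)

# Hypothetical weight and bias for a single "neuron"
w, b = -2.0, 0.5   # negative weight, positive bias

x = -1.0                 # negative input in [-1, 1]
pre = w * x + b          # (-2) * (-1) + 0.5 = 2.5, positive before ReLU
out = relu(pre)

print(pre, out)  # 2.5 2.5 -- the negative input survives as a positive output
```

So the network can learn weights and biases that route negative inputs around the "dead" region of ReLU; nothing forces information in negative pixel values to be lost.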


K. Frank