Confusion about image values when using tanh as an activation function

Hello,

I have seen in many GAN repositories where tanh is used as the generator's output activation that the input images are not normalized to the range [-1, 1] but to [0, 1].

I noticed the same thing when I tried to replicate some of these networks and train them. When using images normalized to the range [-1, 1] I get poor images in the first epochs, whereas in the [0, 1] case the training “seems” more stable, both in terms of the losses and of the generated images.
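To make the comparison concrete, this is roughly how the images are prepared in the two cases (torchvision-style preprocessing; the exact transforms vary between repositories):

```python
from torchvision import transforms

# Case A: images in [0, 1] -- ToTensor alone already scales uint8 pixels to [0, 1].
to_unit_range = transforms.ToTensor()

# Case B: images in [-1, 1] -- additionally shift/scale with mean=0.5, std=0.5,
# which maps [0, 1] to [-1, 1] and matches a plain tanh output.
to_signed_range = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
])
```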

More specifically, I want to ask:

  1. Is it “valid” to train such a network with images in the range [0, 1] and tanh as the output activation?

  2. Is it valid to instead use (tanh(x) + 1) / 2 as the outermost activation of the generator, to force the output values into the range [0, 1]? (See the sketch below.)
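For question 2, this is a minimal sketch of what I mean (PyTorch-style; the layer shapes are just illustrative):

```python
import torch
import torch.nn as nn

class ScaledTanh(nn.Module):
    """Maps a tanh output from [-1, 1] to [0, 1] via (tanh(x) + 1) / 2."""
    def forward(self, x):
        return (torch.tanh(x) + 1.0) / 2.0

# Example: final block of a generator that should emit images in [0, 1]
generator_head = nn.Sequential(
    nn.ConvTranspose2d(64, 3, kernel_size=4, stride=2, padding=1),
    ScaledTanh(),  # instead of a plain nn.Tanh()
)
```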

I’m not sure what “valid” means in this case. Are you concerned about some theoretical claim and think this normalization approach is “invalid”?
If so, I wouldn’t be too concerned about it (although I would still be interested to hear your concerns), as I would claim that a lot of the explanations are written after the model training (finally) worked. :wink:

My main concern is how one would decide on the appropriate normalization range for the input images. Is the safest way to run experiments with all the available combinations, i.e. a tanh activation with images in [0, 1] and in [-1, 1], and decide based on the crispness of the generated images?
Furthermore, if special losses are used (e.g. style or perceptual losses computed with a pre-trained VGG), those networks were trained with images normalized to the range [0, 1], so another mismatch occurs.
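For example, the way I currently picture it, a generator that emits images in [-1, 1] would need a re-mapping step before the perceptual network. A minimal sketch, assuming torchvision's VGG16 and the usual ImageNet mean/std statistics (which may not match every pre-trained model):

```python
import torch
import torchvision.models as models

# Feature extractor of a pre-trained VGG16 (assumed to expect [0, 1] images
# normalized with ImageNet statistics).
vgg_features = models.vgg16(pretrained=True).features.eval()

imagenet_mean = torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1)
imagenet_std = torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1)

def perceptual_features(fake_images):
    """Map generator output in [-1, 1] to the range/statistics the VGG expects."""
    x = (fake_images + 1.0) / 2.0            # [-1, 1] -> [0, 1]
    x = (x - imagenet_mean) / imagenet_std   # ImageNet normalization
    return vgg_features(x)
```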

In any case, as I understand your answer, do you mean that the network is “powerful” enough to compensate for all these “mismatches” and eventually learns appropriate weights regardless of the image scale?