A question about normalisation ranges and their effectiveness

Hello PyTorch community!

I have a question regarding the use of different normalisation ranges and their benefits in training a neural network model.

Currently, I have normalised my inputs to the neural network to be between 1.0 and 10.0 (reinforcement learning), but I was wondering whether a normalisation range of -1.0 to 1.0, or even 0.0 to 1.0, would improve training speed and stability.

Additional information: the neural network (for PPO) uses the tanh activation function, with 3 layers of 64 nodes each.
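
For context, a rough sketch of what I mean (assuming a plain MLP; the input/output sizes below are placeholders, not my actual environment dimensions):

import torch.nn as nn

# Rough sketch of the architecture described above: 3 layers of 64 units with tanh.
obs_dim, act_dim = 8, 2  # placeholder sizes
policy_net = nn.Sequential(
    nn.Linear(obs_dim, 64), nn.Tanh(),
    nn.Linear(64, 64), nn.Tanh(),
    nn.Linear(64, 64), nn.Tanh(),
    nn.Linear(64, act_dim),
)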

This is the normalisation code block I am using:

import functools
import operator

import numpy as np
from sklearn import preprocessing


def normalisation_of_vector(self, vector, lower_bound=1, upper_bound=10):
    '''
    Min-max scale a 1-D vector into [lower_bound, upper_bound].
    Might change to between 0 and 1 or -1 and 1.
    '''
    # Reshape to a column vector, as MinMaxScaler expects a 2-D array.
    np_list = np.array(vector).reshape(-1, 1)
    normalization_scaler = preprocessing.MinMaxScaler(feature_range=(lower_bound, upper_bound))
    normalized_np_list = normalization_scaler.fit_transform(np_list)
    normalized_literal_list = normalized_np_list.tolist()

    # Flatten the list of single-element lists back into a flat list.
    flattened_list = functools.reduce(operator.concat, normalized_literal_list)

    # Round each value to 2 decimal places.
    rounded_flattened_list = list(np.around(np.array(flattened_list), 2))

    return rounded_flattened_list

Any help is greatly appreciated!

Inputs centred on 0 are definitely preferable, and a scale around unity is also a good default choice given the initialisations that are used by default in the various modules provided by PyTorch. So I’d rather use (-1, 1) than (1, 10).
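
For example, getting a (-1, 1) range only requires changing the bounds passed to MinMaxScaler; a quick sketch reusing the scaler from your snippet (the input values are placeholders):

import numpy as np
from sklearn import preprocessing

# Sketch: the same min-max scaling as in the question, but into (-1, 1).
vector = [3.0, 7.5, 1.2, 9.9]  # placeholder values
scaler = preprocessing.MinMaxScaler(feature_range=(-1, 1))
scaled = scaler.fit_transform(np.array(vector).reshape(-1, 1)).ravel()
print(scaled)  # every value now lies in [-1, 1]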

@vmoens thank you for your reply! Could I ask what you mean by “a scale around unity”?

It means that things should be normalised such that their distribution approaches a standard normal distribution (mean 0 and standard deviation of 1), e.g.

x = (x - mean(sample_x)) / std(sample_x)
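
A minimal sketch of that in PyTorch (the tensor shapes, the placeholder data, and the small epsilon are my own additions):

import torch

# Standardise a batch of observations to roughly zero mean and unit standard deviation.
sample_x = torch.randn(1000, 8) * 5.0 + 3.0   # placeholder batch of observations
mean = sample_x.mean(dim=0)
std = sample_x.std(dim=0)
x_normalised = (sample_x - mean) / (std + 1e-8)  # epsilon guards against division by zero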