Hi

What is the recommended way to normalize sequence input data (3D tensors)?

Should I use a batchnorm layer, or create my own transform in the DataLoader?

Is there an example of normalizing sequence data that is not word embeddings?

Thanks!

Gilad

What kind of normalization are you thinking of applying?

You can compute torch.mean / torch.std over the sequence and normalize it by subtracting the mean and dividing by the standard deviation…
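Concretely, for a batch of sequences shaped `(batch, seq_len, features)`, a minimal sketch of that per-feature normalization might look like this (the shapes and the `1e-8` epsilon are my own choices, not from the thread):

```python
import torch

# toy batch of sequences: (batch, seq_len, features)
x = torch.randn(8, 20, 5) * 3.0 + 2.0

# per-feature statistics over the batch and time dimensions
mean = x.mean(dim=(0, 1), keepdim=True)   # shape (1, 1, 5)
std = x.std(dim=(0, 1), keepdim=True)

# subtract / divide; epsilon guards against a zero std
x_norm = (x - mean) / (std + 1e-8)
```

After this, each feature channel has roughly zero mean and unit variance across the batch and time steps.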

That makes me wonder.

Let's say the input sample is $x$.

So we feed the net $\frac{x - m}{v} = \frac{x}{v} - \frac{m}{v}$.

Now, if the first layer of the net is a linear layer, we have the bias term (assuming it is turned on).

If a value like $\frac{m}{v}$ needs to be subtracted from all samples, I'd assume it will be learned.

So I see the point in dividing, but what's the point in centering?

Yes, the centering "might" be learned by the bias term, if all the stars align (and your biases are initialized close to the right values). Giving a centering prior based on dataset statistics just helps. Scaling can also technically be learned by the convolution operation (the weights can learn to scale up or down), but whether the weights can learn good scaling depends on initialization, how the activation dynamic range changes over the network depth, etc.

Well, I'm not experienced in Deep Learning.

The question is whether the bias term indeed approaches zero for centered datasets.

I will check on MNIST just for my own knowledge.
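As a quicker sanity check than MNIST, here is a synthetic least-squares sketch of the same question (all names and numbers are my own invention): if both inputs and targets are centered, the constant column is orthogonal to the centered features, so the optimal bias comes out as (numerically) zero.

```python
import torch

torch.manual_seed(0)
x = torch.randn(1000, 3)                   # synthetic features
w_true = torch.tensor([1.5, -2.0, 0.5])
y = x @ w_true + 0.1 * torch.randn(1000)   # linear targets plus noise

# center both inputs and targets
xc = x - x.mean(dim=0)
yc = y - y.mean()

# least-squares fit with an explicit bias column
A = torch.cat([xc, torch.ones(1000, 1)], dim=1)
sol = torch.linalg.lstsq(A, yc.unsqueeze(1)).solution.squeeze()
weights, bias = sol[:3], sol[3]
# bias is ~0: centering makes the constant column orthogonal to the features
```

A gradient-trained linear layer would only approach this optimum, which is the "if all stars align" caveat from the previous reply.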

I have the same question. I can't understand why there aren't more examples of normalizing the inputs (and outputs, potentially). Looking at `torchvision.transforms.Normalize`, it says it is for normalizing "a tensor image with mean and standard deviation", which I don't think is the same as what we're talking about here.

In Scikit-Learn you simply add a `sklearn.preprocessing.StandardScaler` to your pipeline and it normalizes your dataset before training starts. As far as I can see there are no "built-in" ways to do this in PyTorch, or am I missing something?
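For what it's worth, the fit / transform / inverse_transform cycle of a `StandardScaler` is only a few lines of plain PyTorch (a sketch; the variable names are mine):

```python
import torch

data = torch.randn(100, 10) * 4.0 + 7.0   # toy 2-D dataset

# "fit": record the per-column statistics
mean = data.mean(dim=0)
std = data.std(dim=0)

# "transform": zero mean, unit variance per column
scaled = (data - mean) / std

# "inverse_transform": back to original units
restored = scaled * std + mean
```

Keeping `mean` and `std` around is what lets you convert predictions back to original units later.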

Are there any options other than these:

- Manually calculate the mean and standard deviation of your data, convert it at the beginning, and keep a record of those parameters.
- Build the above into your `torch.utils.data.dataset` if you are using a custom dataset and data loader.
- For output normalization, attach the parameters to your model and build the conversion back to original units into your model's `forward()` method (remembering to feed un-normalized target values to your test criterion).
- Or, do all your predictions and test evaluations in normalized units and only convert them back when you finally plot/save the results.
- Or import and use `sklearn.preprocessing.StandardScaler`?
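For the "build it into your dataset" option, a minimal sketch of baking precomputed statistics into a custom dataset might look like this (the class and variable names are hypothetical, not a PyTorch API):

```python
import torch
from torch.utils.data import Dataset, DataLoader

# hypothetical wrapper: normalizes each sample with precomputed statistics
class NormalizedDataset(Dataset):
    def __init__(self, tensors, targets, mean, std):
        self.tensors = tensors
        self.targets = targets
        self.mean = mean
        self.std = std

    def __len__(self):
        return len(self.tensors)

    def __getitem__(self, idx):
        x = (self.tensors[idx] - self.mean) / self.std
        return x, self.targets[idx]

raw = torch.randn(32, 20, 5) * 2.0 + 3.0   # (samples, seq_len, features)
targets = torch.randn(32)
mean = raw.mean(dim=(0, 1))                # per-feature statistics
std = raw.std(dim=(0, 1))

loader = DataLoader(NormalizedDataset(raw, targets, mean, std), batch_size=8)
batch_x, batch_y = next(iter(loader))
```

Computing the statistics once over the whole training set and passing them into the dataset keeps the normalization out of the training loop.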