I’ve searched for a while and I can’t find any examples or conclusive guidance on how to implement input or output scaling.
Situation:
I am training an RNN on sequence input data to predict outputs (many-to-many). Both the inputs and outputs are continuous-valued, so I should scale them (to zero mean and unit variance).
Obviously there is no built-in function to do scaling in PyTorch. I thought `transforms.Normalize` might be suitable, but every time I try to use it on my data I get `TypeError: tensor is not a torch image`. Is it only designed for images?
I can easily implement scaling by hand or as a custom transformer, as others have, so that is not the main issue. The bigger question is where to put the code.
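For context, the by-hand version I have in mind is something like the minimal sketch below (the class and method names are mine, loosely following the scikit-learn scaler interface; this is an illustration, not a recommendation):

```python
import torch


class StandardScaler:
    """Minimal standard scaler for 2-D tensors: zero mean, unit variance per feature."""

    def __init__(self):
        self.mean = None
        self.std = None

    def fit(self, x):
        # x: (num_samples, num_features); statistics are computed per feature
        self.mean = x.mean(dim=0)
        self.std = x.std(dim=0)
        return self

    def transform(self, x):
        return (x - self.mean) / self.std

    def inverse_transform(self, x):
        # Needed for mapping predictions back to the original scale
        return x * self.std + self.mean
```

The `inverse_transform` is the part that forces the "where does this live?" question, since whatever object holds the fitted statistics must still be reachable at prediction time.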
- One option is to add it to my custom dataset: initialize the scaler's mean and variance when the dataset is constructed, then apply the transformation in the dataset's `__getitem__` method. Since I'm also scaling the outputs, I would have to hang on to the transform object so I can apply the inverse operation to the predictions before comparing them with the targets.
- Another option is to add it to the data loader.
- Some people think it should be added to the estimator model itself (the RNN in this case).
- Keep it out of all of these objects and handle it manually in the high-level code that runs the training and prediction tasks.
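To make the first option concrete, here is a rough sketch of what I mean by putting it in the dataset (all names are hypothetical; statistics are fitted once at construction and `__getitem__` applies them, with a helper for the inverse on predictions):

```python
import torch
from torch.utils.data import Dataset


class ScaledSequenceDataset(Dataset):
    """Hypothetical dataset that standardizes inputs and targets in __getitem__."""

    def __init__(self, inputs, targets):
        # inputs:  (num_seqs, seq_len, in_features)
        # targets: (num_seqs, seq_len, out_features)
        self.inputs = inputs
        self.targets = targets
        # Fit per-feature statistics over all sequences and time steps at init
        flat_in = inputs.reshape(-1, inputs.shape[-1])
        flat_out = targets.reshape(-1, targets.shape[-1])
        self.in_mean, self.in_std = flat_in.mean(0), flat_in.std(0)
        self.out_mean, self.out_std = flat_out.mean(0), flat_out.std(0)

    def __len__(self):
        return len(self.inputs)

    def __getitem__(self, idx):
        x = (self.inputs[idx] - self.in_mean) / self.in_std
        y = (self.targets[idx] - self.out_mean) / self.out_std
        return x, y

    def inverse_transform_output(self, y_scaled):
        # Map model predictions back to the original target scale
        return y_scaled * self.out_std + self.out_mean
```

The awkward part is exactly the one described above: the training/prediction code needs a reference to the dataset (or at least to `out_mean`/`out_std`) to call `inverse_transform_output` on the predictions.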
Please advise.