Scaling target for LSTM regression?

I’m trying to do regression with an LSTM and mini batching.
So I have 7 features of different ranges, therefore I’m scaling them between 0 and 1.
My first attempt was not to scale the target variable, which ranges from 500 to 1000, but then the model couldn’t learn anything. After reading some posts, I understand that’s because the initial outputs of my model are too far away from the targets, so I get huge losses.

If I also scale the target variable, the model seems to learn quite well.
Within an epoch I’m calculating the MSE after the model has seen each batch. At the end of an epoch I sum up the per-batch losses and divide by the number of batches, right?
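A minimal sketch of that per-epoch averaging (the loss values and batch sizes below are hypothetical, standing in for what you’d collect via `loss.item()` each iteration):

```python
# Hypothetical per-batch MSE values collected during one epoch
batch_losses = [0.012, 0.010, 0.008]

# Plain mean over batches -- what the question describes
epoch_loss = sum(batch_losses) / len(batch_losses)
print(epoch_loss)

# Caveat: if the last batch is smaller (common with drop_last=False),
# the plain mean is slightly biased. Weighting each batch loss by its
# batch size gives the exact per-sample average instead:
batch_sizes = [32, 32, 16]  # hypothetical batch sizes
weighted_loss = sum(l * n for l, n in zip(batch_losses, batch_sizes)) / sum(batch_sizes)
print(weighted_loss)
```

The difference is tiny for large datasets, but worth knowing if your batches vary in size.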

How can I interpret this loss value now? It’s really small, about 0.01, but isn’t that only because I scaled my target between 0 and 1? How can I get from this value to the “real” MSE on the original target values? That MSE would be quite a bit bigger, right?
Does 0.01 really show how well the model performs on my actual, original target values?

Thanks for helping

You could use the scaled targets for training and just re-scale the outputs to the original range to calculate the “real” loss on the validation set. E.g. if you normalize the target values by dividing by 1000, you should multiply the outputs by 1000 and calculate the MSE on the original targets for debugging purposes.
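A sketch of this inversion, assuming min-max scaling with the 500–1000 range from the question (the 0.01 loss value is hypothetical). Because the additive shift cancels inside the squared difference, the original-scale MSE is just the scaled MSE times the range squared:

```python
# Hypothetical min-max scaling of a target that ranges from 500 to 1000
y_min, y_max = 500.0, 1000.0
span = y_max - y_min  # 500

def unscale(y_scaled):
    # Invert y_scaled = (y - y_min) / (y_max - y_min)
    return y_scaled * span + y_min

# For min-max scaling, MSE_original = MSE_scaled * span**2
# (the y_min shift cancels when subtracting prediction from target)
scaled_mse = 0.01          # hypothetical training loss on scaled targets
real_mse = scaled_mse * span ** 2
real_rmse = real_mse ** 0.5
print(real_mse, real_rmse)
```

So a scaled MSE of 0.01 here corresponds to an original-scale MSE of 2500, i.e. an RMSE of about 50 units, which is a much more interpretable number on a 500–1000 target.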