Hi, I wonder whether that is exactly the same as RMSE when the batch size is greater than 1.
e.g. target and prediction are both [2,1,256,256] tensors
MSE_0 = MSE(prediction[0,:,:,:], target[0,:,:,:])
MSE_1 = MSE(prediction[1,:,:,:], target[1,:,:,:])
The RMSE we want is:
SQRT( MSE_0) + SQRT( MSE_1)
torch.sqrt(nn.MSELoss()(x, y)) will give:
SQRT( MSE_0 + MSE_1)
so:
sqrt(M1+M2) is not equal to sqrt(M1) + sqrt(M2)
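A quick numeric check of that inequality, as a minimal sketch. The two MSE values are made up purely for illustration and do not come from any real model:

```python
import math

# Hypothetical per-sample MSE values, chosen only for illustration.
mse_0, mse_1 = 1.0, 9.0

# Square root taken after adding the two MSEs: sqrt(10), about 3.162.
sqrt_of_sum = math.sqrt(mse_0 + mse_1)

# Square root taken per sample, then added: 1 + 3 = 4.
sum_of_sqrts = math.sqrt(mse_0) + math.sqrt(mse_1)

print(sqrt_of_sum, sum_of_sqrts)
```

The two numbers only coincide in degenerate cases (for example when one of the MSEs is zero), so in general the order of the square root and the sum matters.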
With reduction it is even further off. We want:
Mean[ Mean (sqrt (MSE_0) ) + Mean(sqrt (MSE_1) ) ]
What we will get with reduction='mean' instead, I think, is:
sqrt (Mean(MSE_0) + Mean(MSE_1) )
so:
[sqrt(M1) / N + sqrt(M2)/N] /2 is not equal to sqrt (M1/N + M2/N)
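To compare the two quantities in PyTorch, here is a rough sketch. The random data, the single channel, and the scaling of the second sample are my own assumptions for the demo. It uses nn.MSELoss with reduction='none' to keep the per-element squared errors, so the square root can be taken per sample before averaging over the batch:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical data: batch of 2, one channel, 256x256 (shapes are assumptions).
prediction = torch.randn(2, 1, 256, 256)
target = torch.randn(2, 1, 256, 256)
# Scale the second target so the two samples have clearly different errors.
target[1] = target[1] * 3

# Square root of the MSE averaged over the whole batch (default reduction='mean').
batch_rmse = torch.sqrt(nn.MSELoss()(prediction, target))

# Per-element squared errors, then per-sample MSE, then per-sample RMSE.
per_element = nn.MSELoss(reduction='none')(prediction, target)
per_sample_mse = per_element.mean(dim=(1, 2, 3))
mean_of_rmse = torch.sqrt(per_sample_mse).mean()

print(batch_rmse.item(), mean_of_rmse.item())
```

Because the square root is concave, the mean of the per-sample RMSEs can never exceed the square root of the batch-level MSE, and the two are equal only when every sample has the same MSE.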
Please correct me if my understanding is wrong. Thanks.