Hi all,
I would like to use the RMSE loss instead of MSE. From what I saw in the PyTorch documentation, there is no built-in function. Any ideas how this could be implemented?
Wouldn’t it work if you just call torch.sqrt() on the output of nn.MSELoss?
import torch
import torch.nn as nn

x = torch.randn(5, 10, requires_grad=True)
y = torch.randn(5, 10)

criterion = nn.MSELoss()
loss = torch.sqrt(criterion(x, y))
loss.backward()
print(x.grad)
I think the solution from @ptrblck is the best (because it's the simplest one).
Just for fun, you can also do the following:
# create a function (this is my favorite choice)
def RMSELoss(yhat, y):
    return torch.sqrt(torch.mean((yhat - y)**2))

criterion = RMSELoss
loss = criterion(yhat, y)
# create a nn class (just-for-fun choice :-)
class RMSELoss(nn.Module):
    def __init__(self):
        super().__init__()
        self.mse = nn.MSELoss()

    def forward(self, yhat, y):
        return torch.sqrt(self.mse(yhat, y))

criterion = RMSELoss()
loss = criterion(yhat, y)
You should be careful with NaN, which will appear if the MSE is 0. Something like this would probably be better:
class RMSELoss(nn.Module):
    def __init__(self, eps=1e-6):
        super().__init__()
        self.mse = nn.MSELoss()
        self.eps = eps

    def forward(self, yhat, y):
        loss = torch.sqrt(self.mse(yhat, y) + self.eps)
        return loss
The sqrt of 0 is 0, not NaN:
>>> torch.sqrt(torch.zeros(1))
tensor([0.])
Of course, the issue is during the backward pass: the derivative of sqrt(x) is 1/(2·sqrt(x)), which is infinite at x = 0, so the chain rule ends up multiplying 0 by infinity and produces NaN.
>>> mse = nn.MSELoss()
>>> yhat = torch.zeros(1, requires_grad=True)
>>> y = torch.zeros(1)
>>> loss = torch.sqrt(mse(yhat,y))
>>> loss.backward()
>>> yhat.grad
tensor([nan])
Using the simple module I wrote above:
>>> rmse = RMSELoss()
>>> yhat = torch.zeros(1, requires_grad=True)
>>> y = torch.zeros(1)
>>> loss = rmse(yhat,y)
>>> loss.backward()
>>> yhat.grad
tensor([0.])
Hi, I wonder if that's exactly the same as RMSE when dealing with a batch size larger than 1, e.g. when target and prediction are tensors of shape [2, C, 256, 256]. Per sample we have:
MSE_0 = MSE(prediction[0,:,:,:], target[0,:,:,:])
MSE_1 = MSE(prediction[1,:,:,:], target[1,:,:,:])
The RMSE we want is the per-sample RMSE averaged over the batch:
[ SQRT(MSE_0) + SQRT(MSE_1) ] / 2
but torch.sqrt(criterion(x, y)) with reduction='mean' gives the square root of the pooled mean instead:
SQRT( (MSE_0 + MSE_1) / 2 )
and sqrt(M1 + M2) is not equal to sqrt(M1) + sqrt(M2). Writing M1 and M2 for the per-sample sums of squared errors over N elements each, the difference is:
[ sqrt(M1/N) + sqrt(M2/N) ] / 2 is not equal to sqrt( (M1 + M2) / (2N) )
Please correct me if my understanding is wrong. Thanks
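To make the difference concrete, here is a quick sketch (the shapes and channel count are chosen arbitrarily for illustration, not taken from the posts above):

import torch
import torch.nn as nn

torch.manual_seed(0)
# Illustrative shapes: batch of 2 samples, 3 channels, 256x256 each.
prediction = torch.randn(2, 3, 256, 256)
target = torch.randn(2, 3, 256, 256)

# Pooled version: sqrt of the mean over all elements of the batch.
pooled_rmse = torch.sqrt(nn.MSELoss()(prediction, target))

# Per-sample version: reduce each sample to its own MSE, take the
# sqrt per sample, then average the square roots over the batch.
per_elem = nn.MSELoss(reduction='none')(prediction, target)
per_sample_mse = per_elem.flatten(start_dim=1).mean(dim=1)  # shape [2]
per_sample_rmse = torch.sqrt(per_sample_mse).mean()

print(pooled_rmse.item(), per_sample_rmse.item())  # generally differ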
Try adding an eps, such as eps = 1e-8, chosen according to your precision.
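As a rough guide (my own heuristic, not from this thread), torch.finfo exposes the machine epsilon of each floating-point dtype, and picking the loss eps somewhere above that scale is one reasonable starting point:

import torch

# Machine epsilon per dtype; an eps well above this avoids being
# swallowed by rounding (an assumption on my part, not a hard rule).
print(torch.finfo(torch.float32).eps)  # ~1.19e-07
print(torch.finfo(torch.float64).eps)  # ~2.22e-16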