By default, the criterions in the nn
package indeed don't compute the gradient w.r.t. the target.
If you write MSE as:
def mse_loss(input, target):
    # ** is the power operator; ^ is bitwise XOR in Python
    return torch.sum((input - target) ** 2) / input.nelement()
Then you can indeed compute the gradient w.r.t. both input and target.
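Here is a minimal sketch of how that could be checked, assuming a recent PyTorch version where tensors take requires_grad directly. The example values are made up for illustration, and the function is the hand-written one from this post, not the nn criterion:

```python
import torch

def mse_loss(input, target):
    # Built only from differentiable tensor ops, so autograd can
    # track gradients through both arguments.
    return torch.sum((input - target) ** 2) / input.nelement()

# Illustrative values; both tensors ask for gradients.
input = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
target = torch.tensor([0.0, 2.0, 5.0], requires_grad=True)

loss = mse_loss(input, target)
loss.backward()

# d(loss)/d(input) = 2 * (input - target) / N,
# and d(loss)/d(target) is the negation of that.
print(input.grad)
print(target.grad)
```

Since the loss only depends on input - target, the two gradients should come out equal in magnitude and opposite in sign.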