Nn criterions don't compute the gradient w.r.t. targets

smth · June 21, 2017, 9:10pm

By default, the criterions in the nn package indeed don't compute the gradient w.r.t. targets.

if you write MSE as:

import torch

def mse_loss(input, target):
    # use ** for squaring; ^ is bitwise XOR in Python
    return torch.sum((input - target) ** 2) / input.data.nelement()

Then you can indeed compute the gradient w.r.t. both input and target, as long as target requires grad.
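
A minimal sketch of how this could be checked, assuming a recent PyTorch where plain tensors accept requires_grad=True (older releases would wrap them in Variable instead). The names pred and target and the shapes are made up for illustration:

```python
import torch

def mse_loss(input, target):
    # built from plain tensor ops, so autograd tracks both arguments
    return torch.sum((input - target) ** 2) / input.data.nelement()

# hypothetical data; both tensors ask for gradients
pred = torch.randn(4, 3, requires_grad=True)
target = torch.randn(4, 3, requires_grad=True)

loss = mse_loss(pred, target)
loss.backward()

# d(loss)/d(target) = -2 * (pred - target) / N, the negative of d(loss)/d(pred)
print(torch.allclose(target.grad, -pred.grad))
```

Since the loss depends only on the difference input - target, the two gradients should be exact negatives of each other, which makes for an easy sanity check.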