I have been implementing Deep Hashing via Discrepancy Minimization for quite some time now, but I am stuck: the parameters of my model (a modified AlexNet) are not updating. There seems to be a problem with the gradient computation, because when I print loss.grad after calling loss.backward(), I get None. H has requires_grad=True, D_hat and delta have requires_grad=False, and lambda1 and lambda2 are scalars.
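For reference, the tensors involved look roughly like this (a simplified, self-contained sketch; the sizes and the random D_hat are only stand-ins for my actual data):

import torch

batch_size, hash_bits = 32, 48              # stand-in sizes
D_hat = torch.rand(batch_size, batch_size)  # pairwise term, requires_grad=False
delta = torch.zeros(batch_size, hash_bits)  # same shape as H, requires_grad=False
lambda1, lambda2 = 0.1, 0.1                 # plain scalars
# H itself comes out of the modified AlexNet, so it has requires_grad=True

The custom autograd Function computing the loss is: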
import torch
from torch.autograd import Function

class loss_with_H(Function):
    @staticmethod
    def forward(ctx, H, D_hat, delta, lambda1, lambda2):
        # with both regularizers switched off, delta plays no role in the loss
        if lambda1 == 0 and lambda2 == 0:
            delta = torch.zeros_like(H)
        temp = D_hat.t() + D_hat
        # save everything the backward pass needs
        ctx.save_for_backward(temp, H, delta)
        ctx.lambda1 = lambda1
        # L = tr((H^T D_hat + lambda1 * delta^T (D_hat^T + D_hat)) H + lambda2 * delta^T D_hat delta)
        L = torch.trace(
            torch.mm(torch.mm(H.t(), D_hat) + torch.mm(lambda1 * delta.t(), temp), H)
            + lambda2 * torch.mm(delta.t(), torch.mm(D_hat, delta))
        )
        return L

    @staticmethod
    def backward(ctx, grad_out):
        temp, H, delta = ctx.saved_tensors
        lambda1 = ctx.lambda1
        # only H receives a gradient; the remaining inputs are treated as constants
        grad_D_hat = grad_delta = grad_lambda1 = grad_lambda2 = None
        # dL/dH = (D_hat^T + D_hat)(H + lambda1 * delta)
        grad_H = torch.mm(temp, H + lambda1 * delta)
        return grad_H * grad_out, grad_D_hat, grad_delta, grad_lambda1, grad_lambda2
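This is roughly how I invoke the Function and run the backward pass (simplified; the small linear model and optimizer below are stand-ins for my modified AlexNet and actual training setup, reusing the tensor setup from the sketch above):

import torch.nn as nn

model = nn.Linear(4096, hash_bits)          # stand-in for the modified AlexNet
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
features = torch.randn(batch_size, 4096)    # stand-in for a batch of inputs

H = model(features)                         # requires_grad=True, but not a leaf tensor
loss = loss_with_H.apply(H, D_hat, delta, lambda1, lambda2)

optimizer.zero_grad()
loss.backward()
print(loss.grad)                            # this prints None
print(next(model.parameters()).grad)        # checking what reaches the network
optimizer.step()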
I am updating delta as follows:
def update_delta(D_hat, H, delta, lambda1, lambda2):
    # delta is updated outside of autograd, so no gradients are tracked here
    with torch.no_grad():
        temp = D_hat.t() + D_hat
        # gradient of the loss with respect to delta
        grad_delta = torch.mm(temp, lambda1 * H + lambda2 * delta)
        # sign-based update, shifted by H
        delta = -torch.sign(grad_delta) - H
    return delta
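Overall, the alternation I am running looks roughly like this (again a simplified sketch; num_epochs and loader are placeholders for my actual data pipeline, and the other names are the stand-ins from above):

num_epochs = 10                             # placeholder
for epoch in range(num_epochs):
    for features, D_hat in loader:          # loader is a placeholder for my data pipeline
        H = model(features)
        loss = loss_with_H.apply(H, D_hat, delta, lambda1, lambda2)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # alternate step: refresh delta with the sign update above
        delta = update_delta(D_hat, H, delta, lambda1, lambda2)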