Custom loss function: gradients are None

Hi all,
I wrote the following code to compute an IoU loss, but the gradients are None. I suspect the problem is in the first line of the forward function (where the new tensor out is created), but I have not been able to fix it. Any help would be appreciated.
Thanks

import torch
import torch.nn as nn

def intersection(a, b):
    '''
        input: 2 boxes (a,b)
        output: overlapping area, if any
    '''
    top = max(a[0], b[0])
    left = max(a[1], b[1])
    bottom = min(a[2], b[2])
    right = min(a[3], b[3])
    h = max(bottom - top, 0)
    w = max(right - left, 0)
    return h * w

def union(a, b):
    a_area = (a[2] - a[0]) * (a[3] - a[1])
    b_area = (b[2] - b[0]) * (b[3] - b[1])
    return a_area + b_area - intersection(a, b)

def iou(a, b):
    '''
        input: 2 boxes (a,b)
        output: Intersection/Union
    '''
    '''
    U = union(a,b).float()
    if U == 0:
        return 0
    out=intersection(a,b) / U
    return out

class IOU_Loss(nn.Module):
    def __init__(self):
        super(IOU_Loss,self).__init__()
    def forward(self,pred,target):
        out=torch.tensor([iou(pred[i],target[i]) for i in range(target.size()[0])])
        out=1-out.mean()
        out.requires_grad=True
        return out

l=IOU_Loss()
x=torch.randn(2,4)
x.requires_grad=True
y=torch.randn(2,4)
loss=l(x,y)
loss.backward()
print(x.grad) # outputs None

Hi,

You should not set the requires_grad field by hand during the forward pass.
If the output has requires_grad=False while the inputs had requires_grad=True, it means a non-differentiable operation was applied along the way, so gradients cannot be computed.

In particular, if I remember correctly, the IoU loss written as a mathematical function is non-differentiable (or gives gradients of 0 everywhere).
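
To make the requires_grad point concrete, here is a minimal sketch (the variable names `parts` and `packed` are made up for illustration, and the squared sum just stands in for a per-box IoU). It shows that packing graph-connected values into torch.tensor() produces a tensor with no history, and that forcing requires_grad afterwards only creates a fresh leaf that is not connected to x:

```python
import torch

x = torch.randn(2, 4, requires_grad=True)

# Each entry is a 0-dim tensor that is still connected to x.
parts = [(x[i] ** 2).sum() for i in range(x.size(0))]

# torch.tensor() copies the values into a new tensor with no autograd history.
packed = torch.tensor(parts)
print(packed.requires_grad, packed.grad_fn)  # False None

# Forcing requires_grad turns `loss` into a new leaf; it does not reconnect it to x.
loss = 1 - packed.mean()
loss.requires_grad = True
loss.backward()
print(x.grad)  # None
```

Recent PyTorch versions may also print a warning on the torch.tensor() line, which is a hint that the graph is being dropped there.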

Hi @albanD, when I remove out.requires_grad=True, I get this error: RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

Yes, that is what I would expect. Something in your code is non-differentiable, so gradients cannot be computed for it.

I think IoU is differentiable, since it only involves basic arithmetic operations (multiplication, addition, subtraction). Is the problem in the out tensor? My guess is that the graph breaks there. Am I correct?

@albanD I removed the new tensor (out) and experimented. You are right, the gradients are 0 everywhere.

So the “no gradient” problem comes from packing the values into a new torch.tensor(). If you use torch.stack() to build the Tensor holding all the IoU values, backprop will work fine.
But the gradient will still be 0 everywhere.
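
A rough sketch of the torch.stack() version, under a few assumptions: boxes are (top, left, bottom, right) as in the question, the helpers are rewritten with tensor ops (torch.maximum, torch.minimum, clamp) instead of Python max/min, and the example boxes are invented. In this sketch the gradient is exactly zero when the two boxes do not overlap, because the intersection is clamped to 0, while overlapping boxes do receive a non-zero gradient:

```python
import torch

def iou(a, b):
    # Boxes are (top, left, bottom, right), as in the question.
    top = torch.maximum(a[0], b[0])
    left = torch.maximum(a[1], b[1])
    bottom = torch.minimum(a[2], b[2])
    right = torch.minimum(a[3], b[3])
    inter = (bottom - top).clamp(min=0) * (right - left).clamp(min=0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def iou_loss(pred, target):
    # torch.stack keeps every per-box IoU attached to the graph.
    ious = torch.stack([iou(p, t) for p, t in zip(pred, target)])
    return 1 - ious.mean()

target = torch.tensor([[0.0, 0.0, 2.0, 2.0]])

# Overlapping boxes: the loss depends on the prediction.
pred_overlap = torch.tensor([[1.0, 1.0, 3.0, 3.0]], requires_grad=True)
iou_loss(pred_overlap, target).backward()
print(pred_overlap.grad)

# Disjoint boxes: the intersection is clamped to 0, so the gradient vanishes.
pred_disjoint = torch.tensor([[5.0, 5.0, 6.0, 6.0]], requires_grad=True)
iou_loss(pred_disjoint, target).backward()
print(pred_disjoint.grad)
```

With torch.randn inputs like the ones in the question, most box pairs are disjoint or degenerate, which is consistent with seeing zeros.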

Do you have any suggestions for avoiding this, or other loss functions for bounding box regression (other than L1 and L2)?

I am not very familiar with such work unfortunately.
But I’m sure you can find some ideas by searching for smoothed IoU online, or by looking at how other detection papers handle this (a regression task on the bounding box boundaries, if I recall correctly).
Maybe @fmassa has some insight on the state of the art way to do this?

I have added a smoothness term, and now I am getting non-zero gradients. But when I use torch.stack() to create the tensor, I get the following error: RuntimeError: grad can be implicitly created only for scalar outputs. Why is that?
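
The thread does not say which smoothness term was used. One common form, assumed here purely for illustration, adds a small constant to both the intersection and the union; the function name `smoothed_iou_loss` and the value `eps=1.0` are hypothetical choices, not the poster's code. A sketch:

```python
import torch

def smoothed_iou_loss(pred, target, eps=1.0):
    # Rows are boxes in (top, left, bottom, right) order.
    # eps is a hypothetical smoothing constant, not taken from the thread.
    tl = torch.maximum(pred[:, :2], target[:, :2])
    br = torch.minimum(pred[:, 2:], target[:, 2:])
    wh = (br - tl).clamp(min=0)
    inter = wh[:, 0] * wh[:, 1]
    size_p = pred[:, 2:] - pred[:, :2]
    size_t = target[:, 2:] - target[:, :2]
    union = size_p[:, 0] * size_p[:, 1] + size_t[:, 0] * size_t[:, 1] - inter
    # The smoothed ratio still depends on the union when the intersection is 0.
    return 1 - ((inter + eps) / (union + eps)).mean()

target = torch.tensor([[0.0, 0.0, 2.0, 2.0]])
pred = torch.tensor([[5.0, 5.0, 6.0, 6.0]], requires_grad=True)  # disjoint from target

loss = smoothed_iou_loss(pred, target)
loss.backward()
print(pred.grad)
```

Note that for disjoint boxes this gradient only acts through the union term, so it pushes the predicted box to shrink rather than to move toward the target.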

Solved. I forgot to take the mean. Thanks.
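
For anyone hitting the same error: torch.stack() returns one value per box, and backward() without arguments only works on a scalar. A minimal sketch (the `x[i] * 2` term is a stand-in for the per-box IoU):

```python
import torch

x = torch.randn(3, requires_grad=True)

# One value per "box": a tensor of shape (3,), not a scalar.
per_box = torch.stack([x[i] * 2 for i in range(3)])

message = ""
try:
    per_box.backward()
except RuntimeError as err:
    message = str(err)
print(message)

# Reducing to a scalar first makes backward() work.
per_box = torch.stack([x[i] * 2 for i in range(3)])
per_box.mean().backward()
print(x.grad)
```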

You can try the GIoU loss or the DIoU loss for box regression.
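
As a rough illustration of the GIoU idea, here is a plain PyTorch sketch. It assumes valid boxes in (top, left, bottom, right) order, and the function name `giou_loss` and the `eps` value are my own choices rather than a library API. GIoU subtracts from the IoU the fraction of the smallest enclosing box that is not covered by the union, so disjoint boxes still get a useful gradient:

```python
import torch

def giou_loss(pred, target, eps=1e-7):
    # Rows are valid boxes in (top, left, bottom, right) order.
    tl = torch.maximum(pred[:, :2], target[:, :2])
    br = torch.minimum(pred[:, 2:], target[:, 2:])
    wh = (br - tl).clamp(min=0)
    inter = wh[:, 0] * wh[:, 1]

    size_p = pred[:, 2:] - pred[:, :2]
    size_t = target[:, 2:] - target[:, :2]
    union = size_p[:, 0] * size_p[:, 1] + size_t[:, 0] * size_t[:, 1] - inter
    iou = inter / (union + eps)

    # Smallest box enclosing both the prediction and the target.
    enc_tl = torch.minimum(pred[:, :2], target[:, :2])
    enc_br = torch.maximum(pred[:, 2:], target[:, 2:])
    enc_wh = enc_br - enc_tl
    enc_area = enc_wh[:, 0] * enc_wh[:, 1]

    giou = iou - (enc_area - union) / (enc_area + eps)
    return (1 - giou).mean()

target = torch.tensor([[0.0, 0.0, 2.0, 2.0]])
pred = torch.tensor([[5.0, 5.0, 6.0, 6.0]], requires_grad=True)  # disjoint from target

loss = giou_loss(pred, target)
loss.backward()
print(loss.item(), pred.grad)
```

If torchvision is available, its ops module also ships IoU-style box losses, which are probably a safer choice than a hand-rolled version.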