Custom loss function: gradients are None

Gkv · September 30, 2019, 2:29pm

Hi all,
I wrote the following code to calculate the IoU loss, but the gradients are None. I guess the problem is in the first line of the forward function (creating a new tensor out), but I am unable to resolve it. Please help me resolve it.
Thanks

import torch
import torch.nn as nn

def intersection(a, b):
    '''
        input: 2 boxes (a,b)
        output: overlapping area, if any
    '''
    top = max(a[0], b[0])
    left = max(a[1], b[1])
    bottom = min(a[2], b[2])
    right = min(a[3], b[3])
    h = max(bottom - top, 0)
    w = max(right - left, 0)
    return h * w

def union(a, b):
    a_area = (a[2] - a[0]) * (a[3] - a[1])
    b_area = (b[2] - b[0]) * (b[3] - b[1])
    return a_area + b_area - intersection(a,b)

def iou(a, b):
    '''
        input: 2 boxes (a,b)
        output: Intersection/Union
    '''
    U = union(a,b).float()
    if U == 0:
        return 0
    out=intersection(a,b) / U
    return out

class IOU_Loss(nn.Module):
    def __init__(self):
        super(IOU_Loss,self).__init__()
    def forward(self,pred,target):
        out=torch.tensor([iou(pred[i],target[i]) for i in range(target.size()[0])])
        out=1-out.mean()
        out.requires_grad=True
        return out
l=IOU_Loss()
x=torch.randn(2,4)
x.requires_grad=True
y=torch.randn(2,4)
loss=l(x,y)
loss.backward()
print(x.grad) # outputs None

albanD · September 30, 2019, 2:52pm

Hi,

You should not modify the requires_grad field during the forward pass.
If it is False on the output while it was True for the inputs, it means that you performed non-differentiable operations, so gradients cannot be computed.

In particular, I seem to remember that the IoU loss, if you write it as a mathematical function, is non-differentiable (or gives gradients of 0 everywhere).
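The point about requires_grad can be checked directly. Below is a minimal sketch, not the original poster's code: the names x, y and rebuilt are hypothetical, and the copy that torch.tensor() makes is modeled with an explicit .item() call on each element. It suggests that re-wrapping results in a new tensor yields an output with no grad_fn, and that setting requires_grad=True afterwards only creates a fresh leaf, so nothing flows back to the input.

```python
import torch

x = torch.randn(4, requires_grad=True)
y = x * 2  # differentiable so far: y has a grad_fn

# Assumption: .item() stands in for the value copy done by torch.tensor([...]).
# The new tensor holds plain numbers and has no link back to x.
rebuilt = torch.tensor([v.item() for v in y])
flag_before = rebuilt.requires_grad  # False: the graph was cut here

# Forcing the flag turns `rebuilt` into a new leaf; it does not reconnect x.
rebuilt.requires_grad = True
loss = rebuilt.mean()
loss.backward()

print(flag_before)   # False
print(x.grad)        # None
print(rebuilt.grad)  # gradients land on the new leaf instead
```

Checking out.requires_grad (or out.grad_fn) right after the suspicious line is usually a quick way to locate where the graph breaks.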

Gkv · September 30, 2019, 3:32pm

Hi @albanD, when I remove out.requires_grad=True, I get this error: RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn.

albanD · September 30, 2019, 3:32pm

Yes, that is what I would expect. You have something that is non-differentiable, so autograd cannot compute gradients for it.

Gkv · September 30, 2019, 3:36pm

I think IoU is differentiable since it involves only basic arithmetic operations (multiplication, addition, subtraction). Is there a problem with the out tensor? I guess the graph is breaking there. Am I correct?

Gkv · September 30, 2019, 3:40pm

@albanD I removed the new tensor (out) and experimented. You are right, the gradients are 0 everywhere.

albanD · September 30, 2019, 3:41pm

So the “no gradient” comes from the fact that you pack the values into a new torch.tensor(). If you use torch.stack() to create the Tensor with all the IoU values, it will backprop fine.
But the gradient will still be 0 everywhere.
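The torch.stack() suggestion can be sketched as follows. This is an illustrative rewrite under assumptions, not the original poster's final code: boxes are taken to be (top, left, bottom, right) as in the question, the helper names iou and iou_loss are hypothetical, and Python's built-in max/min are swapped for torch.max/torch.min and clamp so every step stays a tensor operation. In this sketch the gradient is exactly zero for a pair of boxes that does not overlap, which may be what random inputs mostly produce, and non-zero for a pair that does overlap.

```python
import torch

def iou(a, b):
    # Boxes are (top, left, bottom, right), as in the original post.
    top = torch.max(a[0], b[0])
    left = torch.max(a[1], b[1])
    bottom = torch.min(a[2], b[2])
    right = torch.min(a[3], b[3])
    # clamp keeps the result a tensor even when the boxes do not overlap.
    inter = (bottom - top).clamp(min=0) * (right - left).clamp(min=0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def iou_loss(pred, target):
    # torch.stack joins the per-box results without leaving the graph.
    ious = torch.stack([iou(p, t) for p, t in zip(pred, target)])
    return 1 - ious.mean()

# First pair overlaps, second pair does not.
pred = torch.tensor([[0., 0., 2., 2.], [0., 0., 1., 1.]], requires_grad=True)
target = torch.tensor([[1., 1., 3., 3.], [5., 5., 6., 6.]])

loss = iou_loss(pred, target)
loss.backward()
print(pred.grad[0])  # non-zero entries: the boxes overlap
print(pred.grad[1])  # all zeros: no overlap, so IoU is flat here
```

This sketch assumes well-formed boxes with positive area; a zero union would need separate handling.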

Gkv · September 30, 2019, 3:44pm

Do you have any suggestions to avoid this or any other loss functions for Bounding box regression (other than L1 and L2)?

albanD · September 30, 2019, 3:47pm

I am not very familiar with such work unfortunately.
But I’m sure you can find some ideas by searching for smoothed IoU online or by looking at how other detection papers do this (a regression task on the bounding box boundaries, if I recall correctly).
Maybe @fmassa has some insight on the state of the art way to do this?

Gkv · September 30, 2019, 4:07pm

I have added a smoothness term, and now I am getting non-zero gradients. But when I use torch.stack() to create the tensor, I get the following error: RuntimeError: grad can be implicitly created only for scalar outputs. Why?

Gkv · September 30, 2019, 4:12pm

Solved. I forgot to take the mean. Thanks.
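For readers who hit the same error, here is a small sketch of why the mean matters. The tensors x and per_item are hypothetical stand-ins for the stacked IoU values: torch.stack() returns a vector, and backward() without arguments only works on a scalar, so the vector has to be reduced first.

```python
import torch

x = torch.randn(3, requires_grad=True)
# Stand-in for the stacked per-box values: a 1-D tensor, not a scalar.
per_item = torch.stack([v * 2 for v in x])

message = ""
try:
    per_item.backward()  # no gradient argument given for a vector output
except RuntimeError as err:
    message = str(err)
print(message)

# Reducing to a scalar first makes the implicit gradient well defined.
loss = per_item.mean()
loss.backward()
print(x.grad)  # each entry is 2/3
```

Any reduction to a scalar (mean, sum) works; which one is appropriate depends on how the loss should scale with batch size.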

Chou_Jackson · January 10, 2020, 2:15pm

You can try GIoU loss or DIoU loss for box regression.
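As a rough illustration of the GIoU suggestion, the sketch below uses the standard definition GIoU = IoU - (C - U) / C, where U is the union area and C is the area of the smallest box enclosing both inputs. The function name giou_loss and the (top, left, bottom, right) layout are assumptions carried over from the question, and boxes are assumed to be well formed (bottom > top, right > left). Unlike plain IoU, the enclosing-box term appears to give a non-zero gradient even when the boxes do not overlap.

```python
import torch

def giou_loss(pred, target):
    # pred, target: (N, 4) tensors as (top, left, bottom, right).
    inter_h = (torch.min(pred[:, 2], target[:, 2])
               - torch.max(pred[:, 0], target[:, 0])).clamp(min=0)
    inter_w = (torch.min(pred[:, 3], target[:, 3])
               - torch.max(pred[:, 1], target[:, 1])).clamp(min=0)
    inter = inter_h * inter_w

    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    union = area_p + area_t - inter

    # Smallest axis-aligned box that encloses both boxes.
    enc_h = torch.max(pred[:, 2], target[:, 2]) - torch.min(pred[:, 0], target[:, 0])
    enc_w = torch.max(pred[:, 3], target[:, 3]) - torch.min(pred[:, 1], target[:, 1])
    enclose = enc_h * enc_w

    giou = inter / union - (enclose - union) / enclose
    return (1 - giou).mean()

# A pair of boxes with no overlap at all.
pred = torch.tensor([[0., 0., 1., 1.]], requires_grad=True)
target = torch.tensor([[5., 5., 6., 6.]])

loss = giou_loss(pred, target)
loss.backward()
print(loss.item())
print(pred.grad)  # non-zero although the IoU itself is 0
```

A production implementation would also guard against degenerate boxes (zero union or zero enclosing area), which this sketch leaves out.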