Hi all,
I wrote the following code to calculate the IoU loss, but the gradients come out as None. I suspect the problem is in the first line of the forward function (creating a new tensor out), but I am unable to resolve it. Please help me fix it.
Thanks

def intersection(a, b):
    '''
    input: 2 boxes (a, b)
    output: overlapping area, if any
    '''
    top = max(a[0], b[0])
    left = max(a[1], b[1])
    bottom = min(a[2], b[2])
    right = min(a[3], b[3])
    h = max(bottom - top, 0)
    w = max(right - left, 0)
    return h * w

def union(a, b):
    a_area = (a[2] - a[0]) * (a[3] - a[1])
    b_area = (b[2] - b[0]) * (b[3] - b[1])
    return a_area + b_area - intersection(a, b)

def iou(a, b):
    '''
    input: 2 boxes (a, b)
    output: Intersection / Union
    '''
    U = union(a, b).float()
    if U == 0:
        return 0
    out = intersection(a, b) / U
    return out

class IOU_Loss(nn.Module):
    def __init__(self):
        super(IOU_Loss, self).__init__()

    def forward(self, pred, target):
        out = torch.tensor([iou(pred[i], target[i]) for i in range(target.size()[0])])
        out = 1 - out.mean()
        out.requires_grad = True
        return out

l = IOU_Loss()
x = torch.randn(2, 4)
x.requires_grad = True
y = torch.randn(2, 4)
loss = l(x, y)
loss.backward()
print(x.grad)  # outputs None

You should not modify the requires_grad field during the forward pass.
If it is False while it was True for the inputs, it means that you performed non-differentiable operations, so gradients cannot be computed.

In particular, if you write the IoU loss as a mathematical function, I seem to remember that it is non-differentiable (or gives gradients of 0 everywhere).
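To see the "gradients of 0 everywhere" case concretely: when two boxes do not overlap, the zero floor on the intersection width and height saturates, so the IoU is constant at 0 in a neighbourhood of the coordinates and every gradient is 0. A minimal sketch (using torch.clamp for the zero floor; the box values here are made up for illustration):

```python
import torch

a = torch.tensor([0., 0., 1., 1.], requires_grad=True)  # unit box
b = torch.tensor([2., 2., 3., 3.])                      # disjoint box

# intersection with a zero floor: saturated at 0 for disjoint boxes
h = torch.clamp(torch.min(a[2], b[2]) - torch.max(a[0], b[0]), min=0)
w = torch.clamp(torch.min(a[3], b[3]) - torch.max(a[1], b[1]), min=0)
inter = h * w

union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
iou = inter / union
iou.backward()
print(a.grad)  # tensor([0., 0., 0., 0.]) -- the clamp kills the gradient
```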

Hi @albanD, when I remove out.requires_grad=True, I get the error: RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn.

I think IoU is differentiable, since it involves only basic arithmetic operations (multiplication, addition, subtraction). Is there a problem with the out tensor (I guess the graph is breaking there)? Am I correct?

So the "no gradient" comes from the fact that you pack the values into a new torch.tensor(). If you use torch.stack() to create the Tensor with all the iou values, it will backprop fine.
But the gradient will still be 0 everywhere.
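A minimal sketch of that change (two details are adjusted so the sketch runs: torch.clamp replaces the Python builtin max for the zero floor, and the empty-union branch returns a zero tensor so torch.stack only ever receives tensors; the box values are made up for illustration):

```python
import torch
import torch.nn as nn

def intersection(a, b):
    top = torch.max(a[0], b[0])
    left = torch.max(a[1], b[1])
    bottom = torch.min(a[2], b[2])
    right = torch.min(a[3], b[3])
    h = torch.clamp(bottom - top, min=0)
    w = torch.clamp(right - left, min=0)
    return h * w

def union(a, b):
    a_area = (a[2] - a[0]) * (a[3] - a[1])
    b_area = (b[2] - b[0]) * (b[3] - b[1])
    return a_area + b_area - intersection(a, b)

def iou(a, b):
    U = union(a, b)
    if U == 0:
        # zero tensor instead of the int 0, so torch.stack below works
        return torch.zeros((), dtype=a.dtype)
    return intersection(a, b) / U

class IOULoss(nn.Module):
    def forward(self, pred, target):
        # torch.stack keeps each iou value connected to the autograd graph,
        # unlike torch.tensor(...), which copies the data and detaches it
        out = torch.stack([iou(pred[i], target[i]) for i in range(target.size(0))])
        return 1 - out.mean()

l = IOULoss()
x = torch.tensor([[0., 0., 2., 2.], [0., 0., 1., 1.]], requires_grad=True)
y = torch.tensor([[1., 1., 3., 3.], [0., 0., 1., 1.]])
loss = l(x, y)
loss.backward()
print(x.grad)  # no longer None
```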

I am not very familiar with such work unfortunately.
But I'm sure you can find some ideas by searching for smoothed IoU online or by looking at how other detection papers do this (a regression task on the bounding box boundaries, if I recall correctly).
Maybe @fmassa has some insight on the state-of-the-art way to do this?

I have added a smoothness term and now I am getting non-zero gradients. But when I use torch.stack() to create the tensor, the following error comes up: RuntimeError: grad can be implicitly created only for scalar outputs. Why is that?
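For anyone hitting the same message: backward() can only be called without arguments on a scalar (0-dim) tensor, so this error usually means the mean()/sum() reduction was dropped somewhere before calling backward(). A minimal reproduction:

```python
import torch

x = torch.randn(3, requires_grad=True)
out = x * 2  # out is a 1-D tensor, not a scalar

try:
    out.backward()  # implicit grad only exists for scalar outputs
except RuntimeError as e:
    print(e)  # grad can be implicitly created only for scalar outputs

out.sum().backward()  # reduce to a scalar first, then backward works
print(x.grad)  # tensor([2., 2., 2.])
```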