Hi all,
I wrote the following code to calculate the IoU loss, but the gradients come out as None. I suspect the problem is in the first line of the forward function (creating a new tensor out), but I am unable to resolve it. Please help me fix it.
Thanks

def intersection(a, b):
    '''
    input: 2 boxes (a, b)
    output: overlapping area, if any
    '''
    top = max(a[0], b[0])
    left = max(a[1], b[1])
    bottom = min(a[2], b[2])
    right = min(a[3], b[3])
    h = max(bottom - top, 0)
    w = max(right - left, 0)
    return h * w

def union(a, b):
    a_area = (a[2] - a[0]) * (a[3] - a[1])
    b_area = (b[2] - b[0]) * (b[3] - b[1])
    return a_area + b_area - intersection(a, b)

def iou(a, b):
    '''
    input: 2 boxes (a, b)
    output: Intersection / Union
    '''
    U = union(a, b).float()
    if U == 0:
        return 0
    out = intersection(a, b) / U
    return out

class IOU_Loss(nn.Module):
    def __init__(self):
        super(IOU_Loss, self).__init__()

    def forward(self, pred, target):
        out = torch.tensor([iou(pred[i], target[i]) for i in range(target.size()[0])])
        out = 1 - out.mean()
        out.requires_grad = True
        return out

l = IOU_Loss()
x = torch.randn(2, 4)
x.requires_grad = True
y = torch.randn(2, 4)
loss = l(x, y)
loss.backward()
print(x.grad)  # outputs None

You should not modify the requires_grad field during the forward pass.
If it is False while it was True for the inputs, it means that you performed non-differentiable operations, so gradients cannot be computed.

In particular, if you write the IoU loss as a mathematical function, I seem to remember that it is non-differentiable (or gives gradients of 0 everywhere).
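To see the "gradients of 0 everywhere" case concretely: when two boxes do not overlap, the zero floor on the intersection width and height saturates, so the IoU is constant at 0 in a neighbourhood of the coordinates and every gradient is 0. A minimal sketch (using torch.clamp for the zero floor; the box values here are made up for illustration):

```python
import torch

a = torch.tensor([0., 0., 1., 1.], requires_grad=True)  # unit box
b = torch.tensor([2., 2., 3., 3.])                      # disjoint box

# intersection with a zero floor: saturated at 0 for disjoint boxes
h = torch.clamp(torch.min(a[2], b[2]) - torch.max(a[0], b[0]), min=0)
w = torch.clamp(torch.min(a[3], b[3]) - torch.max(a[1], b[1]), min=0)
inter = h * w

union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
iou = inter / union
iou.backward()
print(a.grad)  # tensor([0., 0., 0., 0.]) -- the clamp kills the gradient
```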

Hi @albanD, when I remove out.requires_grad=True, I get the error: RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn.

I think IoU is differentiable, since it involves only basic arithmetic operations (multiplication, addition, subtraction). Is there a problem with the out tensor (I guess the graph is breaking there)? Am I correct?

So the "no gradient" comes from the fact that you pack the values into a new torch.tensor(). If you use torch.stack() to create the Tensor with all the iou values, it will backprop fine.
But the gradient will still be 0 everywhere.
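A minimal sketch of that change (two details are adjusted so the sketch runs: torch.clamp replaces the Python builtin max for the zero floor, and the empty-union branch returns a zero tensor so torch.stack only ever receives tensors; the box values are made up for illustration):

```python
import torch
import torch.nn as nn

def intersection(a, b):
    top = torch.max(a[0], b[0])
    left = torch.max(a[1], b[1])
    bottom = torch.min(a[2], b[2])
    right = torch.min(a[3], b[3])
    h = torch.clamp(bottom - top, min=0)
    w = torch.clamp(right - left, min=0)
    return h * w

def union(a, b):
    a_area = (a[2] - a[0]) * (a[3] - a[1])
    b_area = (b[2] - b[0]) * (b[3] - b[1])
    return a_area + b_area - intersection(a, b)

def iou(a, b):
    U = union(a, b)
    if U == 0:
        # zero tensor instead of the int 0, so torch.stack below works
        return torch.zeros((), dtype=a.dtype)
    return intersection(a, b) / U

class IOULoss(nn.Module):
    def forward(self, pred, target):
        # torch.stack keeps each iou value connected to the autograd graph,
        # unlike torch.tensor(...), which copies the data and detaches it
        out = torch.stack([iou(pred[i], target[i]) for i in range(target.size(0))])
        return 1 - out.mean()

l = IOULoss()
x = torch.tensor([[0., 0., 2., 2.], [0., 0., 1., 1.]], requires_grad=True)
y = torch.tensor([[1., 1., 3., 3.], [0., 0., 1., 1.]])
loss = l(x, y)
loss.backward()
print(x.grad)  # no longer None
```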

I am not very familiar with such work unfortunately.
But I'm sure you can find some ideas by searching for smoothed IoU online or by looking at how other detection papers do this (a regression task on the bounding box boundaries, if I recall correctly).
Maybe @fmassa has some insight on the state-of-the-art way to do this?

I have added a smoothness term and now I am getting non-zero gradients. But when I use torch.stack() to create the tensor, the following error comes up: RuntimeError: grad can be implicitly created only for scalar outputs. Why is that?
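For anyone hitting the same message: backward() can only be called without arguments on a scalar (0-dim) tensor, so this error usually means the mean()/sum() reduction was dropped somewhere before calling backward(). A minimal reproduction:

```python
import torch

x = torch.randn(3, requires_grad=True)
out = x * 2  # out is a 1-D tensor, not a scalar

try:
    out.backward()  # implicit grad only exists for scalar outputs
except RuntimeError as e:
    print(e)  # grad can be implicitly created only for scalar outputs

out.sum().backward()  # reduce to a scalar first, then backward works
print(x.grad)  # tensor([2., 2., 2.])
```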