Calling Backward in a Custom Backward

Hello,
I am writing a custom loss function and implementing the backward pass myself. Here is a basic sketch of my issue, with an explanation.

import torch
from torch.autograd import Variable, Function

class Loss2(torch.nn.Module):
    def __init__(self):
        ...

    def forward(self, Input1, Input2):
        ...

class Loss1(Function):
    def __init__(self):
        ...

    def forward(ctx, Input1, Input2):
        ...

    def backward(ctx, gradOutput):
        # The problem occurs below (a, b are tensors available at this point):
        a = Variable(a, requires_grad=True)
        b = Variable(b, requires_grad=False)
        criterion = Loss2()
        LossVal2 = criterion(a, b)
        LossVal2.backward()
        print(a.grad)  # This is always None

I want to call another loss function (i.e., Loss2) in the backward of Loss1. Loss2 is also a custom torch.nn.Module, but all of its operations are covered by autograd, so I only implemented the forward pass for Loss2. However, I cannot get the gradients of Loss2 with respect to its inputs: the gradient with respect to a is always None.

By the way, I previously used Loss2 as an independent loss function, and it worked fine on its own. Maybe I am missing something fundamental about how this should be implemented in PyTorch. Any help is appreciated.

Hi,

Please check this section of the docs explaining how to create custom functions. In particular, new-style functions use static methods and no __init__.
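Something along these lines (a minimal sketch with a toy squared-error loss, just to show the static-method structure; the actual computation is up to you):

import torch
from torch.autograd import Function

class MyLoss(Function):
    @staticmethod
    def forward(ctx, input1, input2):
        # save whatever the backward pass will need
        ctx.save_for_backward(input1, input2)
        return (input1 - input2).pow(2).mean()

    @staticmethod
    def backward(ctx, grad_output):
        input1, input2 = ctx.saved_tensors
        # d/d(input1) of mean((input1 - input2)^2)
        grad1 = grad_output * 2 * (input1 - input2) / input1.numel()
        # return one gradient per forward input, in the same order
        return grad1, -grad1

x = torch.randn(2, 4, requires_grad=True)
y = torch.randn(2, 4)
MyLoss.apply(x, y).backward()  # call it through .apply, not by instantiating the class
print(x.grad)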

If you still have problems, please post your full code so that we can look into it!

Hi,

I modified my code as you suggested, but I still have the error.

The code is very long, but I think the following segment should be sufficient. When I run it as shown below, it works without a problem and I can see box_a.grad.

import torch
from torch.autograd import Variable

box_a = Variable(torch.tensor([[0.8576, 0.1218, 0.9537, 0.4391],
                               [0.2763, 0.0282, 0.5718, 0.9983]]).cuda(), requires_grad=True)
box_b = Variable(torch.tensor([[0.8560, 0.0853, 0.9980, 0.4480],
                               [0.0880, 0.1360, 0.6380, 0.9973]]).cuda(), requires_grad=False)
# intersection area
max_xy = torch.min(box_a[:, 2:].unsqueeze(0), box_b[:, 2:].unsqueeze(0))
min_xy = torch.max(box_a[:, :2].unsqueeze(0), box_b[:, :2].unsqueeze(0))
interside = torch.clamp((max_xy - min_xy), min=0)
i = interside[:, :, 0] * interside[:, :, 1]
# union area
area_a = ((box_a[:, 2] - box_a[:, 0]) * (box_a[:, 3] - box_a[:, 1])).unsqueeze(0)
area_b = ((box_b[:, 2] - box_b[:, 0]) * (box_b[:, 3] - box_b[:, 1])).unsqueeze(0)
u = area_a + area_b - i
IoU = i / u
l = IoU.mean()
l.backward()
print("grad", box_a.grad)

I have the following gradients:

grad tensor([[-3.0804, -0.9329,  3.0804,  0.9329],
        [-0.7972,  0.1469,  0.7972, -0.1469]], device='cuda:0')

On the other hand, when I put the same code segment inside the backward pass of another loss function, it fails. Please see the code below:

import torch
from torch.autograd import Variable, Function

class LossFunction(Function):
    @staticmethod
    def forward(ctx, arg1, arg2):
        # Some code here, no problem
        return loss

    @staticmethod
    def backward(ctx, gradOutput):
        # Some code here, no problem
        # Same code block as above
        box_a = Variable(torch.tensor([[0.8576, 0.1218, 0.9537, 0.4391],
                                       [0.2763, 0.0282, 0.5718, 0.9983]]).cuda(), requires_grad=True)
        box_b = Variable(torch.tensor([[0.8560, 0.0853, 0.9980, 0.4480],
                                       [0.0880, 0.1360, 0.6380, 0.9973]]).cuda(), requires_grad=False)
        max_xy = torch.min(box_a[:, 2:].unsqueeze(0), box_b[:, 2:].unsqueeze(0))
        min_xy = torch.max(box_a[:, :2].unsqueeze(0), box_b[:, :2].unsqueeze(0))
        interside = torch.clamp((max_xy - min_xy), min=0)
        i = interside[:, :, 0] * interside[:, :, 1]
        area_a = ((box_a[:, 2] - box_a[:, 0]) * (box_a[:, 3] - box_a[:, 1])).unsqueeze(0)
        area_b = ((box_b[:, 2] - box_b[:, 0]) * (box_b[:, 3] - box_b[:, 1])).unsqueeze(0)
        u = area_a + area_b - i
        IoU = i / u
        l = IoU.mean()
        l.backward()
        print("grad", box_a.grad)

Initially, I got the following error:
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

So I thought that setting the requires_grad flag on intermediate variables such as i, u, etc. would solve it, but it did not. Now I do not get the error, but the gradient is still None.
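In case it helps with the diagnosis, I also put together a tiny probe (a toy function, unrelated to my actual loss) to check whether autograd records the operations that run inside a custom backward at all:

import torch
from torch.autograd import Function

class Probe(Function):
    @staticmethod
    def forward(ctx, x):
        return x * 2

    @staticmethod
    def backward(ctx, grad_output):
        # is autograd recording operations executed inside this backward?
        print("grad enabled inside backward:", torch.is_grad_enabled())
        return grad_output * 2

x = torch.randn(3, requires_grad=True)
Probe.apply(x).sum().backward()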
I hope the issue is clearer now.