Hey everyone!
I’ve been stuck on a problem with the gradient check tool for a while now.
Here is a small example. I’m trying to build some intuition for backpropagation through a function with multiple inputs and outputs, but I already run into trouble with this toy example.
The code below runs and the test passes.
import torch
from torch.autograd import gradcheck

# simple multiplication & addition with two outputs
class mult_add_mult_out(torch.autograd.Function):
    @staticmethod
    def forward(ctx, a, b):
        ctx.save_for_backward(a, b)
        out_1 = a * b  # d_a = b; d_b = a
        out_2 = a + b  # d_a = 1; d_b = 1
        return out_1, out_2

    @staticmethod
    def backward(ctx, grad_at_out_1, grad_at_out_2):
        a, b = ctx.saved_tensors
        return grad_at_out_1 * b + grad_at_out_2, grad_at_out_2 * a + grad_at_out_1

my_mult_add_mult_out = mult_add_mult_out.apply

a1 = torch.Tensor([3]).double()
a1.requires_grad = True
b1 = torch.Tensor([2]).double()
b1.requires_grad = False

# print(my_mult_add_mult_out(a1, b1))
test_mult_add_mult_out = gradcheck(my_mult_add_mult_out, (a1, b1))  # grad_b
print('Test mult_add_mult_out: ' + str(test_mult_add_mult_out))
However, if I switch the requires_grad flags to
a1 = torch.Tensor([3]).double()
a1.requires_grad = False
b1 = torch.Tensor([2]).double()
b1.requires_grad = True
this test fails.
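As far as I understand it, gradcheck compares the analytic gradients returned by backward against central finite-difference estimates of the Jacobian. Here is a minimal sketch of that numeric estimate for the b-column, in plain Python with the same values a = 3, b = 2 (my own sketch of what I think gradcheck does, not its actual implementation):

```python
# Central finite-difference estimate: f'(x) ~ (f(x+eps) - f(x-eps)) / (2*eps)
def numeric_grad(f, x, eps=1e-6):
    return (f(x + eps) - f(x - eps)) / (2 * eps)

a = 3.0

# d(a*b)/db at b=2 -- should come out close to a = 3.0
g_out_1 = numeric_grad(lambda b: a * b, 2.0)

# d(a+b)/db at b=2 -- should come out close to 1.0
g_out_2 = numeric_grad(lambda b: a + b, 2.0)

print(g_out_1, g_out_2)
```

If my understanding is right, these two numbers are the entries of the Jacobian column for b that gradcheck compares against what backward returns.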
Either the gradient-check engine is wrong (which I doubt…) or I simply made a mistake when computing the gradients by hand (which is much more likely…).
Any hints would be appreciated!
Thank you