Hey everyone!
I’ve been stuck on a problem with the gradient check tool for a while now.
Here is a small example. I’m trying to build some intuition for backpropagation through a function with multiple inputs and outputs, but I already run into trouble with this toy example.
The code below runs and the test passes.
import torch
from torch.autograd import gradcheck

# simple multiplication & addition with two outputs
class mult_add_mult_out(torch.autograd.Function):
    @staticmethod
    def forward(ctx, a, b):
        ctx.save_for_backward(a, b)
        out_1 = a * b  # d_a = b; d_b = a
        out_2 = a + b  # d_a = 1; d_b = 1
        return out_1, out_2

    @staticmethod
    def backward(ctx, grad_at_out_1, grad_at_out_2):
        a, b = ctx.saved_tensors
        return grad_at_out_1 * b + grad_at_out_2, grad_at_out_2 * a + grad_at_out_1

my_mult_add_mult_out = mult_add_mult_out.apply

a1 = torch.Tensor([3]).double()
a1.requires_grad = True
b1 = torch.Tensor([2]).double()
b1.requires_grad = False

# print(my_mult_add_mult_out(a1, b1))
test_mult_add_mult_out = gradcheck(my_mult_add_mult_out, (a1, b1))  # grad_b
print('Test mult_add_mult_out: ' + str(test_mult_add_mult_out))
However, if I switch the requires_grad flags to
a1 = torch.Tensor([3]).double()
a1.requires_grad = False
b1 = torch.Tensor([2]).double()
b1.requires_grad = True
this test fails.
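As far as I understand it, gradcheck compares the analytic gradients returned by backward against central finite-difference estimates of the Jacobian. Here is a minimal sketch of that numeric estimate for the b-column, in plain Python with the same values a = 3, b = 2 (my own sketch of what I think gradcheck does, not its actual implementation):

```python
# Central finite-difference estimate: f'(x) ~ (f(x+eps) - f(x-eps)) / (2*eps)
def numeric_grad(f, x, eps=1e-6):
    return (f(x + eps) - f(x - eps)) / (2 * eps)

a = 3.0

# d(a*b)/db at b=2 -- should come out close to a = 3.0
g_out_1 = numeric_grad(lambda b: a * b, 2.0)

# d(a+b)/db at b=2 -- should come out close to 1.0
g_out_2 = numeric_grad(lambda b: a + b, 2.0)

print(g_out_1, g_out_2)
```

If my understanding is right, these two numbers are the entries of the Jacobian column for b that gradcheck compares against what backward returns.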
Either the gradient-check engine is wrong (which I doubt…) or I simply made a mistake when computing the gradients by hand (which is much more likely…).
Any hints would be appreciated!
Thank you