[SOLVED] Extra Factor when Calculating Gradient

I am trying to do a simple gradient calculation. The true function is:

$y = \theta_0^2 x^2 + \sqrt{\theta_1}\,x + \theta_2$

and the goal is to calculate the gradient of y with respect to the thetas.

import torch
from torch.autograd import Variable

theta = Variable(torch.tensor([3.0, 2.0, 1.0]), requires_grad=True)
x     = Variable(torch.tensor([1.0]), requires_grad=False)
y       = theta[0]**2 * x**2 + torch.sqrt(theta[1]) * x + theta[2]
y.backward(torch.ones(3))
print(theta.grad)

The output is [18.0000, 1.0607, 3.0000]. Any idea why I have an extra factor of 3? I was expecting [6.0000, 0.3536, 1.0000].
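For reference, the analytic gradient at $\theta = (3, 2, 1)$ and $x = 1$ is $\partial y / \partial \theta_0 = 2\theta_0 x^2 = 6$, $\partial y / \partial \theta_1 = x / (2\sqrt{\theta_1}) = 1/(2\sqrt{2}) \approx 0.3536$, and $\partial y / \partial \theta_2 = 1$, which is where my expected values come from.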

Thanks,
Arrykrishna

I think the issue here is that you are incorrectly passing torch.ones(3) to your .backward call, since y is a scalar value.
On PyTorch 0.4.1 your code gives an error, but using either y.backward() or y.backward(torch.ones(1)) works.
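
For example, a minimal sketch of the corrected call on a recent PyTorch, where the Variable wrapper is deprecated and plain tensors work:

import torch

theta = torch.tensor([3.0, 2.0, 1.0], requires_grad=True)
x = torch.tensor([1.0])  # no requires_grad needed, we only want d y / d theta

y = theta[0]**2 * x**2 + torch.sqrt(theta[1]) * x + theta[2]
y.backward()  # y has a single element, so no gradient argument is required
print(theta.grad)  # tensor([6.0000, 0.3536, 1.0000])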
Pure speculation: maybe your version of PyTorch handles the tensor size mismatch by effectively calling backward on 1*y + 1*y + 1*y, which would accumulate each gradient three times and explain the factor of 3.
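
You can reproduce that hypothesis explicitly: broadcasting y to three copies and backpropagating ones(3) accumulates every gradient three times, which matches your output exactly.

import torch

theta = torch.tensor([3.0, 2.0, 1.0], requires_grad=True)
x = torch.tensor([1.0])

y = theta[0]**2 * x**2 + torch.sqrt(theta[1]) * x + theta[2]
y.expand(3).backward(torch.ones(3))  # three copies of y, each weighted by 1.0
print(theta.grad)  # tensor([18.0000, 1.0607, 3.0000]), three times the true gradient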

Thanks ptab for the quick reply.

Using either y.backward() or y.backward(torch.ones(1)) works for me too.