Following is a simplified version of my original code. Basically, the code does two things:

(1) optimize the objective function using gradient descent;
(2) control the gradients computed in (1) through a neural network.
```python
import torch
from torch.autograd import Variable

a = Variable(torch.Tensor([-2.]), requires_grad=True)
b = Variable(torch.Tensor([-2.]), requires_grad=True)
z = 2 * x + 10                # x: input data, defined elsewhere
nn = Control_Variate()        # a simple neural network
var_opt = torch.optim.Adam(nn.parameters(), lr=0.1)  # notice params are neural network weights

for i in range(1000):
    f_z = nn(a * x + b)
    der_shape = torch.ones(f_z.size())
    loss = torch.mean((f_z - z) ** 2)
    loss.backward(der_shape)
    gr_a = a.grad.clone()
    a.grad.data.zero_()
    gr_b = b.grad.clone()
    b.grad.data.zero_()
    g_a = Adam_Optim(gr_a)    # my own Adam optimizer
    g_b = Adam_Optim(gr_b)
    b.data.sub_(g_b)          # update a and b
    a.data.sub_(g_a)

    # now, do the variance control
    var_opt.zero_grad()
    var_loss = torch.mean((gr_a + gr_b - 5) ** 2)  # control the variance of the weight updates
    var_loss.backward()
    var_opt.step()

print("done")
```
The problem I have is that at

```python
var_loss.backward()
```

it fails with

```
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
```

But this `backward()` should differentiate w.r.t. the neural network weights, and `gr_a` and `gr_b` are functions of my variables `a` and `b`, which I declared with `requires_grad=True`. So I'm quite confused about why I receive this error message.
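Here is a minimal reproduction of the same error, stripped of my network and optimizer (the `3 * a` loss is just a stand-in objective for illustration). It suggests the issue is with cloning `.grad` itself rather than with anything specific to my setup:

```python
import torch

# a leaf variable that requires grad, as in my code
a = torch.tensor([-2.], requires_grad=True)

# any backward pass populates a.grad
loss = (3 * a).sum()
loss.backward()

# a.grad is a plain tensor: it records no history of how it was computed
g = a.grad.clone()
print(g.requires_grad)   # False: no grad_fn attached

# so building a loss from it and calling backward() raises the same error
try:
    torch.mean((g - 5) ** 2).backward()
except RuntimeError as e:
    print(e)             # element 0 of tensors does not require grad ...
```

So even though `a` requires grad, the cloned gradient tensor apparently does not carry a `grad_fn`, which seems to be what triggers the error in my full code.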