Gradients are None Policy Gradient

I want to update my PolicyNet network based on a loss computed from two variables that are not on the graph. Any idea how I can update PolicyNet when the loss does not require grad? Thanks.

actions = PolicyNet.forward(state)
loss = criterion(a, b)  # a and b have requires_grad=False
loss = Variable(loss, requires_grad=True)
loss.backward()
print(list(PolicyNet.parameters())[0].grad)
optimizer.step()

Hi,

You should not call the .forward() function of Modules directly; just call them like: actions = PolicyNet(state).

If your PolicyNet’s params have requires_grad=True, then actions will also have requires_grad=True, and that will propagate all the way to the loss.
So you want to make sure that the PolicyNet params do require grad and that you don’t call .detach() (or use .data) on anything during the loss computation.
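
For concreteness, here is a minimal runnable sketch of that pattern, with a toy linear net and a hypothetical target standing in for your PolicyNet, state and criterion:

import torch
import torch.nn as nn

# Toy stand-ins for PolicyNet / state / criterion from the original post
policy_net = nn.Linear(4, 2)
optimizer = torch.optim.SGD(policy_net.parameters(), lr=0.01)
criterion = nn.MSELoss()
state = torch.randn(1, 4)
target = torch.zeros(1, 2)  # hypothetical target, just for illustration

actions = policy_net(state)        # call the module, not .forward()
loss = criterion(actions, target)  # loss inherits requires_grad from actions
optimizer.zero_grad()
loss.backward()
print(list(policy_net.parameters())[0].grad)  # a real gradient, not None
optimizer.step()

No wrapping of the loss in a new Variable is needed here: because actions is on the graph, the loss requires grad automatically.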

Hi, thanks for responding. Within RegressNet I have to return the loss as a Variable with requires_grad=True, or else I get “RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn”. Using this Variable loss breaks the graph; it works when I just use the actions of PolicyNet directly. I am breaking the computation graph inside RegressNet with a criterion between two variables that do not have grad. I edited my post.

Doing this is 100% wrong. You break the graph and create a new Tensor with no history. So your backward will just stop at loss.

If you properly use the actions that require grad to compute the loss, then the loss will require gradients naturally and everything will work.
You need to make sure you don’t do any op that breaks the link.
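
To make that concrete, here is a toy illustration, using torch.tensor(loss.item(), requires_grad=True) as the modern equivalent of wrapping the loss in a new Variable:

import torch
import torch.nn as nn

net = nn.Linear(3, 1)
loss = net(torch.randn(1, 3)).sum()

# Rewrapping creates a fresh leaf tensor with no history:
broken = torch.tensor(loss.item(), requires_grad=True)
broken.backward()
print(net.weight.grad)  # None: backward stopped at `broken`

# The original loss still carries the graph back to the net:
loss.backward()
print(net.weight.grad)  # a real gradient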

Thanks. The variables a and b that I compute the loss from are outputs of the eval network, so when I try to call .backward() I get the error about no grad or grad_fn.

Right,
But do a and b depend on actions? If so, they should require gradients.

They depend on actions and are inputs to an eval network, so I do not know how to make them require gradients and pass them through the eval network at the same time.

If your eval network is just a net whose params have requires_grad=False, that won’t prevent the output from requiring gradients if the input does. So if the input to the eval net requires grad, then the output will as well.
And if there is a function that computes these values based on the actions, then they should require grad, unless you detach or unpack the Variable (via .data), which you shouldn’t do.
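
As a sketch of that behaviour, assuming the eval network is an ordinary module with its params frozen (the net names here are hypothetical stand-ins for your PolicyNet and eval network):

import torch
import torch.nn as nn

policy_net = nn.Linear(4, 2)
eval_net = nn.Linear(2, 1)
for p in eval_net.parameters():
    p.requires_grad = False  # frozen eval network

actions = policy_net(torch.randn(1, 4))  # requires grad via policy params
value = eval_net(actions)                # still requires grad
value.sum().backward()

print(list(policy_net.parameters())[0].grad)  # gradients reach PolicyNet
print(eval_net.weight.grad)                   # None: eval net stays frozen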
