Here, mean_energy and mean_decision are the outputs of my NN and are restricted to positive values (so I don’t run into problems with the log).
The gradient is always None.
I have already checked that requires_grad = True and is_leaf = True for the loss tensor as well.
I calculate the gradients of the weights manually (so I can update them that way) after calling loss.backward() with:

gradient_list = []
for p in self.model.parameters():
    gradient_list.append(p.grad)
Does anybody have an idea how to troubleshoot this? Thanks in advance!
advantage.detach() “breaks the computation graph,” and loss.requires_grad = True doesn’t repair the damage.
Regardless of whether your code can be tweaked to (appear to) work, loss must be usefully differentiable with respect to the parameters you are trying to train in order for backpropagation and training to work.
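To make the failure mode concrete, here is a minimal sketch (with made-up tensors, not your actual model) of what .detach() does to the graph, and why flipping requires_grad back on afterwards doesn’t help:

import torch

w = torch.randn(3, requires_grad=True)   # stand-in for a model parameter
advantage = (w * 2).sum()                # depends on w through the graph

loss = advantage * 1.0                   # graph intact
loss.backward()
print(w.grad)                            # tensor([2., 2., 2.])

w.grad = None
loss = advantage.detach() * 1.0          # .detach() cuts the graph here
loss.requires_grad = True                # loss becomes a new, disconnected leaf
loss.backward()                          # gradient accumulates into loss itself...
print(w.grad)                            # ...but w.grad stays None

In the second case loss is a brand-new leaf with no history, so backward() has nothing to propagate back to w.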
Thanks for the reply! It turns out I don’t need loss.requires_grad = True or the .detach() call on the advantage function for the loss to have grad enabled.
But now I receive the error message:
RuntimeError: Trying to backward through the graph a second time (or directly access saved variables after they have already been freed). Saved intermediate values of the graph are freed when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to backward through the graph a second time or if you need to access saved variables after calling backward.
Seems like backpropagation goes through my network twice?
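That error usually means .backward() is being called more than once through (part of) the same graph, for example when an actor loss and a critic loss are built from shared tensors and backpropagated separately. A minimal sketch, with made-up losses, of how this arises and two common ways around it:

import torch

w = torch.randn(2, requires_grad=True)
shared = (w ** 2).sum()            # intermediate node used by both losses

actor_loss = shared * 3.0
critic_loss = shared * 0.5

actor_loss.backward()              # frees the saved values under `shared`
# critic_loss.backward()           # would raise: "Trying to backward through the graph a second time"

# Fix 1: combine the losses and call backward once
w.grad = None
shared = (w ** 2).sum()
(shared * 3.0 + shared * 0.5).backward()

# Fix 2: keep the graph alive on the first call
# actor_loss.backward(retain_graph=True)

This can also happen when a tensor computed in a previous iteration is reused in the next loss, so check that each loss is built from freshly computed tensors.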
I’ve also tried something else: removing the advantage function (which is of course not correct for actor-critic). Then, for some reason, the loss does not have grad enabled, and forcing requires_grad = True results in the gradients of my weights being zero again.
is_leaf is True again, though.
It appears that .backward() does not work with the MultivariateNormal distribution… when I use Normal instead for just one parameter, I can calculate the gradient.
I believe that MultivariateNormal and Normal have the same behavior in this regard.
In general, you can’t differentiate or backpropagate through calling .sample() on a Distribution. This is true for Normal as well as MultivariateNormal. In contrast, you typically can backpropagate through non-sampling methods such as .log_prob().
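Here is a minimal sketch of that distinction, including .rsample(), which gives you a reparameterized (differentiable) sample if you actually need gradients to flow through the sampling step:

import torch
from torch.distributions import MultivariateNormal

mean = torch.zeros(2, requires_grad=True)
cov = torch.eye(2)
dist = MultivariateNormal(mean, cov)

action = dist.sample()             # sampled under no_grad: no grad_fn
print(action.requires_grad)        # False

log_prob = dist.log_prob(action)   # differentiable w.r.t. mean
log_prob.backward()
print(mean.grad)                   # populated

# Reparameterization trick: sample = mean + scale_tril @ eps, differentiable
sample = dist.rsample()
print(sample.requires_grad)        # True

For the usual policy-gradient / actor-critic loss, the differentiable term you want is dist.log_prob(action) (weighted by the advantage), not the sampled action itself.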