I need some help; I have been struggling with this for days.
I am trying to build a DDPG agent in PyTorch. The model uses two networks, an actor and a critic. The actor's weights are optimized by calculating the gradient of the critic's output with respect to part of its input, the action (`critic_gradients`), and then merging (chaining) the negative of these gradients with the gradients of the actor's output with respect to its model parameters.
I did the following:
```python
merged_grad = torch.autograd.grad(actor_output, actor.parameters(),
                                  grad_outputs=-critic_gradients)
```
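For context, here is a minimal, self-contained sketch of how I am computing `critic_gradients` and `merged_grad` (network layer sizes and the batch size are invented for illustration):

```python
import torch
import torch.nn as nn

# Toy actor and critic; dimensions are made up for this example.
state_dim, action_dim = 4, 2
actor = nn.Sequential(nn.Linear(state_dim, 8), nn.ReLU(), nn.Linear(8, action_dim))
critic = nn.Sequential(nn.Linear(state_dim + action_dim, 8), nn.ReLU(), nn.Linear(8, 1))

state = torch.randn(16, state_dim)
actor_output = actor(state)                                   # action
q = critic(torch.cat([state, actor_output], dim=1)).sum()     # critic value

# Gradient of the critic output w.r.t. the action part of its input.
critic_gradients, = torch.autograd.grad(q, actor_output, retain_graph=True)

# Chain the negative critic gradients through the actor to its parameters.
merged_grad = torch.autograd.grad(actor_output, actor.parameters(),
                                  grad_outputs=-critic_gradients)
```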
So far so good: `merged_grad` holds the merged gradients in a tuple of tensors. Now it is time to replace the actor's parameter gradients with `merged_grad` so the optimizer can do its job, and here is the problem I cannot solve.
`merged_grad` is a tuple of tensors, and I suspect its order does not match the order in which the `actor.parameters()` iterator returns the model's parameters. When I try the loop below I get:
```python
for orig_grad in actor.parameters():
    for new_grad in merged_grad:
        orig_grad.grad = new_grad
```

which raises:

```
RuntimeError: assigned grad has data of a different size
```
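What I think I need is a one-to-one pairing between the returned gradients and the parameters, something like the sketch below (this assumes `torch.autograd.grad` returns gradients in the same order as the inputs I pass it, which is exactly what I am unsure about; the toy actor and sizes are made up):

```python
import torch
import torch.nn as nn

# Toy actor; sizes invented for illustration.
actor = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
out = actor(torch.randn(16, 4)).sum()
merged_grad = torch.autograd.grad(out, actor.parameters())

# One-to-one pairing instead of the nested loop above.
# Assumes merged_grad follows the order of actor.parameters().
for param, new_grad in zip(actor.parameters(), merged_grad):
    param.grad = new_grad
```

Is this ordering assumption actually guaranteed, or is there a safer way to write the gradients back?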
Thanks a million in advance