Hi
I am not sure this is the right way to solve my problem, but I hope it throws some light on it and helps someone in the same situation. I calculated the merged gradient one layer at a time and updated that layer's gradients one at a time.
import torch

# Map each parameter name of the local actor to its parameter tensor.
parameter_to_layer = {name: w for name, w in self.actor_local.named_parameters()}

# Compute the merged gradient one layer at a time: the gradient of the actor's actions
# w.r.t. that layer's parameters, weighted (via grad_outputs) by the negative of the
# critic's gradient of q_expected w.r.t. the actions.
for layer, param in parameter_to_layer.items():
    merged_gradient = torch.autograd.grad(actor_actions, param,
                                          grad_outputs=-critic_action_gradients,
                                          retain_graph=True)
    # Assign the merged gradient directly to this layer's .grad field.
    param.grad = merged_gradient[0]
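For context, here is a minimal sketch of how the pieces above could fit into a DDPG-style actor update. The names states, critic_local and actor_optimizer are assumptions on my side and may differ from your setup; the only parts carried over from the snippet above are actor_actions, critic_action_gradients and the per-layer merged-gradient loop.

# Hypothetical actor update step; states, critic_local and actor_optimizer are
# assumed names, not part of the original snippet.
states = ...  # batch of states, e.g. sampled from a replay buffer

# Forward pass: actions from the local actor, Q-values from the local critic.
actor_actions = self.actor_local(states)
q_expected = self.critic_local(states, actor_actions)

# Gradient of the Q-values w.r.t. the actions (dQ/da), summed over the batch.
critic_action_gradients = torch.autograd.grad(q_expected.sum(), actor_actions,
                                              retain_graph=True)[0]

# Zero old gradients, fill .grad per layer as shown above, then take an optimizer step.
self.actor_optimizer.zero_grad()
# ... per-layer merged-gradient loop from above ...
self.actor_optimizer.step()

As far as I understand, filling .grad with -dQ/da propagated through the actor this way should be equivalent to simply calling (-q_expected.mean()).backward() on the actor loss, which is the more common formulation.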
If anyone finds a problem with the above solution (besides its poor performance), please let me know.
Thanks