I need some help; I have been struggling with this for days.
I am trying to build a DDPG agent in PyTorch. The model uses two networks, an actor and a critic. The actor's weights are optimized by calculating the gradient of the critic's output with respect to part of its input, the action (`critic_gradients`), and then merging (chaining) the negative of these gradients with the gradients of the actor's output with respect to its model parameters.
I did the following:
```python
merged_grad = torch.autograd.grad(actor_output, actor.parameters(),
                                  grad_outputs=-critic_gradients)
```
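For context, here is a minimal, self-contained sketch of how I am computing `critic_gradients` and `merged_grad` (network layer sizes and the batch size are invented for illustration):

```python
import torch
import torch.nn as nn

# Toy actor and critic; dimensions are made up for this example.
state_dim, action_dim = 4, 2
actor = nn.Sequential(nn.Linear(state_dim, 8), nn.ReLU(), nn.Linear(8, action_dim))
critic = nn.Sequential(nn.Linear(state_dim + action_dim, 8), nn.ReLU(), nn.Linear(8, 1))

state = torch.randn(16, state_dim)
actor_output = actor(state)                                   # action
q = critic(torch.cat([state, actor_output], dim=1)).sum()     # critic value

# Gradient of the critic output w.r.t. the action part of its input.
critic_gradients, = torch.autograd.grad(q, actor_output, retain_graph=True)

# Chain the negative critic gradients through the actor to its parameters.
merged_grad = torch.autograd.grad(actor_output, actor.parameters(),
                                  grad_outputs=-critic_gradients)
```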
So far so good: `merged_grad` holds the merged gradients in a tuple of tensors. Now it is time to replace the actor's parameter gradients with `merged_grad` so the optimizer can do its job, and here is the problem I cannot solve.
`merged_grad` is a tuple of tensors, and I suspect its order does not match the order in which the `actor.parameters()` iterator returns the model's parameters. When I try the loop below I get:
```python
for orig_grad in actor.parameters():
    for new_grad in merged_grad:
        orig_grad.grad = new_grad
```

which raises:

```
RuntimeError: assigned grad has data of a different size
```
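What I think I need is a one-to-one pairing between the returned gradients and the parameters, something like the sketch below (this assumes `torch.autograd.grad` returns gradients in the same order as the inputs I pass it, which is exactly what I am unsure about; the toy actor and sizes are made up):

```python
import torch
import torch.nn as nn

# Toy actor; sizes invented for illustration.
actor = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
out = actor(torch.randn(16, 4)).sum()
merged_grad = torch.autograd.grad(out, actor.parameters())

# One-to-one pairing instead of the nested loop above.
# Assumes merged_grad follows the order of actor.parameters().
for param, new_grad in zip(actor.parameters(), merged_grad):
    param.grad = new_grad
```

Is this ordering assumption actually guaranteed, or is there a safer way to write the gradients back?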
Thanks a million in advance