What is the difference between .grad and ._grad?

(Quan Vuong) #1

In the most popular A3C PyTorch implementation, there’s a function ensure_shared_grads that ensures the local model and the globally shared model share gradients.

In line 18, shared_param._grad = param.grad is used to share the gradients of the local model with the shared model. However, only shared_param.grad is checked to make sure the gradients are shared in subsequent backward passes.
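For reference, the function in question looks roughly like this (paraphrased from the A3C implementation mentioned above; exact line numbers may differ):

def ensure_shared_grads(model, shared_model):
    for param, shared_param in zip(model.parameters(),
                                   shared_model.parameters()):
        # If the shared parameter already has a gradient, sharing was done before.
        if shared_param.grad is not None:
            return
        # Make the shared parameter point at the local parameter's gradient.
        shared_param._grad = param.grad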

What’s the difference between .grad and ._grad?

Thanks!

(Hugh Perkins) #2

_grad is an internal attribute that is writable from Python; you’ll notice that .grad is read-only.
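As a rough sketch of what that means in practice (assuming a PyTorch version that exposes both attributes): writing through ._grad makes the gradient visible when reading back through .grad, which is exactly what the A3C snippet relies on. The model names below are just placeholders for illustration.

import torch
import torch.nn as nn

local_model = nn.Linear(4, 2)
shared_model = nn.Linear(4, 2)

# Run a dummy forward/backward so the local parameters have gradients.
local_model(torch.randn(3, 4)).sum().backward()

for param, shared_param in zip(local_model.parameters(),
                               shared_model.parameters()):
    shared_param._grad = param.grad                     # write through the internal attribute
    assert torch.equal(shared_param.grad, param.grad)   # now readable via the public .grad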

(Yummy Chen) #3

So, if I want to change the gradients manually in the optimizer, do I have to use the following code?

for group in meta_optim.param_groups:  # meta_optim is the optimizer
    for p, g in zip(group['params'], grads):  # grads were calculated manually
        if p.grad is not None:
            p._grad.data = g.data  # rather than p.grad.data = g.data?
meta_optim.step()
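For context, here is a self-contained sketch of the pattern being asked about. The model, data, and the use of torch.autograd.grad to get the manual gradients are assumptions for illustration; only meta_optim comes from the question. Writing through ._grad mirrors the A3C snippet above, and on recent PyTorch versions assigning p.grad directly works as well.

import torch
import torch.nn as nn

model = nn.Linear(4, 2)
meta_optim = torch.optim.SGD(model.parameters(), lr=0.1)

x, y = torch.randn(8, 4), torch.randn(8, 2)
loss = nn.functional.mse_loss(model(x), y)

# Gradients computed manually instead of calling loss.backward().
grads = torch.autograd.grad(loss, list(model.parameters()))

for group in meta_optim.param_groups:
    for p, g in zip(group['params'], grads):
        p._grad = g.detach()  # or p.grad = g.detach() on recent PyTorch versions

meta_optim.step()  # the optimizer reads the gradients via p.grad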