How do we collect all the gradients from a model and store them?

pinocchio · February 13, 2020, 10:37pm

I did this:

from collections import OrderedDict

# Map each parameter name to its gradient (None until backward() has run)
grads = OrderedDict()
for name, w in mdl.named_parameters():
    grads[name] = w.grad

Is this correct?

ptrblck · February 14, 2020, 6:07am

This seems to be a good way.
However, note that you are storing references, not copies, so if the gradients are later updated or zeroed in place, your dict will reflect these changes.
You could store w.grad.clone() instead to avoid this. In that case, also add a check that w.grad is not None, since calling clone() on None would raise an error.
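A minimal sketch of the cloning approach with the None check, compared against storing plain references. The helper name snapshot_grads and the small nn.Linear stand-in model are assumptions for illustration and are not from the thread; the gradients are zeroed with an explicit in-place zero_() call so the effect on the stored references is visible.

```python
from collections import OrderedDict

import torch
import torch.nn as nn


def snapshot_grads(model):
    """Return cloned gradients keyed by parameter name (hypothetical helper)."""
    grads = OrderedDict()
    for name, param in model.named_parameters():
        # .grad stays None until a backward pass has produced a gradient
        # for this parameter, so guard before cloning.
        if param.grad is None:
            grads[name] = None
        else:
            grads[name] = param.grad.detach().clone()
    return grads


torch.manual_seed(0)
mdl = nn.Linear(3, 2)  # stand-in model, assumed for illustration
mdl(torch.randn(4, 3)).sum().backward()

# Plain references, as in the original question.
refs = OrderedDict((name, w.grad) for name, w in mdl.named_parameters())
# Independent copies.
copies = snapshot_grads(mdl)

# Zero the gradients in place; the references see this, the copies do not.
for w in mdl.parameters():
    w.grad.zero_()

refs_all_zero = all(bool((g == 0).all()) for g in refs.values())
copies_kept = any(bool((g != 0).any()) for g in copies.values())
print(refs_all_zero, copies_kept)
```

Here the referenced tensors end up all zero while the cloned ones keep their values, so the script should print True True. If the gradients are reset by setting .grad to None rather than zeroing in place, stored references keep pointing at the old tensors, so whether a reference "sees" a reset depends on how the reset is done.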