I did this:
from collections import OrderedDict
grads = OrderedDict([(name, None) for name, w in mdl.named_parameters()])
for name, w in mdl.named_parameters():
    grads[name] = w.grad
Is this correct?
This seems to be a good way.
However, note that you are storing references to the gradient tensors, so if the gradients are later updated or zeroed out in place, your dict will reflect these changes.
You could call w.grad.clone() to avoid this (in that case you should also add a check that w.grad is not None).
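Here is a minimal sketch of the clone-with-check idea, assuming a toy nn.Linear in place of mdl. The helper name snapshot_grads is hypothetical (it is not a PyTorch API), and the detach() call is an extra precaution not mentioned above. The gradients are zeroed in place here so that the difference between stored references and clones is visible.

```python
from collections import OrderedDict

import torch
import torch.nn as nn


def snapshot_grads(model):
    """Map each parameter name to an independent copy of its gradient (or None)."""
    grads = OrderedDict()
    for name, param in model.named_parameters():
        # param.grad stays None until backward() has populated it.
        if param.grad is None:
            grads[name] = None
        else:
            grads[name] = param.grad.detach().clone()
    return grads


# Toy model standing in for `mdl` (an assumption for illustration).
mdl = nn.Linear(3, 1)
mdl(torch.ones(2, 3)).sum().backward()

# References to the live gradient tensors, as in the original snippet.
refs = OrderedDict((name, w.grad) for name, w in mdl.named_parameters())
# Independent copies.
copies = snapshot_grads(mdl)

# Zero the gradients in place; the references follow, the copies do not.
for w in mdl.parameters():
    w.grad.zero_()

print(refs["weight"])
print(copies["weight"])
```

After the in-place zeroing, refs holds zeros while copies still holds the gradients from the backward pass.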