I’m sorry for asking this question, but I really don’t know the answer.
They do the same thing; the only difference is that one goes through the optim module while the other goes through the nn module. Let’s have a look at the docs.
From the optim source https://pytorch.org/docs/stable/_modules/torch/optim/optimizer.html, you have:
def zero_grad(self):
    r"""Clears the gradients of all optimized :class:`torch.Tensor` s."""
    for group in self.param_groups:
        for p in group['params']:
            if p.grad is not None:
                p.grad.detach_()
                p.grad.zero_()
while from the nn source (https://pytorch.org/docs/stable/_modules/torch/nn/modules/module.html#Module.zero_grad), you have:
def zero_grad(self):
    r"""Sets gradients of all model parameters to zero."""
    for p in self.parameters():
        if p.grad is not None:
            p.grad.detach_()
            p.grad.zero_()
The only difference is the for group in self.param_groups loop, which iterates over the parameter groups you passed in (e.g. model.parameters()) when you initialized your optimizer. So if the optimizer was built from all of the model’s parameters, the two calls clear exactly the same gradients.
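A minimal sketch to check this, assuming the optimizer is built from model.parameters() (passing set_to_none=False so both calls zero the .grad tensors in place, matching the source quoted above; newer PyTorch versions default to setting .grad to None instead):

```python
import torch
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(4, 2)
opt = optim.SGD(model.parameters(), lr=0.1)

# Produce some gradients (the bias gradient is exactly the batch size, 3).
model(torch.randn(3, 4)).sum().backward()
assert float(model.bias.grad.abs().sum()) > 0

# Clearing via nn.Module zeroes every parameter gradient...
model.zero_grad(set_to_none=False)
assert all(float(p.grad.abs().sum()) == 0.0 for p in model.parameters())

# ...and clearing via the optimizer does exactly the same thing.
model(torch.randn(3, 4)).sum().backward()
opt.zero_grad(set_to_none=False)
assert all(float(p.grad.abs().sum()) == 0.0 for p in model.parameters())
```

The two calls only diverge if the optimizer was created from a subset of the model’s parameters; then optimizer.zero_grad() clears only that subset, while model.zero_grad() clears everything.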