Do I need to call optimizer.zero_grad() when using the Adam solver?
Related: are model.zero_grad() and optimizer.zero_grad() equivalent when using an optimizer?
@Nick_Young yes, the gradient buffers are never zeroed out automatically.
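A minimal sketch of what that looks like in a training loop with Adam (the model, loss, and data below are just placeholders for illustration):

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Hypothetical setup: a tiny model and some dummy data.
model = nn.Linear(10, 1)
optimizer = optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()

inputs = torch.randn(32, 10)
targets = torch.randn(32, 1)

for step in range(100):
    optimizer.zero_grad()   # clear gradients left over from the previous step
    loss = criterion(model(inputs), targets)
    loss.backward()         # gradients are accumulated into .grad, not overwritten
    optimizer.step()        # Adam update uses the freshly computed gradients
```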
@lgelderloos only if you created your optimizer as optimizer = optim.some_optim_func(model.parameters(), ...). Basically, model.zero_grad() will zero the gradients of all the parameters in the model, while optimizer.zero_grad() will zero the gradients of all parameters associated with this optimizer. Depending on how you created the optimizer, they may or may not be the same. See the sketch below.
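A small sketch of both cases (the two-layer model here is made up for illustration):

```python
import torch.nn as nn
import torch.optim as optim

model = nn.Sequential(nn.Linear(10, 10), nn.Linear(10, 1))

# Case 1: optimizer built over all model parameters.
# model.zero_grad() and opt_all.zero_grad() clear the same gradients.
opt_all = optim.Adam(model.parameters())

# Case 2: optimizer built over only part of the model.
# opt_head.zero_grad() clears gradients of model[1] only,
# whereas model.zero_grad() clears gradients of every parameter in the model.
opt_head = optim.Adam(model[1].parameters())
```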
Thanks for the clarification!
Thank you! @albanD