Should we call optimizer.zero_grad() before or after the forward pass?

As titled. Should I call optimizer.zero_grad() before the forward pass, or do I just need to make sure it is called before loss.backward()?


The forward pass doesn’t care about the grads, nor does it modify them.
You only need to zero the grads before calling .backward().
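
If it helps, here is a minimal sketch that checks this. The toy model, the random data, and the helper name `train_step` are all made up for illustration; the only point is that zeroing before or after the forward pass should give the same gradients, as long as it happens before .backward().

```python
import torch
from torch import nn


def train_step(zero_before_forward: bool) -> torch.Tensor:
    """Hypothetical helper: run one backward pass and return the weight gradient."""
    torch.manual_seed(0)  # same init and data for both orderings
    model = nn.Linear(4, 1)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    x = torch.randn(8, 4)
    y = torch.randn(8, 1)

    # Simulate a previous iteration so that stale gradients are present.
    nn.functional.mse_loss(model(x), y).backward()

    if zero_before_forward:
        optimizer.zero_grad()
        loss = nn.functional.mse_loss(model(x), y)
    else:
        loss = nn.functional.mse_loss(model(x), y)
        optimizer.zero_grad()

    # In both branches the grads were cleared before this call.
    loss.backward()
    return model.weight.grad.clone()


grad_a = train_step(zero_before_forward=True)
grad_b = train_step(zero_before_forward=False)
print(torch.allclose(grad_a, grad_b))
```

Skipping zero_grad() entirely would instead add the new gradients on top of the stale ones, which is why it has to come before .backward().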


Thanks! It helps a lot

However, does calling zero_grad() before versus after the forward pass have any memory implications?