As titled: should I call zero_grad() before the forward pass, or do I just need to make sure it runs before I call loss.backward()?
The forward pass doesn’t care about the grads, nor does it modify them.
You only need to zero the grads before calling loss.backward(), because that is the call that accumulates new gradients into the existing ones.
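To illustrate, here is a minimal sketch that compares the two orderings. The toy model, the random data, and the variable names are hypothetical and only there for the demo; optimizer.step() is left out on purpose so the parameters stay fixed and the gradients can be compared directly.

```python
import torch

# Hypothetical toy model and data, only for illustration.
torch.manual_seed(0)
model = torch.nn.Linear(3, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
x = torch.randn(4, 3)
y = torch.randn(4, 1)

# Ordering A: zero the grads before the forward pass.
optimizer.zero_grad()
loss = torch.nn.functional.mse_loss(model(x), y)
loss.backward()
grad_a = model.weight.grad.clone()

# Ordering B: forward pass first, then zero the grads, then backward.
loss = torch.nn.functional.mse_loss(model(x), y)
optimizer.zero_grad()
loss.backward()
grad_b = model.weight.grad.clone()

# Skipping zero_grad(): backward() adds to the grads already stored.
loss = torch.nn.functional.mse_loss(model(x), y)
loss.backward()
grad_accumulated = model.weight.grad.clone()

same_grads = torch.allclose(grad_a, grad_b)
doubled = torch.allclose(grad_accumulated, 2 * grad_a)
print(same_grads, doubled)
```

Both orderings should give the same gradient, while the run without zero_grad() ends up with the previous gradient added on top of the new one.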
Thanks! It helps a lot