Is gradient calculated based on the last forward operation?

You can do any combination, depending on what your constraints are. For a more detailed description, see this post: "Why do we need to set the gradients manually to zero in pytorch?"
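
To make this concrete, here is a minimal sketch of how `backward()` behaves across several forward passes. The toy tensor `w` and the three losses are made up for illustration and are not from the original question. Each `backward()` call computes gradients for the graph of the loss it is called on and adds them to `w.grad`, so what ends up in `.grad` depends on when you zero it, not only on the last forward pass.

```python
import torch

# Hypothetical scalar parameter, used only for illustration.
w = torch.tensor(2.0, requires_grad=True)

# First forward/backward: loss1 = 3 * w, so d(loss1)/dw = 3.
loss1 = 3 * w
loss1.backward()
grad_after_first = w.grad.item()

# Second forward/backward WITHOUT zeroing: loss2 = w ** 2, so d(loss2)/dw = 2 * w = 4.
# The new gradient is added to the one already stored in w.grad.
loss2 = w ** 2
loss2.backward()
grad_accumulated = w.grad.item()

# Zero the stored gradient, then repeat the second forward/backward.
# Now w.grad reflects only this most recent pass.
w.grad.zero_()
loss3 = w ** 2
loss3.backward()
grad_after_zero = w.grad.item()

print(grad_after_first, grad_accumulated, grad_after_zero)  # 3.0 7.0 4.0
```

In a training loop, the same choice usually shows up as where you call `optimizer.zero_grad()`: calling it every iteration gives gradients from the latest forward pass only, while calling it every few iterations accumulates gradients over several passes.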
