(It seems like backward could take an option to not zero out the gradients, e.g. backward(preserve_grads=True), while zeroing out the gradients would be the default behavior.)
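To make the suggestion concrete, here's a rough sketch of what I mean. The `backward` helper and its `preserve_grads` flag are hypothetical, not an existing PyTorch API; the only real behavior it relies on is that `Tensor.backward()` accumulates into `.grad`.

```python
import torch


def backward(loss, params, preserve_grads=False):
    """Hypothetical helper (not part of PyTorch): clear grads unless told to keep them."""
    if not preserve_grads:
        for p in params:
            p.grad = None  # drop any previously accumulated gradient
    loss.backward()  # PyTorch itself always accumulates into .grad


def run(preserve_grads):
    w = torch.tensor([1.0, 2.0], requires_grad=True)
    for _ in range(2):
        loss = (w * w).sum()  # d(loss)/dw = 2 * w
        backward(loss, [w], preserve_grads=preserve_grads)
    return w.grad.tolist()


zeroed = run(preserve_grads=False)       # each call starts from a clean gradient
accumulated = run(preserve_grads=True)   # the two calls sum their gradients
print(zeroed, accumulated)
```

With the default, the second call overwrites the first gradient; with preserve_grads=True the two gradients add up, which is what plain loss.backward() does today.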