Why is "accumulating" the default mode of .gradient?

Hi Aerin! Nice to see you here :slight_smile:

I found this post with an answer by @albanD - Why do we need to set the gradients manually to zero in pytorch?
It explains the design decision to accumulate gradients when .backward() is called, and I assume the same reasoning applies to .gradient().
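
For what it's worth, here is a minimal sketch of the accumulation behavior that post describes, assuming the usual `.backward()` / `.grad` workflow (the exact numbers are just from this toy example):

```python
import torch

x = torch.tensor([2.0], requires_grad=True)

# First backward pass: d(x**2)/dx = 2x = 4
(x ** 2).backward()
print(x.grad)  # tensor([4.])

# Second backward pass without zeroing: the new gradient is ADDED to x.grad
(x ** 2).backward()
print(x.grad)  # tensor([8.]), i.e. 4 + 4

# Clearing the accumulated gradient by hand (an optimizer's zero_grad() does the same)
x.grad.zero_()
(x ** 2).backward()
print(x.grad)  # tensor([4.]) again
```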
