What does doing .backward twice do?

Oh I see. .zero_() zeroes all of a tensor's contents, not just gradients. I was trying to zero the contents of a tensor that is part of a computation graph, using an in-place operation that the graph doesn't track, so PyTorch was protecting me from that. But PyTorch was fine with x.grad being zeroed, because x.grad is not part of the computation graph so far, so it's safe to call zero_() on it.
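For anyone who finds this later, here is a minimal sketch of the difference (torch.exp is just a stand-in op I picked because its backward pass needs its output; my actual graph was different):

```python
import torch

x = torch.ones(3, requires_grad=True)
y = x.exp()          # y is part of the computation graph
y.sum().backward()

# Zeroing the gradient buffer is fine: x.grad is not part of the graph,
# so an in-place op on it is harmless.
x.grad.zero_()

# Zeroing a leaf tensor that requires grad is refused immediately:
try:
    x.zero_()
except RuntimeError as e:
    print(e)  # roughly: "a leaf Variable that requires grad is being used in an in-place operation"

# Zeroing an intermediate result that the backward pass still needs is
# caught when .backward() is called:
y = x.exp()
y.zero_()            # in-place op on a tracked tensor
try:
    y.sum().backward()
except RuntimeError as e:
    print(e)  # roughly: "... has been modified by an inplace operation ..."
```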

The (wrong) assumption I made was that .zero_() was built to zero gradients, not tensors in general. So naturally, I was very confused as to why PyTorch was complaining.

Thanks AlbanD! You're such a boss in this forum. :slight_smile: :muscle:
