Why can backward() be called multiple times when the graph contains only additions?

Hello, I'm studying how autograd works in PyTorch, and I've found something confusing:

I've learned that if backward() is called more than once, a RuntimeError is raised because the graph has already been freed:

import torch

x = torch.tensor([1., 2.], requires_grad=True)
y = torch.tensor([3., 4.], requires_grad=True)
z = (x * y).sum()
z.backward()
z.backward()  # <-- this will raise a RuntimeError

However, I've found that if the graph contains only additions, backward() can be called multiple times without any error:

import torch

x = torch.tensor([1., 2.], requires_grad=True)
y = torch.tensor([3., 4.], requires_grad=True)
z = (x + y).sum()
z.backward()
z.backward()  # <-- this will NOT cause any error

Why is there such a difference?

Hi,
As soon as the first .backward() call finishes, the tensors saved by the nodes in the computation graph of the tensor you call backward on are freed (unless you pass retain_graph=True).

So, if computing the gradients requires those saved tensors, a RuntimeError is raised on the second backward() call.
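If you really do need to backpropagate through the same graph more than once, you can keep the saved tensors alive by passing retain_graph=True to the first call. A minimal sketch using your multiplication example:

import torch

x = torch.tensor([1., 2.], requires_grad=True)
y = torch.tensor([3., 4.], requires_grad=True)
z = (x * y).sum()

# retain_graph=True keeps the saved tensors alive after this pass
z.backward(retain_graph=True)
z.backward()  # no error this time

# Note that gradients accumulate across the two calls:
print(x.grad)  # tensor([6., 8.])  i.e. 2 * y
print(y.grad)  # tensor([2., 4.])  i.e. 2 * x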

With addition, no saved tensors are needed to compute the gradients: the gradient of x + y with respect to each input is just 1, so the incoming gradient is passed straight through. Multiplication, on the other hand, has to save both operands, because d(x*y)/dx = y and d(x*y)/dy = x. So the behaviour you observe is expected.
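You can even peek at what each backward node keeps around. The _saved_* attributes below are an implementation detail of recent PyTorch versions rather than a stable API, so treat this as an illustrative sketch:

import torch

x = torch.tensor([1., 2.], requires_grad=True)
y = torch.tensor([3., 4.], requires_grad=True)

mul = x * y   # MulBackward0 must save its inputs: d(x*y)/dx = y, d(x*y)/dy = x
add = x + y   # AddBackward0 saves no input tensors: the gradient is just passed through

print(mul.grad_fn._saved_self)   # references x
print(mul.grad_fn._saved_other)  # references y
print([a for a in dir(add.grad_fn) if a.startswith('_saved')])
# at most a scalar alpha -- no input tensors are saved for addition

After the first backward() (without retain_graph=True), those saved tensors are released, which is exactly why the second call fails for (x * y).sum() while the pure-addition graph keeps working.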

Thank you for your detailed explanation!