In this specific case you do not need retain_graph=True, but in general you may need it. As you compute the forward pass, PyTorch saves variables that will be needed to compute the gradients in the backward pass. For example,
z = y * y needs to save the value of y, because dz/dy = 2*y (or y + y). However, y = x + 2 doesn’t need to save anything, because dy/dx = 1, which doesn’t depend on x.
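If you want to see this for yourself, here is a rough sketch that counts how many tensors autograd saves during the forward pass. It assumes a reasonably recent PyTorch (the torch.autograd.graph.saved_tensors_hooks context manager is available from about version 1.10), and count_saved is just a helper name I made up for illustration.

```python
import torch
from torch.autograd.graph import saved_tensors_hooks


def count_saved(fn, inp):
    """Return how many tensors autograd saves while computing fn(inp).

    Hypothetical helper for illustration, not part of PyTorch.
    """
    saved = []

    def pack(tensor):
        # Called once for every tensor autograd stores for the backward pass.
        saved.append(tensor)
        return tensor

    def unpack(tensor):
        # Called when the backward pass reads a saved tensor back.
        return tensor

    with saved_tensors_hooks(pack, unpack):
        fn(inp)
    return len(saved)


x = torch.ones(2, 2, requires_grad=True)

n_add = count_saved(lambda t: t + 2, x)  # derivative is constant
n_mul = count_saved(lambda t: t * t, x)  # derivative depends on the input

print("tensors saved by t + 2:", n_add)
print("tensors saved by t * t:", n_mul)
```

The addition should report no saved tensors, while the multiplication should report at least one, since it has to keep its operand around for the backward pass.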
When you call backward() with retain_graph=False (or without specifying it), the automatic differentiation engine frees the saved variables as it computes the gradients. If you call backward() again, it will fail with an exception if it needs any of the freed saved variables. If it doesn’t need any saved variables, as in your example, the second call will succeed, but you shouldn’t rely on this behavior.
If you change y = x + 2 to y = x * x, you will see an error:
Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time.
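Here is a minimal sketch that puts the two cases side by side. I don't have your exact script, so the 2x2 tensor of ones and the backward_twice helper are my own assumptions. Also, the exact wording of the error message may differ between PyTorch versions.

```python
import torch


def backward_twice(fn, retain_graph=False):
    """Run backward() twice on fn(x); return True if the second call succeeds.

    Hypothetical helper for illustration, not part of PyTorch.
    """
    x = torch.ones(2, 2, requires_grad=True)
    y = fn(x)
    grad = torch.ones_like(y)

    # First backward pass. With retain_graph=False the saved values are freed.
    y.backward(grad, retain_graph=retain_graph)

    try:
        # Second backward pass through the same graph.
        y.backward(grad)
        return True
    except RuntimeError as err:
        print("second backward failed:", err)
        return False


add_ok = backward_twice(lambda x: x + 2)                          # nothing saved
mul_ok = backward_twice(lambda x: x * x)                          # saved values freed
mul_retained_ok = backward_twice(lambda x: x * x, retain_graph=True)

print(add_ok, mul_ok, mul_retained_ok)
```

Only the x * x case without retain_graph=True should fail on the second call. Passing retain_graph=True to the first backward() keeps the saved values alive, at the cost of holding on to that memory for longer.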