Is `retain_grad` orthogonal to `retain_graph`? I am a bit confused about their difference. My current understanding is that `retain_graph` retains `.grad_fn` attributes, while `retain_grad` retains `.grad` attributes.
The `retain_graph` argument, when set to `True` during a backward call on a tensor `x`, causes autograd NOT to aggressively free the saved references to the intermediate tensors in the graph of `x` that are required for the gradient computation of `x` wrt some tensor.
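A minimal sketch of the situation this enables (toy tensors, not from the original post): a second backward pass through the same graph only works if the first call kept the saved buffers alive.

```python
import torch

x = torch.tensor([1.0], requires_grad=True)  # leaf tensor
y = x * 2                                     # intermediate (non-leaf) tensor
z = (y ** 2).sum()

# First backward: keep the saved intermediate references alive.
z.backward(retain_graph=True)
print(x.grad)  # tensor([8.])

# Second backward through the same graph. Without retain_graph=True above,
# this call would raise a RuntimeError because the saved buffers were freed.
z.backward()
print(x.grad)  # gradients accumulate: tensor([16.])
```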
`y.retain_grad()` is used to populate the `grad` attribute of `y`, which is a non-leaf tensor, when a `.backward()` call is made – this is the non-default behaviour.
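For example (again a toy sketch), the `.grad` attribute of a non-leaf tensor stays `None` after backward unless `retain_grad()` was called on it first:

```python
import torch

x = torch.tensor([1.0], requires_grad=True)  # leaf tensor
y = x * 2                                     # non-leaf tensor
y.retain_grad()                               # request that y.grad be populated
z = (y ** 2).sum()

z.backward()
print(x.grad)  # leaf tensors get .grad by default: tensor([8.])
print(y.grad)  # populated only because of retain_grad(): tensor([4.])
```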
> the saved references to the intermediate tensors in the graph of `x` that are required for the gradient computation of `x` wrt some tensor.
Where are these references stored? Are they only stored implicitly in the `grad_fn` functions?