Is `retain_grad` orthogonal to `retain_graph`? I am a bit confused about their difference. My current understanding is that `retain_graph` retains `.grad_fn` attributes, while `retain_grad` retains `.grad` attributes.
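The distinction between the two attributes can be seen directly on a tiny graph. A minimal sketch, assuming a recent PyTorch (note that reading `.grad` on a non-leaf tensor emits a warning and returns `None` by default):

```python
import torch

x = torch.tensor(1.0, requires_grad=True)  # leaf tensor
y = x * 3                                  # non-leaf (intermediate) tensor

# Non-leaf tensors carry a grad_fn node that links them into the graph;
# leaf tensors do not.
assert y.grad_fn is not None
assert x.grad_fn is None

y.backward()
assert x.grad is not None  # leaf .grad is populated by backward()
assert y.grad is None      # non-leaf .grad is NOT kept by default
```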

The `retain_graph` argument, when set to `True` during a backward call on a tensor `x`, causes autograd NOT to aggressively free the saved references to the intermediate tensors in the graph of `x` that are required for the gradient computation of `x` wrt some tensor.
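For instance, a second `.backward()` through the same graph only succeeds if the first call retained it. A minimal sketch:

```python
import torch

x = torch.tensor(2.0, requires_grad=True)
y = x ** 2  # intermediate result; its grad_fn holds references to saved tensors

# retain_graph=True: autograd does not free the saved tensors after this call
y.backward(retain_graph=True)
first = x.grad.item()  # dy/dx = 2*x = 4.0

# A second backward through the same graph is now possible; without
# retain_graph=True above, this second call would raise a RuntimeError.
x.grad = None  # clear the accumulated gradient before the second pass
y.backward()
second = x.grad.item()  # again 4.0
```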

`y.retain_grad()` is used to populate the `grad` attribute of `y`, which is a non-leaf tensor, when a `.backward()` call is made – this is the non-default behaviour.
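A minimal sketch of that behaviour, with `y` an intermediate tensor:

```python
import torch

x = torch.tensor(3.0, requires_grad=True)  # leaf tensor
y = x * 2                                  # non-leaf (intermediate) tensor
y.retain_grad()                            # ask autograd to populate y.grad
z = y ** 2
z.backward()

# Leaf tensors get .grad by default; non-leaf tensors only via retain_grad().
# dz/dy = 2*y = 12.0, and dz/dx = dz/dy * dy/dx = 12.0 * 2 = 24.0
assert y.grad.item() == 12.0
assert x.grad.item() == 24.0
```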


> the saved references to the intermediate tensors in the graph of `x` that are required for the gradient computation of `x` wrt some tensor.

Where are these references stored? Are they only implicitly stored in the function `grad_fn`?