Use of retain_graph = True

Harsh_Choudhary · May 11, 2023, 4:42am

I have my loss function as loss = loss1 + loss2.
After each forward pass, I calculate the gradients using loss.backward() and update my weights. Here loss1(w,b), loss2(w,b), and loss(w,b) are functions of the network parameters.

Now, in every iteration of gradient descent, I also need the gradients of loss1 and loss2 with respect to the network parameters. So I use loss1.backward(retain_graph=True) and loss2.backward(retain_graph=True). Is this the right approach?

Also, I would be grateful if you could explain what actually happens with retain_graph=True versus retain_graph=False.

srishti-git1110 · May 11, 2023, 7:05am

Hi,
retain_graph=True causes autograd NOT to aggressively free up the saved tensors required for grad computation after the backward call.

Without retain_graph=True, you will not be able to call .backward() more than once through the same graph, as the intermediate tensors required for the backward pass will already have been freed.

For your case, using retain_graph=True should help if you aren’t running into any errors and everything is working as expected. Otherwise, feel free to post the error along with an executable code snippet.
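To make the difference concrete, here is a minimal sketch (not from the original posts) that calls .backward() twice on the same loss, once with and once without retain_graph=True on the first call. The helper name second_backward_succeeds and the toy loss are made up for illustration.

```python
import torch

def second_backward_succeeds(retain_first: bool) -> bool:
    """Return True if a second backward call works after the first one."""
    w = torch.tensor([1.0, 2.0], requires_grad=True)
    # w * w saves its inputs for the backward pass, so the graph
    # holds intermediate tensors that can be freed.
    loss = (w * w).sum()
    loss.backward(retain_graph=retain_first)
    try:
        loss.backward()  # second pass through the same graph
        return True
    except RuntimeError:
        # Raised when the saved tensors were already freed.
        return False

results = {flag: second_backward_succeeds(flag) for flag in (False, True)}
print(results)
```

With retain_graph=False on the first call, the second call should fail with a RuntimeError; with retain_graph=True it should go through, and the gradients accumulate in w.grad.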

Harsh_Choudhary · May 11, 2023, 7:48am

Thanks for your answer,

I do not get any errors, but as you mentioned, “I cannot call .backward more than once …”. That explains the error I was receiving when I did not set retain_graph=True while calculating the gradients of loss1 and loss2 in each step of gradient descent, since they use the same tensors for computation.

Again, many thanks for the clarification.

ptrblck · May 11, 2023, 9:02am

A small addition: make sure to use retain_graph=False (or just drop this argument as it’s the default) in the last backward call to allow PyTorch to free the intermediates.
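As a rough sketch of that pattern for the loss1 / loss2 case in this thread (the model, data, and the two loss definitions below are placeholders, not the original poster's code): keep the graph for the first backward call, let the last one free it, and rebuild the gradient of loss1 + loss2 as the sum of the two.

```python
import torch

torch.manual_seed(0)
model = torch.nn.Linear(3, 1)       # placeholder network
x = torch.randn(4, 3)               # placeholder inputs
target = torch.randn(4, 1)          # placeholder targets

pred = model(x)                     # part of the graph shared by both losses
loss1 = torch.nn.functional.mse_loss(pred, target)
loss2 = pred.abs().mean()

params = list(model.parameters())

# Gradients of loss1 alone: keep the graph, loss2 still needs it.
model.zero_grad()
loss1.backward(retain_graph=True)
grads1 = [p.grad.clone() for p in params]

# Gradients of loss2 alone: last backward call, so use the default
# retain_graph=False and let PyTorch free the intermediates.
model.zero_grad()
loss2.backward()
grads2 = [p.grad.clone() for p in params]

# d(loss1 + loss2)/dp is the sum of the two, so a third backward
# call on the combined loss is not needed.
for p, g1, g2 in zip(params, grads1, grads2):
    p.grad = g1 + g2
```

After this, an optimizer step would use the combined gradient in p.grad. If you would rather not touch .grad at all, torch.autograd.grad(loss1, params, retain_graph=True) returns the per-loss gradients directly.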