How to debug: RuntimeError: Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time?

I was changing my model and encountered `RuntimeError: Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time`, but the error message is not very useful.

I have a normal training loop and don’t use LSTMs or anything like that, so this should not happen.

Is there any way to find out which part of the model was responsible?

There might be something calculated globally, i.e. outside the training loop, that is still part of the computation graph. The first time you call .backward(), the graph’s buffers are freed; the second time you call .backward(), the gradient computation still needs that ‘global’ variable, but its buffers have already been freed (hence the error). So check your code for variables that are created once but participate in the computation graph on every iteration.
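For what it’s worth, here is a minimal sketch of that failure mode (the names `model` and `anchor` are made up for illustration, not taken from your code): a tensor that depends on the model’s parameters is computed once outside the loop and then added to the loss on every iteration, so the second iteration tries to backward through its already-freed graph.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Computed ONCE, outside the loop, but it depends on the model's parameters,
# so it stays attached to the autograd graph.
anchor = model(torch.ones(1, 4))

for step in range(2):
    x = torch.randn(8, 4)
    loss = model(x).mean() + anchor.sum()  # reuses the graph behind `anchor`
    optimizer.zero_grad()
    loss.backward()   # step 0 works; step 1 raises the "backward through the
    optimizer.step()  # graph a second time" RuntimeError, because the part of
                      # the graph behind `anchor` was freed by the first backward
```

The usual fix is to recompute the tensor inside the loop, or to `.detach()` it if you don’t need gradients through it; `retain_graph=True` only papers over the problem.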

Ok. My code is quite complicated, so this is easier said than done :confused:

Is there really no way to get the offending calculation?
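One generic way to narrow it down (just a sketch, not specific to your model): between iterations, list every live tensor that still carries autograd history. A tensor with a non-None `grad_fn` that survives from one iteration to the next is a candidate for being backwarded through twice.

```python
import gc
import torch

def tensors_with_history():
    """List live tensors that still reference a grad_fn (autograd history)."""
    found = []
    for obj in gc.get_objects():
        try:
            if torch.is_tensor(obj) and obj.grad_fn is not None:
                found.append((type(obj.grad_fn).__name__, tuple(obj.shape)))
        except Exception:
            pass  # some objects don't like attribute access during a gc scan
    return found

# Call this right after optimizer.step(), before the next forward pass, and
# look for tensors that keep showing up with the same grad_fn every iteration.
```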

I’ve stripped down my network, but it’s still occurring… Like I said, my architecture is quite complicated and there are quite a few moving parts, so there are still a lot of places that could be responsible.

I bet the error is just a stupid mistake somewhere deep in my code.

OK, I found the perpetrator, but I am not sure why it happens…

This should close the question, although it does not really help anyone.

I just commented and uncommented code everywhere until I narrowed it down.