Correct way of storing states inside one forward pass

Knievel · December 3, 2019, 7:53am

From this thread I found out that I apparently store the whole computational graph in my list with each iteration.

How am I supposed to handle this if I need the states for the gradient computation? Can I still detach them at some point?
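
For illustration, here is a minimal sketch of the situation described, assuming PyTorch. The update rule, the tensor sizes, and names such as `step`, `states_for_grad`, and `states_for_log` are hypothetical and not taken from the actual model. It keeps graph-carrying states for the loss, stores detached copies for anything that only needs the values, and detaches the remaining states once `backward()` has run.

```python
import torch

torch.manual_seed(0)

# Hypothetical recurrent weight, used only for illustration.
w = torch.randn(4, 4, requires_grad=True)

def step(state):
    # Hypothetical state update; stands in for one iteration of the forward pass.
    return torch.tanh(state @ w)

state = torch.full((1, 4), 0.1)
states_for_grad = []  # entries stay attached to the autograd graph
states_for_log = []   # detached copies: values only, no graph

for _ in range(5):
    state = step(state)
    states_for_grad.append(state)
    states_for_log.append(state.detach())

# The loss is built from the graph-carrying states, so gradients reach w.
loss = torch.stack(states_for_grad).sum()
loss.backward()

# Once backward() has run, the states can be detached if only their
# values are still needed, which drops their references to the graph.
states_for_grad = [s.detach() for s in states_for_grad]
```

In this sketch, detaching a state before the loss is computed would cut it out of the gradient, so the states needed for the gradient are detached only after `backward()`.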