CUDA out of memory error during forward pass

ptrblck · June 6, 2021, 11:45pm

Yes, if you sum the losses (or store references to their computation graphs in any other way), Autograd will keep those graphs in memory until a backward operation is performed.
To accumulate gradients, you could take a look at this post, which explains different approaches along with their compute and memory usage.
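
As a rough illustration of one such approach (not necessarily the one from the linked post), here is a minimal sketch that calls `backward()` on each mini-batch loss instead of summing the losses first. The model, the toy data, and the names `batches`, `accum_steps`, and `running_loss` are made up for this example. Each `backward()` call frees that batch's graph, and the gradients add up in the `.grad` attributes of the parameters:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy setup, only for illustration.
model = nn.Linear(10, 1)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Four mini-batches of 8 samples each (random placeholder data).
batches = [(torch.randn(8, 10), torch.randn(8, 1)) for _ in range(4)]
accum_steps = len(batches)

optimizer.zero_grad()
running_loss = 0.0
for x, y in batches:
    # Scale so the accumulated gradient matches the mean over all batches.
    loss = criterion(model(x), y) / accum_steps
    # backward() frees this batch's graph; gradients accumulate in p.grad.
    loss.backward()
    # .item() returns a Python float, so no reference to the graph is kept.
    running_loss += loss.item()

# One optimizer step using the accumulated gradients.
optimizer.step()
```

If you wrote `running_loss += loss` instead (without `.item()` or `.detach()`), the sum would hold references to every batch's graph, and memory usage would grow until `backward()` is called on it.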