Will Torch automatically free activation memory?

Will PyTorch free the GPU memory of a specific layer's activation values after its backward pass has completed? The activation values are no longer needed once the gradients have been calculated.

Yes, the intermediate tensors will be removed after the backward() operation is done, as long as retain_graph=True is not used.
This is also why a second backward() call right after the first one will raise an error (if the intermediate tensors were needed to calculate the gradients).
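For illustration, a minimal sketch of this behavior (the shapes and the squaring op are arbitrary choices, not something from this thread):

```python
import torch

x = torch.randn(4, requires_grad=True)
y = (x * x).sum()   # the multiply saves its inputs as intermediates

y.backward()        # the saved intermediates are freed afterwards
try:
    y.backward()    # fails: the buffers needed for the gradients are gone
except RuntimeError as err:
    print("second backward raised:", err)

# retain_graph=True keeps the saved tensors alive for another pass
z = (x * x).sum()
z.backward(retain_graph=True)
z.backward()        # works this time (gradients accumulate into x.grad)
```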


The intermediate tensors will be removed after the backward() operation is done

That’s awesome.
Could you please show me in which file this “remove operation” is executed?

I don’t know where exactly the intermediates are freed, but @albanD might know. 🙂

Hey,

Each “Node” in the graph implements a special function to release all the resources it can, here: pytorch/function.h at a46d56f988505547b0779838a022970b79123b3c · pytorch/pytorch · GitHub
This is called from the execution engine when retain_graph=False, here: pytorch/engine.cpp at a46d56f988505547b0779838a022970b79123b3c · pytorch/pytorch · GitHub
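For anyone who wants to see the effect without reading the C++, here is a rough sketch that observes it from Python (it assumes a CUDA device is available; the shapes and the multiply chain are arbitrary):

```python
import torch

x = torch.randn(1024, 1024, device="cuda", requires_grad=True)
torch.cuda.synchronize()
baseline = torch.cuda.memory_allocated()

out = x
for _ in range(10):
    out = out * x          # each multiply saves its inputs for the backward pass
loss = out.sum()
del out                    # drop the Python reference; the autograd graph
                           # still keeps the saved intermediates alive

print("graph alive:   ", torch.cuda.memory_allocated() - baseline)

loss.backward()            # the engine releases the saved buffers here
print("after backward:", torch.cuda.memory_allocated() - baseline)
# only x.grad remains from the extra allocations at this point
```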


That’s incredible. Thank you!