Hi,
I am planning to implement a network that does not store the activations/gradients for a couple of layers, and instead recomputes them on the fly during the backward pass. I am currently using convolutional layers from torch.nn. Please let me know how and where I should change the implementation so that activations and gradients are not stored for those layers.
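For reference, here is a minimal sketch of what I have in mind, assuming torch.utils.checkpoint is the right tool for this (the layer shapes and module names are just placeholders). The wrapped block's intermediate activations would not be kept during the forward pass and would be recomputed when backward() reaches it:

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint


class CheckpointedConvNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Layers whose activations I am willing to recompute instead of store.
        self.block = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, 16, 3, padding=1),
            nn.ReLU(),
        )
        # A layer whose activations are stored as usual.
        self.head = nn.Conv2d(16, 10, 1)

    def forward(self, x):
        # checkpoint() runs self.block without saving its intermediate
        # activations; they are recomputed during the backward pass.
        x = checkpoint(self.block, x, use_reentrant=False)
        return self.head(x)


model = CheckpointedConvNet()
inp = torch.randn(2, 3, 32, 32, requires_grad=True)
out = model(inp)
out.sum().backward()
```

Is this the intended way to do it, or should the recomputation be handled differently?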
Yes, this is very relevant, thank you!