Is it possible to do temporary calculations and remove them from the computational graph?

Hello all! In my current project, I pass a 2D array of values (like [4096, 3] or [32768, 3]) into a neural network, which processes these values and returns just [4096] or [32768]. The problem, as one might imagine, is that you run out of memory very quickly at higher input resolutions (like [64x64x64, 3]).

Luckily, only part of the values returned by the neural network are needed for the subsequent calculations. Most of the returned values will be negative, some will be positive, and I only need to perform calculations on the positive ones. All those negative values therefore end up being unnecessary work for backprop and take up memory. So I was wondering whether it's possible to remove these extra calculations from the computational graph, so that I can still work with higher resolutions and not have to backprop through unneeded data.
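For concreteness, here is a minimal sketch of the setup as I described it (the actual network isn't shown in this post, so the small MLP below is just a placeholder):

```python
import torch

# Placeholder MLP standing in for the real network: maps [N, 3] points to [N].
net = torch.nn.Sequential(
    torch.nn.Linear(3, 256), torch.nn.ReLU(),
    torch.nn.Linear(256, 256), torch.nn.ReLU(),
    torch.nn.Linear(256, 1),
)

points = torch.rand(64 * 64 * 64, 3)   # [262144, 3] at the 64x64x64 resolution
values = net(points).squeeze(-1)       # [262144]
# At this point the intermediate activations for all 262144 points are kept
# around for backward, even though only the positive outputs are needed later.
```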

Hi,

Are these values in two different branches of the network, or all in the same Tensor?

Also for memory issues, you can try the checkpoint module to reduce memory usage at the cost of some compute.
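For reference, a minimal sketch of what that could look like with `torch.utils.checkpoint` (the network and the loss here are just placeholders):

```python
import torch
from torch.utils.checkpoint import checkpoint

# Placeholder network; substitute your own model.
net = torch.nn.Sequential(
    torch.nn.Linear(3, 256), torch.nn.ReLU(),
    torch.nn.Linear(256, 256), torch.nn.ReLU(),
    torch.nn.Linear(256, 1),
)

points = torch.rand(32768, 3)

# checkpoint() does not store the intermediate activations of `net` during the
# forward pass; it recomputes them during backward, trading compute for memory.
# (use_reentrant=False is only accepted on recent PyTorch versions.)
values = checkpoint(net, points, use_reentrant=False).squeeze(-1)  # [32768]
values[values > 0].sum().backward()  # hypothetical downstream loss
```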

Hey, thanks for the reply! I will take a look at checkpointing.

And these values are all in the same tensor. Basically, some sections of the returned [4096], [32768], etc. tensor are positive, others are negative, and I want to ignore any gradients computed for the negative sections.
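Just to illustrate with a toy example (the numbers are made up): if I index out the positive entries before the downstream calculation, the negative entries do get a zero gradient, but as far as I can tell everything computed before the indexing is still kept in the graph for the whole tensor:

```python
import torch

out = torch.randn(8, requires_grad=True)   # stand-in for the network output
loss = out[out > 0].sum()                  # only the positive entries are used
loss.backward()
print(out.grad)                            # 1.0 where out > 0, 0.0 elsewhere
```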

Autograd treats a Tensor as a single elementary object, so it won't be able to keep part of it and discard the rest, I'm afraid.