Memory leak because of addition?

I have a Variable(FloatTensor) in my graph that has another Variable(FloatTensor) added to it, but on every iteration of forward() in my custom nn.Module this allocates memory that is never freed. Is there some other way I should be writing this line? Or is there a way to manually delete the memory allocated by the addition? The following line seems to cause the problem:

var1[:,index1,:,:] = var1[:,index1,:,:] + var2[:,index2,:,:]
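
For context, that line sits inside a custom forward() roughly like the sketch below (the module name, shapes, and channel ranges are made up for illustration, not my actual code):

import torch
import torch.nn as nn
from torch.autograd import Variable

# Rough reproduction sketch; names, shapes, and index ranges are invented.
class Mixer(nn.Module):
    def forward(self, var1, var2):
        index1 = slice(0, 4)  # stand-ins for the real channel indices
        index2 = slice(4, 8)
        # The line in question: an indexed in-place update on a Variable,
        # which is recorded as another operation in the autograd graph.
        var1[:, index1, :, :] = var1[:, index1, :, :] + var2[:, index2, :, :]
        return var1

model = Mixer()
base1 = Variable(torch.randn(2, 8, 16, 16), requires_grad=True)
base2 = Variable(torch.randn(2, 8, 16, 16), requires_grad=True)
# Multiply by 1 so the inputs are non-leaf Variables, as they would be
# partway through a network (in-place ops on leaf Variables raise an error).
out = model(base1 * 1, base2 * 1)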

Using objgraph I get the following names for the leaked items:

before - top 2:
tuple 3191
dict 3037

after 1000 runs, top 4:
slice 72076
Size 50036
dict 47093
tuple 28216

These categories increment every time execution passes over that line, which is also where tracemalloc pointed me.

Operations on variables track history. See http://pytorch.org/docs/master/notes/faq.html#my-model-reports-cuda-runtime-error-2-out-of-memory for some explanation.
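
Roughly: every operation on a Variable appends a node to the autograd graph, so anything that holds a reference to a result also holds its entire history. A toy sketch of that pattern (not your code):

import torch
from torch.autograd import Variable

x = Variable(torch.randn(1, 8, 4, 4), requires_grad=True)
acc = Variable(torch.zeros(1, 8, 4, 4))

kept = []
for step in range(100):
    acc = acc + x      # each iteration adds another node behind acc
    kept.append(acc)   # keeping acc also keeps every earlier addition alive

# Cutting the history (detach(), or .data in the old Variable API) lets the
# intermediate graph be freed:
kept = [v.detach() for v in kept]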


Thanks for the quick reply! I was not able to get any of those things to work.

“Don’t accumulate history across your training loop.”

This isn't accumulating data across the loop. These are both Variables that need gradients computed. Calling .data on anything in the loop gives me problems with Tensor/Variable math, warnings about returning a Tensor instead of a Variable from forward(), or warnings that backward() has nothing to compute gradients for.

“Don’t hold onto tensors and variables you don’t need.”

If this is the problem, I'm not sure what I'm supposed to delete. var1 is what gets returned at the end of forward(), so I assume it gets cleaned up elsewhere? Deleting var2 after it's used makes no difference to the memory growth.

Got it! Somewhere up the chain, var2 is computed in a different module where its inputs are stored as member variables, i.e. self.inputToVar2. I'm guessing that is keeping them from being freed at the end of each iteration. Switching all of those inputs, which do not need to be Variables, to plain tensors seems to have solved it!
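
For anyone who hits the same thing, the shape of the fix was roughly this (module name and shapes are invented; self.inputToVar2 stands in for the real members):

import torch
import torch.nn as nn
from torch.autograd import Variable

class Var2Producer(nn.Module):
    def __init__(self):
        super(Var2Producer, self).__init__()
        # Before: the constant inputs were stored on self as Variables,
        #   self.inputToVar2 = Variable(torch.randn(1, 8, 16, 16))
        # which (my guess) kept references into the graph between iterations.
        # After: store a plain tensor and wrap it only where it is used.
        self.inputToVar2 = torch.randn(1, 8, 16, 16)

    def forward(self, x):
        return x + Variable(self.inputToVar2)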

Thanks for the link, that was very helpful!