Hi,
I have a strange memory leak in my program; maybe you can help me out. I am using PyTorch 0.1.12.
I have created a custom function; let's call it FancyFunction. During training, there is a flag that tells me whether or not to apply the fancy function. FancyFunction looks like this:
import torch

class FancyFunction(torch.autograd.Function):
    def __init__(self):
        self.param1 = 'fancy'

    def forward(self, tensor):
        # stash the input so backward can use it
        self.saved_for_backward = {}
        self.saved_for_backward['input'] = tensor
        return tensor * 2 + 1

    def backward(self, grad_output):
        tensor = self.saved_for_backward['input']
        grad_output = 2 * grad_output
        del self.saved_for_backward  # free memory
        return grad_output
(The real function is more complicated, of course.) Note that I delete the saved_for_backward attribute so that I don't carry any extra tensors around after backward.
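In other words (a toy example; a CPU tensor is used just for illustration):

fancy = FancyFunction()
out = fancy.forward(torch.ones(5))           # saves the input internally
grad = fancy.backward(torch.ones(5))         # uses the saved input, then deletes it
print(hasattr(fancy, 'saved_for_backward'))  # False: no tensor is kept alive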
So, during training, I do the following:
if fancy_flag:
    fancy = FancyFunction()
    p.data = fancy.forward(p.data)
    # do things
    p.grad.data = fancy.backward(p.grad.data)
    # note that I have to call .forward() and .backward() manually
    # for an unrelated reason; this is not the issue
If fancy_flag is False, I don't see any memory issues. If fancy_flag is True, CUDA memory consumption keeps growing until it eventually raises a CUDA out-of-memory error.
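For context, a stripped-down version of my training loop looks roughly like this (model, loader, criterion, optimizer, and num_epochs are placeholders for my actual setup, and the parameter handling is simplified to a plain loop):

import torch
from torch.autograd import Variable

fancy = FancyFunction()

for epoch in range(num_epochs):
    for inputs, targets in loader:
        inputs = Variable(inputs.cuda())
        targets = Variable(targets.cuda())

        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()

        if fancy_flag:
            for p in model.parameters():
                p.data = fancy.forward(p.data)
                # do things
                p.grad.data = fancy.backward(p.grad.data)

        optimizer.step()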
I tried manually invoking the Python garbage collector: I added gc.collect() at the end of each epoch and at the end of my .backward() function. Unfortunately, the memory issue persists.
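Concretely, the gc calls look like this (train_one_epoch stands in for my actual epoch code):

import gc

def backward(self, grad_output):
    tensor = self.saved_for_backward['input']
    grad_output = 2 * grad_output
    del self.saved_for_backward  # free memory
    gc.collect()                 # collect right after dropping the reference
    return grad_output

for epoch in range(num_epochs):
    train_one_epoch()
    gc.collect()  # collect at the end of every epoch as well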
Do you have any ideas?