CUDA memory leaks

Hi,

I have a strange memory leak in my program; maybe you can help me out. I am using PyTorch 0.1.12.

I have created a custom function, let’s call it FancyFunction. During training, there is a flag that tells me whether I should apply the fancy function or not. FancyFunction looks like this:

import torch


class FancyFunction(torch.autograd.Function):

    def __init__(self):
        super(FancyFunction, self).__init__()
        self.param1 = 'fancy'

    def forward(self, tensor):
        # stash the input so that backward can use it later
        self.saved_for_backward = {}
        self.saved_for_backward['input'] = tensor
        return tensor * 2 + 1

    def backward(self, grad_output):
        tensor = self.saved_for_backward['input']
        grad_output = 2 * grad_output
        del self.saved_for_backward  # free memory: drop the saved input
        return grad_output

(The real function is more complicated, of course.) Note that I delete the saved_for_backward attribute so that I don’t carry any tensors around after backward.

So, during training, I call

if fancy_flag:
    fancy = FancyFunction()
    p.data = fancy.forward(p.data)
    # do things
    p.grad.data = fancy.backward(p.grad.data)
    # note that I have to call .forward() and .backward() manually
    # because of an unrelated problem; this is not the issue
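(For reference, if I were not forced to call them by hand, the usual way to apply an old-style Function in 0.1.12 would be roughly the sketch below; x here is just a toy Variable I made up for illustration.)

import torch
from torch.autograd import Variable

x = Variable(torch.randn(5).cuda(), requires_grad=True)
y = FancyFunction()(x)   # instantiate the Function, then call it on a Variable
y.sum().backward()       # autograd invokes FancyFunction.backward for us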

If fancy_flag is False, I don’t see any memory issues. If fancy_flag is True, then CUDA memory consumption keeps growing and eventually raises a CUDA out-of-memory error.

I tried calling the Python garbage collector manually: I added gc.collect() at the end of each epoch and at the end of my .backward() function, as sketched below. Unfortunately, the memory issue is still there.
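Roughly, the epoch-level call looks like this (train_one_epoch and num_epochs are just placeholders for my actual training loop):

import gc

for epoch in range(num_epochs):
    train_one_epoch()   # placeholder for the real training code
    gc.collect()        # try to release any unreferenced Python objects / tensors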

Do you have any ideas?

I think I found the problem.

Inside backward, I create a big sparse matrix and then multiply it by a vector, so I did something like:

big_sparse = big_sparse.cuda()  # move the sparse matrix to the GPU
return torch.mm(big_sparse, dense_vector.view(-1, 1))  # sparse-dense matrix product
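A stripped-down version of that pattern (the toy COO sparse tensor below is purely illustrative; the real matrix comes from my data) would be something like:

import torch

# toy 1000x1000 sparse matrix with a few nonzero entries, just for illustration
i = torch.LongTensor([[0, 1, 2], [2, 0, 1]])
v = torch.FloatTensor([3.0, 4.0, 5.0])
big_sparse = torch.sparse.FloatTensor(i, v, torch.Size([1000, 1000]))

dense_vector = torch.randn(1000).cuda()

for step in range(10000):
    s = big_sparse.cuda()                        # fresh CUDA copy of the sparse matrix each step
    out = torch.mm(s, dense_vector.view(-1, 1))  # GPU memory keeps growing across these calls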

With some debugging, it appeared that torch.mm was responsible for the huge CUDA leaks, so I tried to make that operation run on the CPU instead:

# big_sparse stays on the CPU here (the .cuda() call above is skipped)
return torch.mm(big_sparse, dense_vector.view(-1, 1).cpu()).cuda()

With this change the memory leak is gone. I don’t know whether torch.mm is now leaking into RAM instead, but a quick check suggested it was not.
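(For reference, one easy way to watch host memory for that kind of drift is to print the process RSS every so often, e.g. with the standard resource module:)

import resource

rss_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
print('peak resident memory so far (KB on Linux):', rss_kb)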

I have raised an issue on the GitHub page: https://github.com/pytorch/pytorch/issues/2912 🙂