How does PyTorch release variable garbage?

Hello. I need to unfold some feature maps of my network during training, which consumes a lot of CUDA memory. The program crashes with “out of cuda memory” after a few training iterations. The variables I allocate should be local to the 'for' statement, so I don’t understand why memory runs out after several successful iterations. I would expect memory consumption to be the same in every iteration. Can anyone help me out? Thanks!

Two methods which I frequently use for debugging:

By @smth 
import gc
import os

import psutil
import torch

def memReport():
    # Print the type and shape of every tensor the garbage collector tracks.
    for obj in gc.get_objects():
        if torch.is_tensor(obj):
            print(type(obj), obj.size())

def cpuStats():
    print(psutil.virtual_memory())  # system-wide physical memory usage
    pid = os.getpid()
    py = psutil.Process(pid)
    memoryUse = py.memory_info()[0] / 2. ** 30  # resident set size (RSS) in GiB
    print('memory GB:', memoryUse)


Edited by @smth for PyTorch 0.4 and above, which doesn’t need the .data check.


Thanks! Does Python’s gc collect garbage as soon as a variable has no references, or with some delay?

@chenchr it is freed immediately (by reference counting), unless you have reference cycles, which have to wait for the cyclic garbage collector.
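To make the reference-cycle caveat concrete, here is a minimal sketch in plain Python. It assumes CPython, and the `Node` class is hypothetical: it only stands in for any object that might hold a tensor. Without a cycle the object is reclaimed as soon as its last reference disappears; with a cycle it stays alive until the cyclic collector runs.

```python
import gc
import weakref


class Node:
    """Hypothetical stand-in for an object that owns a large tensor."""

    def __init__(self):
        self.other = None


def freed_without_cycle():
    obj = Node()
    probe = weakref.ref(obj)  # a weak reference does not keep obj alive
    del obj                   # refcount drops to zero, object is freed here
    return probe() is None


def freed_with_cycle():
    was_enabled = gc.isenabled()
    gc.disable()  # keep the automatic collector out of the demo
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a   # a and b now reference each other
        probe = weakref.ref(a)
        del a, b                  # the cycle keeps both refcounts above zero
        before = probe() is None  # checked before any collection
        gc.collect()              # the cyclic collector reclaims the pair
        after = probe() is None
    finally:
        if was_enabled:
            gc.enable()
    return before, after


print(freed_without_cycle())
print(freed_with_cycle())
```

If something in a training loop ends up in a cycle like this, calling `gc.collect()` is one way to check whether that is what keeps the memory alive.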

Thanks! Do you mean that:

def func():
  a = Variable(torch.randn(2,2))
  a = Variable(torch.randn(100,100))

the memory allocated in a = Variable(torch.randn(2,2)) will be freed as soon as the code a = Variable(torch.randn(100,100)) is executed?

Yes, correct.

But, don’t forget that once you call a = Variable(torch.rand(2, 2)), a holds the data.
When you call a = Variable(torch.rand(100, 100)) afterwards, first Variable(torch.rand(100, 100)) is allocated (so the first tensor is still in memory), then it is assigned to a, and then Variable(torch.rand(2, 2)) is freed.


Does that mean there has to be enough memory for two variables while the second one is being created?

That means that if you have something like

a = torch.rand(1024, 1024, 1024)  # 4GB
# the following line allocates 4GB extra before the assignment,
# so you need to have 8GB in order for it to work
a = torch.rand(1024, 1024, 1024)
# now you only use 4GB
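The ordering can be checked without a GPU. The sketch below again assumes CPython's reference counting, and `Buf` is a hypothetical stand-in for a large tensor that simply counts how many instances are alive at once. Rebinding the name directly peaks at two live objects, while deleting the old one first peaks at one.

```python
class Buf:
    """Hypothetical stand-in for a large tensor; counts live instances."""

    live = 0
    peak = 0

    def __init__(self):
        Buf.live += 1
        Buf.peak = max(Buf.peak, Buf.live)

    def __del__(self):
        Buf.live -= 1


def peak_when_rebinding():
    Buf.live = Buf.peak = 0
    a = Buf()
    a = Buf()  # the new Buf is built while the old one is still bound to a
    return Buf.peak


def peak_when_deleting_first():
    Buf.live = Buf.peak = 0
    a = Buf()
    del a      # release the old object before allocating the new one
    a = Buf()
    return Buf.peak


print(peak_when_rebinding())       # 2
print(peak_when_deleting_first())  # 1
```

The same ordering applies to tensors, so writing `del a` before the second allocation should keep the peak at one tensor's worth of memory instead of two.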