Hi,
is there a reliable way of tracking how much memory PyTorch effectively keeps allocated during the forward pass, including the intermediate buffers that will be needed for the backward pass?
Unfortunately, a simple check of the tensors that are still referenced (via gc.get_objects()) only gives a lower bound.
For example, consider the following code:
import torch
import torch.nn as nn
import torch.autograd as autograd
import gc
c1 = nn.Conv2d(4, 4, 3)
c2 = nn.Conv2d(4, 4, 3)
x = autograd.Variable(torch.randn(4, 4, 16, 16), requires_grad=True)
y = c2(c1(x)).mean()
gc.collect()
gc.collect()
# print every tensor that is still reachable
for v in gc.get_objects():
    if isinstance(v, torch.Tensor):
        print(v.size())
It outputs
(4L, 4L, 3L, 3L)
(4L,)
(4L, 4L, 3L, 3L)
(4L,)
(4L, 4L, 16L, 16L)
(1L,)
while I would expect an additional tensor of size (4L, 4L, 14L, 14L), namely the output of c1, which has to be kept around as the input of c2 because it is needed to compute the gradient with respect to the parameters of c2.
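For reference, here is a quick check of the shape of that intermediate, simply re-running the first convolution with the same c1 and x as above (no new API involved):

# c2's input is c1's output; a 3x3 convolution without padding
# shrinks the 16x16 spatial size to 14x14
print(c1(x).size())   # -> (4, 4, 14, 14)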
Thanks.