Why does detach() reduce the allocated memory?

I was fiddling with the outputs of a CNN and noticed something I can't explain about the detach() method. Consider the following snippet:

import torch
from torchvision.models import alexnet


def print_allocated_memory():
    # Tensor memory currently allocated on the GPU, in GB
    print("{:.2f} GB".format(torch.cuda.memory_allocated() / 1024 ** 3))


print_allocated_memory()

# A leaf tensor; requires_grad=False is the default for torch.rand anyway
x = torch.rand(1024, 3, 224, 224, requires_grad=False).cuda()
print_allocated_memory()
x = x.detach()
print_allocated_memory()

model = alexnet().cuda()
y = model(x)  # forward pass through AlexNet
print_allocated_memory()
y = y.detach()
print_allocated_memory()

This is the corresponding output:

0.00 GB
0.57 GB
0.57 GB
3.76 GB
0.81 GB

Why does detaching the model output y reduce the allocated memory, while detaching the input x changed nothing?

Tensors don't just store the data you work with directly. A tensor produced by operations involving tensors that require gradients also carries a reference (its grad_fn) to the autograd graph, which records how to backpropagate gradients to the leaf variables. That graph in turn keeps the intermediate activations of the forward pass alive, because they are needed to compute gradients.

Detaching the output y drops the last reference to that graph, so AlexNet's intermediate activations can be freed; that accounts for the roughly 3 GB that disappears. What remains (~0.81 GB) is about what you'd expect: x itself (1024 × 3 × 224 × 224 floats × 4 bytes ≈ 0.57 GB), AlexNet's ~61 M parameters (~0.23 GB), and y (1024 × 1000 floats, ~4 MB). The input x, by contrast, was created with requires_grad=False, so it never had a graph attached and x.detach() is effectively a no-op.
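You can make the graph visible directly. Here is a minimal standalone sketch (same model and input shape as your snippet) that checks grad_fn and shows that running the forward pass under torch.no_grad() avoids building the graph in the first place:

import torch
from torchvision.models import alexnet

model = alexnet().cuda()
x = torch.rand(1024, 3, 224, 224).cuda()

y = model(x)
print(y.grad_fn is not None)  # True: y holds a reference to the autograd graph
print(x.grad_fn is not None)  # False: x is a leaf tensor with no graph

# For pure inference, skip recording the graph entirely:
with torch.no_grad():
    y = model(x)
print(y.grad_fn is not None)  # False: no graph, no activations kept alive
print("{:.2f} GB".format(torch.cuda.memory_allocated() / 1024 ** 3))

For inference, wrapping the forward pass in torch.no_grad() is generally preferable to detaching afterwards, since the activation memory is never held in the first place.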
