Questions and Help
The demo code is listed below. Why, after calling torch.cuda.empty_cache(), is only part of the memory reserved during the forward pass of Net released? In the second epoch, the reserved memory grows even though the operations are identical to those in the first epoch. How does reserved memory work, and why is the reserved memory so much larger than the allocated memory?
import torch
import torch.nn as nn
class Net(nn.Module):
    """Tiny conv + two max-pool stack used to demonstrate CUDA memory caching.

    For an input of shape (B, 3, 224, 224) the output is (B, 28, 56, 56):
    the 3x3 unpadded conv shrinks 224 -> 222, and each stride-2 pool with
    padding 1 halves the spatial size (222 -> 111 -> 56).
    """

    def __init__(self):
        super().__init__()
        # 3 input channels, 28 output channels, 3x3 kernel, no padding.
        self.conv = nn.Conv2d(3, 28, (3, 3))
        self.maxpool1 = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.maxpool2 = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)

    def forward(self, x):
        """Apply the conv followed by both max-pools and return the result."""
        x = self.conv(x)
        x = self.maxpool1(x)
        x = self.maxpool2(x)
        return x
def train(batch_size=512, epochs=5, device='cuda'):
    """Run forward-only passes of Net to observe CUDA caching-allocator behavior.

    Despite the name, no optimization happens: the passes run under
    ``torch.no_grad()`` so no autograd graph (and no gradient activations)
    is kept alive between iterations.

    Args:
        batch_size: frames per batch (default 512, as in the original demo).
        epochs: number of forward passes to run.
        device: torch device string; 'cuda' reproduces the original behavior.

    Returns:
        The output tensor of the final forward pass.
    """
    net = Net().to(device)
    pred = None
    for _ in range(epochs):
        with torch.no_grad():
            frames_batches = torch.randn(batch_size, 3, 224, 224).to(device)
            # before forward 1st | memory_reserved: 296MB | memory allocated: 294MB
            pred = net(frames_batches)
            # after forward 1st epoch | memory_reserved: 5014MB | memory allocated: 465MB
            # after forward 2nd epoch | memory_reserved: 5688MB | memory allocated: 465MB
        # empty_cache returns *cached, unused* blocks to the driver; it cannot
        # release blocks still backing live tensors (net's parameters,
        # frames_batches from the current iteration, and pred). It is a no-op
        # when CUDA has not been initialized, so this is safe on CPU too.
        torch.cuda.empty_cache()
        # after empty_cache 1st epoch | memory_reserved: 1644MB | memory allocated: 465MB
    return pred
# Script entry point: run the memory-usage demo (expects a CUDA device).
if __name__ == '__main__':
    train()