I have trained a model and now run inference with it. After the first inference the model occupies a large amount of GPU memory, and it keeps holding that memory even when I no longer feed inputs to it. I tried torch.cuda.empty_cache(), but it still cannot shrink memory usage back to the level before the first inference. How can I unload the model while the process keeps running with no input, and reload it once new inputs are available?
The memory still cannot be reclaimed completely.
My test code is like this:
import torch
import torchvision
net = torchvision.models.resnet101()
net.cuda()
net.eval()
_ = input()
inten = torch.randn((32, 3, 224, 224)).cuda()
for i in range(10):
    out = net(inten)
    print(out.shape)
_ = input()
net.cpu()
torch.cuda.empty_cache()
_ = input()
It takes around 2 GB before the first input, then memory jumps to 11 GB before the second input. After executing net.cpu() and torch.cuda.empty_cache(), around 6 GB are still in use. How can I get back to the 2 GB state?
The torch.cuda.empty_cache() call is not placed properly. At the point where you call it, net has been moved off the GPU, but out and inten have not, so the allocator still holds their blocks. You need to call it after those tensors have been released, e.g. after the enclosing function returns, since it is when exiting the function that they go out of scope and are cleaned up properly.
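A minimal sketch of the fix, assuming a CUDA-capable machine: delete (or let go out of scope) every CUDA tensor, including the input and output, before calling torch.cuda.empty_cache(). The no_grad context is an extra precaution so no autograd graph keeps activations alive.

```python
import torch

if torch.cuda.is_available():
    inten = torch.randn((32, 3, 224, 224)).cuda()
    with torch.no_grad():  # do not build an autograd graph during inference
        out = inten * 2
    # Release the tensors *before* emptying the cache; otherwise the
    # caching allocator still owns their blocks and empty_cache()
    # cannot return them to the driver.
    del inten, out
    torch.cuda.empty_cache()
```

The same idea applies to the model itself: after net.cpu(), make sure no CUDA tensor (input, output, or intermediate) is still referenced, and only then call empty_cache().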