PyTorch memory leak with pretrained models

I am trying to run a model that requires a pretrained VGG16 network. My issue is presented here:

I have narrowed down the problem: the continuous memory increase occurs between lines 36 and 48 of the pretrained_networks file (PerceptualSimilarity/pretrained_networks.py at master · richzhang/PerceptualSimilarity · GitHub).

This simply contains the layers from the pretrained VGG16 network. I have tried with torch.no_grad() as well, and I also tried not inheriting from nn.Module. I have tried deleting my variables and also not deleting them. The problem persists, and even on an A100 40GB I get CUDA out of memory after just some 200 images. Looking forward to a quick response 🙂

In the code snippet posted in the linked issue you are accumulating the loss and are thus also storing the computation graphs, which are potentially attached to it:

for i in range(len(ground_truth)):
    p = lpips.im2tensor(lpips.load_image(predictions[i]))
    g = lpips.im2tensor(lpips.load_image(ground_truth[i]))
    if use_gpu:
        p = p.cuda()
        g = g.cuda()
    mean_total = mean_total + loss_fn.forward(g, p).mean()
    im_counter = im_counter + 1
    del p
    del g
    torch.cuda.empty_cache()
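
Detaching the result before accumulating it should avoid keeping these graphs alive. A minimal sketch of the same loop with the accumulation changed (assuming loss_fn, lpips, predictions, ground_truth, and use_gpu are defined as in your snippet; .item() returns a plain Python float, so no graph is retained):

import torch

mean_total = 0.0
im_counter = 0
for i in range(len(ground_truth)):
    p = lpips.im2tensor(lpips.load_image(predictions[i]))
    g = lpips.im2tensor(lpips.load_image(ground_truth[i]))
    if use_gpu:
        p = p.cuda()
        g = g.cuda()
    with torch.no_grad():  # no computation graph is built for the LPIPS forward pass
        d = loss_fn(g, p).mean()
    mean_total = mean_total + d.item()  # accumulate a Python float, detached from the GPU
    im_counter = im_counter + 1

This way only a single float is kept per image, and the intermediate activations can be freed after each iteration.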

The linked line of code only uses a subset of the features of the pretrained model, so I don’t think it’s “leaking” memory. Could you describe how you narrowed down the leak to this line of code?

I printed the memory usage before and after certain lines, narrowing it down to the lines for which I showed the GitHub link. But I will try to find a workaround for the .forward function, maybe calling detach().cpu().numpy() before doing the addition.
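
For reference, the check looked roughly like this (a simplified sketch; print_mem is just an illustrative helper name, the values come from torch.cuda.memory_allocated / torch.cuda.memory_reserved):

import torch

def print_mem(tag):
    # print current GPU memory stats to spot where usage grows
    print(f"{tag}: allocated={torch.cuda.memory_allocated() / 1024**2:.1f} MB, "
          f"reserved={torch.cuda.memory_reserved() / 1024**2:.1f} MB")

print_mem("before forward")
d = loss_fn.forward(g, p).mean()
print_mem("after forward")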

Thanks a lot. Calling detach().cpu().numpy() solves the problem.