Memory leak when exporting TorchScript using torch.jit.trace

I have found an obvious memory leak when exporting TorchScript multiple times.
Here is a small piece of code that reproduces the fault: the process can occupy over 20 GB of RAM after tracing resnet50 just 40 times.

import torch
import torchvision.models as vision_models

resnet50 = vision_models.resnet50()

for i in range(1000):
    # Each trace allocates memory that is never released, even after
    # the reference to the traced module is dropped below.
    jit_resnet50 = torch.jit.trace(resnet50, torch.randn(16, 3, 448, 448))
    print('{} times of jit trace done!'.format(i + 1))
    jit_resnet50 = None
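
For reference, a small variation of the loop that logs the process RSS each iteration makes the growth easy to see. This is just a sketch; psutil is an extra dependency and not part of the original repro:

import os

import psutil
import torch
import torchvision.models as vision_models

resnet50 = vision_models.resnet50()
proc = psutil.Process(os.getpid())

for i in range(40):
    jit_resnet50 = torch.jit.trace(resnet50, torch.randn(16, 3, 448, 448))
    jit_resnet50 = None  # dropping the reference does not free the memory
    rss_gb = proc.memory_info().rss / 1024 ** 3
    print('trace {}: RSS = {:.2f} GB'.format(i + 1, rss_gb))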

PyTorch version: 1.10.0

Platform: Ubuntu 18.04, with or without CUDA

Oh oh. I think this is half-expected because of how TorchScript’s caching works. It isn’t a good thing, though, and at least there should be a way to reset things.
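
Until that exists, one stopgap is to run each trace in a short-lived subprocess, so whatever the trace caches is reclaimed by the OS when the worker exits. A minimal sketch (the helper name and file path are mine, and I haven't verified whether torch.jit.load in the parent accumulates anything of its own):

import torch
import torch.multiprocessing as mp
import torchvision.models as vision_models


def trace_to_file(path):
    # All tracing happens inside the worker, so any state the trace
    # caches dies with the subprocess.
    model = vision_models.resnet50()
    traced = torch.jit.trace(model, torch.randn(16, 3, 448, 448))
    traced.save(path)


if __name__ == '__main__':
    ctx = mp.get_context('spawn')
    p = ctx.Process(target=trace_to_file, args=('resnet50_traced.pt',))
    p.start()
    p.join()
    jit_resnet50 = torch.jit.load('resnet50_traced.pt')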

Best regards

Thomas


Hope this can be fixed. It causes CI issues in the transformers repository as we add more and more models.

We also encountered this issue. It makes in-process neural architecture search or model generation very hard, since repeated tracing eventually causes an OOM: Running JIT trace for many times leads to OOM · Issue #86537 · pytorch/pytorch · GitHub

Hope this can be fixed, or that some mechanism to reset the caches becomes available. :pray: