How to more aggressively release GPU memory?

Hi, I am using the latest PyTorch master branch, and I am iteratively sending images to a neural net and saving the output predictions to a video file.

Due to the iterative nature, after a few runs my puny GPU runs out of memory, so I tried adding this to each iteration to free caches:

torch.cuda.empty_cache()

But that had no effect. The only thing that works is recreating the model in every loop iteration, which is very slow:

while True:
    model = ENet(num_classes).to(device)
    optimizer = optim.Adam(model.parameters())
    model = utils.load_checkpoint(model, optimizer, model_path, model_name)[0]
    model.eval()

    # unrelated code then pulls camera images as my "input"
    with torch.no_grad():
        predictions = model(input)

    # Predictions is one-hot encoded with "num_classes" channels.
    # Convert it to a single int using the indices where the maximum (1) occurs
    _, predictions = torch.max(predictions.data, 1)

    label_to_rgb = transforms.Compose([
        ext_transforms.LongTensorToRGBPIL(class_encoding),
        transforms.ToTensor()
    ])
    color_predictions = utils.batch_transform(predictions.cpu(), label_to_rgb)

    fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(15, 7))
    ax1.imshow(np.transpose(torchvision.utils.make_grid(input.data.cpu()).numpy(), (1, 2, 0)))
    ax2.imshow(np.transpose(torchvision.utils.make_grid(color_predictions).numpy(), (1, 2, 0)))

So how can I aggressively drop all consumed memory without having to recreate the model from scratch? Could this be indicative of a memory leak?

Try running gc.collect() before torch.cuda.empty_cache(); that should free up more memory.

Thank you, but unfortunately I still run out of GPU memory after 10 or 20 iterations.

Try adding del predictions (or deleting any other variable you no longer need that is stored on the GPU) before calling torch.cuda.empty_cache().
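
In other words, drop the Python references first, then collect, then let the caching allocator release its blocks. A rough sketch of what I mean, with the model built once outside the loop (ENet and get_camera_frame here are just stand-ins for your own setup):

    import gc

    import torch

    # build the model once, outside the loop
    model = ENet(num_classes).to(device)       # stand-in for your model/checkpoint setup
    model.eval()

    while True:
        input = get_camera_frame()             # stand-in for however you pull camera frames
        with torch.no_grad():
            predictions = model(input)
        _, predictions = torch.max(predictions, 1)

        # ... colorize, plot, write to the video file ...

        # release references first, then collect, then free the cached blocks
        del predictions, input
        gc.collect()
        torch.cuda.empty_cache()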

Hi, and thank you, but that doesn't solve it. Why does recreating the model fix the issue? Could there be some state or cache the model is holding onto?

TBH, I would expect loading the model iteratively to be worse, since you are creating multiple models, and that should crash your GPU…
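
If you want to check whether allocations are really accumulating, you could also log the allocator statistics every iteration and watch which number climbs; a minimal sketch:

    import torch

    def log_cuda_memory(tag=""):
        # tensors currently referenced vs. blocks held by the caching allocator
        allocated = torch.cuda.memory_allocated() / 1024 ** 2
        reserved = torch.cuda.memory_reserved() / 1024 ** 2   # memory_cached() on older releases
        print(f"{tag}: allocated={allocated:.1f} MiB, reserved={reserved:.1f} MiB")

If allocated keeps growing, something is still holding tensor references across iterations; if only reserved grows, that is just the caching allocator, which empty_cache() can release.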

Thanks, I’ll try to file a bug.

I’m having similar issues with attribution: I can’t uncomment the calls below without running out of memory, even with gc:

    # following the tutorial on attribution for semantic segmentation
    def save_attribution(label, target):
        lc_attr = layer_cond.attribute(input, target=target, n_steps=5, internal_batch_size=1)
        fig, ax = viz.visualize_image_attr_multiple(
            lc_attr[0].cpu().permute(1, 2, 0).detach().numpy(),
            original_image=orig,
            signs=["positive", "negative"],
            methods=["blended_heat_map", "blended_heat_map"],
            use_pyplot=False
        )
        # with use_pyplot=False the figure is not registered with pyplot,
        # so save it through the Figure object instead of plt.savefig
        fig.savefig(f'plot_{label}.png', bbox_inches='tight')
        del lc_attr, fig, ax
        gc.collect()
        torch.cuda.empty_cache()


    save_attribution('smooth', 1)
    # save_attribution('grass', 2)
    # save_attribution('rough', 3)

Sure, try creating a minimal implementation: no logging, no printing, etc. Also, could you share your original implementation?

I was able to resolve the attribution OOM exception by recreating the model before each call to attribute.
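
Roughly this shape, in case it helps anyone else (just a sketch, not my exact code: ENet, utils.load_checkpoint, and the checkpoint names come from earlier in the thread, some_layer is a placeholder for whichever layer you attribute to, and the example builds layer_cond as a Captum LayerConductance):

    import gc

    import torch
    from captum.attr import LayerConductance

    def attribute_with_fresh_model(input, target, n_steps=5):
        # rebuild the model and the attribution object for every call
        model = ENet(num_classes).to(device)
        model = utils.load_checkpoint(model, optim.Adam(model.parameters()),
                                      model_path, model_name)[0]
        model.eval()
        layer_cond = LayerConductance(model, model.some_layer)   # some_layer is a placeholder

        attr = layer_cond.attribute(input, target=target,
                                    n_steps=n_steps, internal_batch_size=1)
        attr = attr.detach().cpu()

        # drop everything holding GPU memory before emptying the cache
        del layer_cond, model
        gc.collect()
        torch.cuda.empty_cache()
        return attr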

So this definitely feels like a leak to me; I’ll try to summarize it in an issue soon.