GPU Memory Usage Handling with try/except

I am writing a function that attempts to find the upper bound of the possible model size. It runs a loop that, at each iteration, "tries" to append a new layer to a module list, constructs a model from that list, and then attempts a single forward pass.

In the exception handler, I delete the module list and the model, and attempt to delete the output of the forward pass. Since the forward pass failed, however, the output of course does not exist. The issue is that whatever intermediate tensors were created on the way to the failure (mid-forward-pass) persist and eat up memory. How do I free them? Is there a way to kill ALL GPU tensors at that moment?

Example Code (not real):

import gc

import torch
import torch.nn as nn

device = torch.device("cuda")                     # assumed setup
input_batch = torch.randn(64, 10, device=device)  # assumed dummy batch

modules = []
for i in range(1000):
    try:
        # note: list.append() returns None, so .to(device) must wrap the
        # layer itself, not the return value of append()
        modules.append(nn.Linear(10, 10).to(device))
        model = nn.Sequential(*modules)
        out = model(input_batch)
        del out
        del model
    except RuntimeError:  # CUDA OOM surfaces as a RuntimeError
        try:
            del out
        except NameError:
            pass
        del model
        del modules
        gc.collect()
        torch.cuda.empty_cache()
        print(torch.cuda.memory_allocated(0))  # this will show that memory is still full!
        break
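For what it's worth, here is a restructuring sketch (not a verified fix). It rests on the assumption that the live exception's traceback is what keeps the mid-forward intermediates reachable: while the handler runs, the traceback references every frame of the failed forward pass, so gc.collect() has nothing it is allowed to free. The helper try_depth and the device/input_batch setup are made up for illustration. The idea is to set only a flag inside the handler and run all cleanup and measurement after the except clause has exited:

import gc

import torch
import torch.nn as nn

device = torch.device("cuda")                     # assumed setup
input_batch = torch.randn(64, 10, device=device)  # assumed dummy batch


def try_depth(modules, batch):
    # Hypothetical helper: build and run the model inside a function, so the
    # model and any half-finished forward state live only in this frame.
    model = nn.Sequential(*modules)
    model(batch)


modules = []
oom = False
for _ in range(1000):
    modules.append(nn.Linear(10, 10).to(device))
    try:
        try_depth(modules, input_batch)
    except RuntimeError:  # CUDA OOM surfaces as a RuntimeError
        # Do as little as possible here: while this handler runs, the live
        # exception's traceback still pins try_depth's frame (and with it
        # the intermediate tensors), so nothing can be freed yet.
        oom = True
    if oom:
        break

# Outside the except clause the exception and its traceback are released,
# so the pinned frame and its tensors become collectable.
failing_depth = len(modules)
modules.clear()
gc.collect()
torch.cuda.empty_cache()
print(torch.cuda.memory_allocated(0))  # near zero now
if oom:
    print(f"first OOM at depth {failing_depth}")

The design choice that matters is that nothing memory-related happens inside the except block; Python releases the handled exception (and the frames its traceback pins) only once the handler exits.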

Just found this: Frequently Asked Questions — PyTorch 1.7.0 documentation

My code worked this whole time. I spent 7-8 hours trying to debug it, and now I find this :sob:
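For anyone who lands here later, my understanding of what the FAQ is pointing at: printing torch.cuda.memory_allocated() inside the except block is what made the memory look stuck, because the still-active exception pins the frames of the failed call. A toy repro sketch (allocate_too_much is a made-up helper, and I am assuming CUDA OOM surfaces as a RuntimeError):

import gc

import torch

device = torch.device("cuda")


def allocate_too_much():
    chunks = []
    while True:
        # Each 1 GiB tensor (256 * 1024 * 1024 float32 elements) stays
        # referenced by this frame's local list until the frame dies.
        chunks.append(torch.empty(256, 1024, 1024, device=device))


try:
    allocate_too_much()
except RuntimeError:
    gc.collect()
    torch.cuda.empty_cache()
    # Still (almost) full: the active exception's traceback keeps
    # allocate_too_much's frame (and hence the local chunks list) alive.
    print("inside except:", torch.cuda.memory_allocated(device))

gc.collect()
torch.cuda.empty_cache()
# The handler has exited, the traceback is gone, chunks is unreachable:
# the allocator now reports (near) zero.
print("after except:", torch.cuda.memory_allocated(device))

Which, if I read the FAQ right, is exactly the pattern it recommends: recover from OOM outside the handler, not inside it.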