Clear all tensors saved for backward (reboot)

Hi, I am trying to release CUDA memory while handling the following exception, which is caused by an OOM error:

try:
    output = model(input)
    output.backward()
except Exception as e:
    # I want to clear all tensors saved for backward here
    pass

When there is not enough memory for training (OOM exception), some activations from the forward pass are still held by the autograd engine and consume a lot of memory. I want to clear them and restart with another model. I tried torch.cuda.empty_cache(), but it has no effect. Is there any way to release these tensors?

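del output should clear the graph, since output is the only thing keeping the graph alive, and the saved tensors stored on the graph should be released as a result. torch.cuda.empty_cache() only returns cached blocks that no tensor references any more, so it has no effect while the graph is still alive.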
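
Here is a minimal sketch of how that could look in practice. The helper name try_training_step is made up for illustration, the .sum() is only there to get a scalar to call backward() on, and detecting OOM by looking for "out of memory" in the error message is an assumption about how the error is reported. The cleanup is done after the except block rather than inside it, because the exception object keeps a reference to the stack frame where the error was raised, and that can keep tensors alive until the handler exits.

```python
import gc

import torch


def try_training_step(model, batch):
    """Run one forward/backward pass.

    Returns True on success and False if an out-of-memory error occurred.
    Hypothetical helper, not part of PyTorch.
    """
    oom = False
    output = None  # bound up front so `del output` is always valid
    try:
        output = model(batch).sum()  # reduce to a scalar for backward()
        output.backward()
    except RuntimeError as e:
        # Assumption: OOM errors carry "out of memory" in their message.
        if "out of memory" not in str(e):
            raise
        oom = True

    # Cleanup happens outside the except block, so the exception and its
    # traceback no longer hold references to intermediate tensors.
    del output  # drop the last reference to the autograd graph
    if oom:
        model.zero_grad(set_to_none=True)  # drop partially computed grads
        gc.collect()  # collect any reference cycles still holding tensors
        if torch.cuda.is_available():
            torch.cuda.empty_cache()  # return unused cached blocks to the driver
    return not oom
```

If this returns False you can then del the model itself (and its optimizer, if any) before building the next one, since the parameters are separate from the graph and stay allocated until nothing references them.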
