How to delete gradients once used

Hi! I am currently facing a problem where I do something like this:

for batch_of_inputs in dataset:
   batch_of_inputs.requires_grad_()
   batch_of_outputs = model(preprocess(batch_of_inputs))
   # compute the gradient of the output with respect to the inputs

The thing is that GPU memory keeps increasing as the batches are processed, until none is left and the process crashes.
As far as I understand, the problem is that all the gradients are being tracked and stored, and the memory is not freed after each iteration. How should I delete the gradients so that they don't use GPU memory once they are saved to disk?
Thanks in advance

The problem is likely retain_graph=True
My rule of thumb is to never use retain_graph unless I can explain, in my own words, why I need to keep the old autograd graph.
Maybe if you can describe that, we could figure out a solution.
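For reference, here is a minimal sketch of a loop that computes input gradients without retaining the graph between iterations. The model, preprocess, and dataset here are toy stand-ins for the poster's actual objects; torch.autograd.grad frees the graph by default and returns a detached tensor, so nothing accumulates across batches.

```python
import torch

# Hypothetical stand-ins for the poster's model, preprocess, and dataset.
model = torch.nn.Linear(4, 1)
preprocess = lambda x: x * 2.0
dataset = [torch.randn(8, 4) for _ in range(3)]

for batch_of_inputs in dataset:
    batch_of_inputs.requires_grad_()
    batch_of_outputs = model(preprocess(batch_of_inputs))
    # torch.autograd.grad defaults to retain_graph=False, so the graph
    # is freed here and GPU memory does not grow across iterations.
    (input_grads,) = torch.autograd.grad(
        batch_of_outputs.sum(), batch_of_inputs
    )
    # input_grads is detached from the graph; save it to disk here,
    # then let the reference go out of scope.
```

The key point is that neither retain_graph=True nor a call to backward() that keeps references to the outputs is needed when you only want the input gradients.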

Best regards


Yes, you were right. I shouldn't have been using retain_graph=True; that was causing the memory growth.