Hi! I am currently facing a problem where I do something like this:
```python
import torch

for batch_of_inputs in dataset:
    batch_of_inputs.requires_grad_()
    batch_of_outputs = model(preprocess(batch_of_inputs))
    # compute the gradient of the outputs with respect to the inputs
    gradients = torch.autograd.grad(
        outputs=batch_of_outputs,
        inputs=batch_of_inputs,
        grad_outputs=torch.ones_like(batch_of_outputs),
        retain_graph=True,
    )
    save_to_disk(gradients)
```
The problem is that GPU memory usage keeps growing as the batches are processed, until no memory is left and the process crashes.
As far as I understand, the gradients (and the graphs they reference) are being kept around, and the memory is not freed after each iteration. How should I delete the gradients so that they don't use GPU memory once they are saved to disk?
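Here is roughly what I have in mind (only a sketch, I am not sure whether moving the tensors to the CPU and using `del` is actually the right fix, or whether `retain_graph=True` was causing the leak in the first place):

```python
import torch

for batch_of_inputs in dataset:
    batch_of_inputs.requires_grad_()
    batch_of_outputs = model(preprocess(batch_of_inputs))
    gradients = torch.autograd.grad(
        outputs=batch_of_outputs,
        inputs=batch_of_inputs,
        grad_outputs=torch.ones_like(batch_of_outputs),
        # no retain_graph=True: grad() is only called once per batch,
        # so maybe the graph can be freed right away?
    )
    # move the gradients off the GPU before saving, then drop every
    # reference so the tensors and the graph can be garbage-collected
    gradients_cpu = [g.detach().cpu() for g in gradients]
    save_to_disk(gradients_cpu)
    del gradients, batch_of_outputs

    # possibly also needed, to return cached blocks to the allocator?
    # torch.cuda.empty_cache()
```

Is this the correct approach, or is something else keeping the memory alive?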
Thanks in advance