So, I’m trying to make sure that the computation graph is deleted after processing each batch, but none of the stuff I’ve tried seems to work:
for inputs, _ in loader:
    outputs = model(inputs)
    # Do stuff with outputs that requires grad
    x = some_func(outputs).detach()
    # Somehow delete computation graph
    model.zero_grad()  # this doesn't seem to fix the memory problem
Do I have to call backward() even though I don't actually need the gradients?
Deleting all the Tensors that reference the graph is enough to free it.
In your case, the del outputs should do the trick. How do you know the computational graph is kept around?
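To make this concrete, here is a minimal sketch of the pattern (the model, loader contents, and `some_func` are stand-ins I made up for illustration): once the only Tensors you keep are detached, deleting `outputs` drops the last reference to the graph and autograd can free it.

```python
import torch

# Hypothetical stand-ins for the model and per-batch computation
model = torch.nn.Linear(4, 2)
inputs = torch.randn(3, 4)

outputs = model(inputs)            # forward pass builds the graph
x = outputs.sum().detach()         # detached result: holds no graph reference
del outputs                        # last reference to the graph is gone; it is freed
```

After the `del`, only `x` survives, and since it was detached (`x.requires_grad` is `False`), nothing keeps the graph alive.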
Note that zeroing the gradients does not remove the .grad fields, it just zeros them, so the .grad attributes will still consume some memory. You can pass set_to_none=True to zero_grad() to actually free these Tensors.
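A small sketch of the difference, using a toy Linear model: after a backward pass the `.grad` Tensors exist, and `zero_grad(set_to_none=True)` releases them instead of just filling them with zeros.

```python
import torch

model = torch.nn.Linear(4, 2)
loss = model(torch.randn(3, 4)).sum()
loss.backward()                      # populates model.weight.grad etc.
assert model.weight.grad is not None # gradients exist and consume memory

model.zero_grad(set_to_none=True)    # frees the .grad Tensors entirely
assert model.weight.grad is None     # no gradient storage left
```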
Is deleting all the Tensors that reference the graph enough to free it from my GPU too, or does your suggestion only free the CPU's memory?
I am seeing different errors. The most common, with a "normal" batch size (e.g. 16 or more), is this:
RuntimeError: CUDA out of memory. Tried to allocate 2.00 MiB (GPU 3; 11.93 GiB total capacity; 11.34 GiB already allocated; 64.00 KiB free; 11.36 GiB reserved in total by PyTorch)
I believe this is caused by my giant trees.
At other times there is no error at all; the process simply freezes while the model is computing the forward pass.
It frees all memory.
Note that there is a caching allocator on the CUDA side, so you need to use
torch.cuda.memory_allocated() to know how much memory is actually used by Tensors (nvidia-smi will report the larger amount the allocator has reserved).
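For example (guarded so it only runs on a machine with a GPU): after deleting a Tensor, torch.cuda.memory_allocated() drops, while torch.cuda.memory_reserved() may stay the same because the caching allocator keeps the block for reuse.

```python
import torch

if torch.cuda.is_available():
    t = torch.randn(1024, 1024, device="cuda")
    used_before = torch.cuda.memory_allocated()   # bytes held by live Tensors
    reserved = torch.cuda.memory_reserved()       # bytes held by the caching allocator

    del t
    # The Tensor's memory is returned to the cache, not to the GPU driver:
    assert torch.cuda.memory_allocated() < used_before
```

This is why tools like nvidia-smi can show high usage even when few Tensors are alive.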