GPU memory fills up too fast

I have a problem: I need to store many outputs of a network in a buffer (for reinforcement learning).

for i in range(1000):
    compute the mean loss of the elements in the buffer

x has shape [B, C, H, W] = [64, 1, 30, 30]
network(x) has shape [64, 1]

So the buffer contains only very small tensors.

However, the GPU memory fills up extremely fast.

I suspect that by adding the output to the buffer I am also adding its entire computation graph.

So I add the graph hundreds of times and it makes my memory explode.

Do you have any solutions?

Thank you

Yes, if you are not detaching the tensors, you are storing their complete computation graphs in the list.

It depends on your use case. If you need to store the computation graphs to call backward later, then you could reduce the number of iterations. Alternatively, if you don’t need to compute the gradients, you could store the tensors after calling detach() on them.
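To illustrate the difference between the two options, here is a minimal sketch that runs on the CPU. The `network` below is a hypothetical stand-in that only matches the shapes from the question, and the buffer names are made up for the example.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the network in the question: [64, 1, 30, 30] -> [64, 1].
network = nn.Sequential(nn.Flatten(), nn.Linear(30 * 30, 1))
x = torch.randn(64, 1, 30, 30)

attached_buffer = []  # every entry keeps its computation graph alive
detached_buffer = []  # every entry is a plain [64, 1] tensor without a graph

for _ in range(5):
    out = network(x)
    attached_buffer.append(out)           # out.grad_fn references the graph
    detached_buffer.append(out.detach())  # same data, but no grad_fn

# Only the attached outputs can be backpropagated through later.
loss = torch.stack(attached_buffer).mean()
loss.backward()

print(attached_buffer[0].grad_fn is not None)  # graph is still attached
print(detached_buffer[0].requires_grad)        # detached tensors do not track gradients
```

If only the detached tensors were stored, each graph could be freed as soon as `out` goes out of scope, so memory usage would stay roughly constant across iterations.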

Ok thanks,

I need to call backward later, so I can't detach.

I have another question: if I use a batch of 20, will the memory used by the graph be twice as large as for a batch of 10?

Thank you

More or less. You would also have to take things like memory fragmentation into consideration. In addition, if you are using cuDNN with benchmark=True, different algorithms might be picked for different batch sizes depending on their speed, so you might end up with a different memory footprint.
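One way to check this on your own setup is to measure the peak allocated memory for each batch size. The following is only a rough sketch: `peak_memory_bytes` is a made-up helper and the small conv net inside it is a hypothetical placeholder for your model. It returns None when no GPU is available.

```python
import torch
import torch.nn as nn

def peak_memory_bytes(batch_size):
    """Extra CUDA memory allocated by one forward pass that keeps its graph.

    Returns None if CUDA is not available.
    """
    if not torch.cuda.is_available():
        return None
    device = torch.device("cuda")
    # Hypothetical model: a 3x3 conv turns 30x30 inputs into 28x28 feature maps.
    net = nn.Sequential(
        nn.Conv2d(1, 8, 3), nn.ReLU(), nn.Flatten(), nn.Linear(8 * 28 * 28, 1)
    ).to(device)
    x = torch.randn(batch_size, 1, 30, 30, device=device)
    torch.cuda.synchronize(device)
    torch.cuda.reset_peak_memory_stats(device)
    baseline = torch.cuda.memory_allocated(device)  # parameters and input
    out = net(x)  # the graph stays alive as long as `out` is referenced
    torch.cuda.synchronize(device)
    return torch.cuda.max_memory_allocated(device) - baseline

for batch_size in (10, 20):
    print(batch_size, peak_memory_bytes(batch_size))
```

Note that `torch.cuda.max_memory_allocated` counts memory occupied by tensors, not the memory reserved by the caching allocator, so the number reported by `nvidia-smi` can be larger; `torch.cuda.memory_reserved` is closer to that figure.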
