In my case, before the iteration I create several tensors that are accumulated over the whole training stage. I noticed that the memory keeps increasing, even though I use `.cpu()`. The code structure is like the following:
```python
tensor_for_accumulate = []
for it in range(10):
    ...
    tensor_current = ...
    tensor_for_accumulate.append(tensor_current)
```
The length and dtype of the tensors stay the same during training, so I can't understand why the memory keeps increasing.
If `tensor_current` is attached to the computation graph (i.e. if its `.grad_fn` attribute returns a valid function), then the whole computation graph will be stored with each `tensor_current` in `tensor_for_accumulate`. Pushing `tensor_current` to the CPU doesn't change anything if the operations were performed on the GPU beforehand, since all intermediate tensors will still be on the GPU.

I'm not familiar with your use case and don't know if you want to store the computation graph. If not, `detach()` the tensor before accumulating it.
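A minimal sketch of the fix, assuming the accumulated values come from some differentiable computation (the parameter and the `* 2` operation below are placeholders for your actual model):

```python
import torch

# Hypothetical parameter standing in for your model's weights.
model_param = torch.randn(3, requires_grad=True)

tensor_for_accumulate = []
for it in range(10):
    # tensor_current is attached to the graph: its .grad_fn is set.
    tensor_current = (model_param * 2).sum()
    # detach() drops the graph reference; .cpu() alone would keep the
    # whole graph (and its GPU intermediates) alive.
    tensor_for_accumulate.append(tensor_current.detach().cpu())

# None of the stored tensors carries a graph anymore.
assert all(t.grad_fn is None for t in tensor_for_accumulate)
```

With `detach()`, only the values are kept, so memory stays flat across iterations instead of growing with one retained graph per stored tensor.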
Thanks a lot! `detach()` works for me, and I checked: the reason is exactly what you mentioned above.