How to re-use the same GPU memory for each forward pass?

In torch, we can declare an empty torch.Tensor() in a module's init function, then in the forward function just resize this tensor. I think this way we can save some time on allocating new memory.

In pytorch, do we even need to do this? If so, should it be a plain tensor or an autograd.Variable?



I don't think it can save much time or memory; I usually create a new variable at every iteration.

Oh, forgot to mention: my buffer tensor stays (almost) the same for every forward pass, so this way I can save some time on assigning its values.

AFAIK, most of the time is spent copying data from CPU to GPU; allocating memory doesn't cost much.

Yeah, I know, but in my case my tensor t is a helper tensor that is independent of the input. So if I create a new one at each forward pass, I also need to run assign_values(t) each time, which takes some time.

On the other hand, if I can re-use the GPU memory, I only need to call assign_values(t) once.

Anyway, I figured out how to do this in a way similar to torch: just create a buffer tensor t in __init__(), then in the forward pass wrap it with Variable(t).
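A minimal sketch of that pattern (the module name, tensor size, and fill value are just placeholders). Note that in current PyTorch, Tensor and Variable are merged, so the Variable(t) wrapper is no longer needed and the buffer can be used directly:

```python
import torch
import torch.nn as nn

class HelperModule(nn.Module):
    def __init__(self, size):
        super().__init__()
        # Allocate and fill the helper tensor once, at construction time.
        # In old-style PyTorch you would wrap it in Variable() inside forward;
        # today the tensor can be used as-is.
        self.t = torch.full((size,), 2.0)

    def forward(self, x):
        # Re-uses self.t on every call: no new allocation, no re-filling.
        return x * self.t

m = HelperModule(4)
out = m(torch.ones(4))
```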

You’re right, register_buffer works.

Actually, in my case register_buffer is not strictly needed, since my buffer tensor is not persistent.
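For reference, newer PyTorch versions (1.6+) let register_buffer handle exactly this case: with persistent=False the buffer still moves with the module on .to()/.cuda() calls, but is excluded from the state_dict. A small sketch (the class name and buffer contents are hypothetical):

```python
import torch
import torch.nn as nn

class BufModule(nn.Module):
    def __init__(self):
        super().__init__()
        # The buffer follows the module across devices, but with
        # persistent=False it does not appear in state_dict().
        self.register_buffer("t", torch.arange(4.0), persistent=False)

    def forward(self, x):
        return x + self.t

m = BufModule()
out = m(torch.zeros(4))
```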

Anyway, thanks for your time :)

PyTorch comes with a built-in caching allocator for CUDA, so GPU memory allocations are automatically re-used internally without any extra work on your part.
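One way to observe this from Python (requires a CUDA device; the tensor size is arbitrary): after a tensor is deleted, torch.cuda.memory_reserved() typically stays the same, because the freed block is kept in the allocator's cache rather than returned to the driver, and the next allocation of the same size re-uses it.

```python
import torch

if torch.cuda.is_available():
    a = torch.empty(1024, 1024, device="cuda")
    reserved = torch.cuda.memory_reserved()

    # Freeing the tensor returns its block to PyTorch's caching
    # allocator, not to the CUDA driver (no cudaFree is issued).
    del a

    # An allocation of the same size is served from the cache,
    # so the amount of memory reserved from the driver is unchanged.
    b = torch.empty(1024, 1024, device="cuda")
    assert torch.cuda.memory_reserved() == reserved
```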


Great! Good to know, thanks

Can you explain in more detail (at the C++ level) how this automatic re-use works?