In (Lua) Torch, we can declare an empty torch.Tensor() in a module's init function and then simply resize it in the forward function. I think this way we can save some time on allocating new memory.
In PyTorch, do we even need to do this? If so, should it be a Tensor or an autograd.Variable?
I don't think it can save much time or memory; I usually create a new variable at every iteration.
Oh, I forgot to mention: my buffer tensor remains (almost) the same for every forward pass, so this way I can save some time on assigning values.
AFAIK, most of the time is spent copying data from CPU to GPU; allocating memory doesn't cost much.
Yeah, I know, but in my case my tensor t is like a helper tensor that is independent of the input. So if I create a new one at each forward pass, then I also need to run assign_values(t) each time, which takes some time. On the other hand, if I am able to re-use the GPU memory, then I only need to call assign_values(t) once.
Anyway, I figured out how to do this in a way similar to Torch: just create a buffer tensor t in init(), then in the forward pass wrap it with autograd.Variable.
Actually, in my case register_buffer is not necessarily needed, since my buffer tensor is not persistent.
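A minimal sketch of the pattern described above. The names (BufferedModule, assign_values, the fill value) are illustrative, not from the thread. Note that this thread predates PyTorch 0.4, where Variable was merged into Tensor; in current PyTorch no Variable wrapping is needed, so the sketch uses plain tensors. A plain attribute is used instead of register_buffer, matching the non-persistent case:

```python
import torch
import torch.nn as nn

class BufferedModule(nn.Module):
    """Buffer-reuse pattern: t is allocated and filled once in __init__
    and the same storage is re-used on every forward pass."""

    def __init__(self, size):
        super().__init__()
        t = torch.empty(size)
        self.assign_values(t)           # one-time fill of the helper tensor
        self.t = t                      # plain attribute: not saved in state_dict
        # self.register_buffer("t", t)  # use this instead if persistence is wanted

    @staticmethod
    def assign_values(t):
        t.fill_(1.0)                    # stand-in for the (expensive) assignment

    def forward(self, x):
        return x + self.t               # same storage re-used every call
```

Because t is created once, assign_values runs only in init(); every forward call reads the already-filled tensor, which can be verified by checking that t.data_ptr() never changes across calls.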
Anyway, thanks for your time:)
PyTorch comes with a built-in caching allocator for CUDA, so memory allocations are automatically re-used internally without any extra work on your part.
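One way to observe this caching behavior (a sketch, not guaranteed in every situation: the allocator may split or round blocks): free a CUDA tensor, allocate another one of the same size, and compare the raw device pointers. Freed memory goes back to PyTorch's cache rather than being returned via cudaFree, so the second allocation typically gets the same block back:

```python
import torch

def show_allocator_reuse():
    """Return True if a same-size re-allocation got the cached block back,
    None on CPU-only machines where there is nothing to demonstrate."""
    if not torch.cuda.is_available():
        return None
    a = torch.empty(1024, 1024, device="cuda")
    ptr = a.data_ptr()
    del a                                   # goes back to the cache, not cudaFree
    b = torch.empty(1024, 1024, device="cuda")
    return b.data_ptr() == ptr              # usually True: cached block re-used
```

This is also why nvidia-smi keeps showing memory as "used" after tensors are deleted: the caching allocator holds on to the blocks for re-use (torch.cuda.empty_cache() releases them).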
Great! Good to know, thanks
Can you explain in more detail (at the C++ level) how the automatic re-use works?