Copy from GPU to CPU (C++)

Hi everyone,

I did some testing on copying a tensor from GPU to CPU (the `.cpu()` method), and it seems that PyTorch allocates a new chunk of "CPU" memory (RAM) every time `.cpu()` is executed.

Is there a way to copy a "GPU" tensor into a preallocated buffer in RAM (CPU side) using C++? I would like to avoid allocating new memory for each "GPU to CPU" copy operation.


I don’t think it’s possible with the current Python API.

Thanks for replying, Klory :slight_smile: My question is whether it can be done in C++, though.

You could use `cputensor.copy_(gputensor)` (both in Python and C++).

Best regards



Excellent. Just what I needed.

Thanks, Thomas :slight_smile: