Can PyTorch move a tensor together with its computational graph from GPU to CPU, and then move it back to the GPU for backpropagation? For instance, `a` originally lives on GPU 0, and after computing with `b` we get `c`. I then take `c.sum()` and move `c` to the CPU to free up memory. Next, I move another tensor `d` from the CPU to GPU 0 and continue computing on GPU 0, combining `c.sum()` with `d` to get `e`. When backpropagation starting from `e.sum()` reaches `c`, I move `d` back to the CPU to free up space, move `c` back to GPU 0, and continue backpropagating. Can PyTorch do this? It would be a workaround for memory constraints.
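The workflow above can be sketched as follows. This is a minimal sketch, not the exact code in question: the names `a`, `b`, `c`, `d`, `e` follow the description, and it falls back to the CPU when no GPU is available so it runs anywhere.

```python
import torch

# Use GPU 0 if present; otherwise the sketch still runs on the CPU.
dev = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

a = torch.randn(4, 4, device=dev, requires_grad=True)
b = torch.randn(4, 4, device=dev)
c = a * b                      # computed on GPU 0
s = c.sum()
c = c.to("cpu")                # .to() is differentiable, so the graph survives
d = torch.randn(4, 4).to(dev)  # bring d onto GPU 0
e = s + d.sum()
e.sum().backward()             # gradients flow back through the device move
assert torch.allclose(a.grad, b)  # d(e)/d(a) = b
```

Note that moving `c` with `.to()` alone does not release the memory of the tensors autograd saved for backward (here `a` and `b`), which is the limitation the reply points out.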
The `to()` operation is differentiable, but it won't move the intermediate tensors saved for backward to the target device. To save memory via CPU offloading, you might want to use these hooks.
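The original link behind "these hooks" is not reproduced here; assuming it refers to PyTorch's saved-tensor hooks (`torch.autograd.graph.saved_tensors_hooks`), a minimal sketch of CPU offloading with them looks like this:

```python
import torch

# Use GPU 0 if present; otherwise the sketch still runs on the CPU.
dev = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

def pack(t):
    # Called when autograd saves a tensor for backward:
    # store it on the CPU instead of the compute device.
    return t.to("cpu")

def unpack(t):
    # Called when backward needs the saved tensor:
    # copy it back to the compute device.
    return t.to(dev)

a = torch.randn(8, 8, device=dev, requires_grad=True)
b = torch.randn(8, 8, device=dev)

with torch.autograd.graph.saved_tensors_hooks(pack, unpack):
    c = a * b          # a and b are saved for backward via pack()
e = c.sum()
e.backward()           # unpack() moves them back during backprop
assert torch.allclose(a.grad, b)
```

PyTorch also ships a ready-made context manager for this pattern, `torch.autograd.graph.save_on_cpu()`, which can additionally use pinned memory to speed up the device transfers.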
Thank you, this looks very useful. I will apply it to my code.