Is there an easy way to perform tensor chunking, similar to torch.chunk, but have it return different tensors instead of different views of the same tensor? I want separate memory references for the individual chunks, since I wish to offload some specific chunks onto the CPU. Is there an efficient way to do this without performing clone on every chunk?
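For context, a minimal sketch of the behavior in question (tensor shapes and names are just illustrative): torch.chunk returns views that share storage with the original tensor, so the only way I know to get independent chunks is to clone each one, which is what I would like to avoid.

```python
import torch

x = torch.randn(8, 4)                 # original tensor
chunks = torch.chunk(x, 4, dim=0)     # tuple of views into x

# The chunks share memory with x (chunk 0 starts at x's data pointer).
print(chunks[0].data_ptr() == x.data_ptr())        # True

# The workaround I'd like to avoid: clone every chunk to get separate memory.
independent = [c.clone() for c in chunks]
print(independent[0].data_ptr() == x.data_ptr())   # False
```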
Wouldn’t a to("cpu") on the corresponding chunk just work? This should copy the chunk to the CPU without cloning it on the GPU beforehand.
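Something like this, as a rough sketch (assuming a CUDA device is available; names are illustrative):

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.randn(1024, 1024, device=device)
chunks = torch.chunk(x, 4, dim=0)

# .to("cpu") copies the chunk's data to host memory directly;
# no intermediate clone is created on the GPU.
cpu_chunk = chunks[0].to("cpu")
print(cpu_chunk.device)  # cpu
```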
Yes, but that won’t actually remove it from the GPU, and I want to free the corresponding GPU memory. So chunks[0].to(cpu) will essentially copy the data of chunks[0] to the CPU, but the original tensor as a whole (the one chunks are views of) still remains on the GPU, right?
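To illustrate the concern, a sketch assuming a CUDA device (the exact numbers will differ; the pattern is the point): copying a chunk to the CPU does not shrink the GPU allocation, because the chunk is only a view, and the memory is released only once nothing references the original storage anymore.

```python
import torch

assert torch.cuda.is_available(), "this sketch demonstrates GPU memory behavior"

x = torch.randn(1024, 1024, device="cuda")
chunks = torch.chunk(x, 4, dim=0)
print(torch.cuda.memory_allocated())   # the full tensor is resident on the GPU

cpu_chunk = chunks[0].to("cpu")        # copies chunk 0 to host memory
print(torch.cuda.memory_allocated())   # unchanged: x (and its views) still hold the storage

# Only once no tensor references the storage is the GPU memory actually freed.
del x, chunks
torch.cuda.empty_cache()               # returns cached blocks to the allocator/driver
print(torch.cuda.memory_allocated())   # drops, assuming nothing else holds a reference
```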