Tensor chunking

Is there an easy way to perform tensor chunking, similar to torch.chunk, but have it return separate tensors instead of views of the same tensor?

I want the individual chunks to have separate memory allocations, since I wish to offload some specific chunks onto the CPU.

Is there an efficient way to do this without calling clone() on every chunk?
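
For context, a minimal sketch of what I mean (the shapes, chunk count, and device check are arbitrary; untyped_storage() assumes a reasonably recent PyTorch, older versions spell it storage()):

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.randn(8, 1024, device=device)
chunks = torch.chunk(x, 4, dim=0)

# Every chunk is a view into x's storage rather than an independent tensor,
# so none of the chunks owns its own allocation.
print(all(c.untyped_storage().data_ptr() == x.untyped_storage().data_ptr()
          for c in chunks))   # True
```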

Wouldn’t a to("cpu") on the corresponding chunk just work? This should copy the chunk to the CPU without cloning it on the GPU beforehand.
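
A minimal sketch of that suggestion (sizes are arbitrary; the device check is only there so the snippet also runs without a GPU):

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.randn(8, 1024, device=device)
chunks = torch.chunk(x, 4, dim=0)

# Copies chunk 0's data to host memory directly from the view;
# no intermediate GPU-side clone of the chunk is created.
cpu_chunk = chunks[0].to("cpu")
print(cpu_chunk.device)    # cpu
print(x.device)            # still the original device
```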

Yes, but that alone won’t offload it from the GPU; I want to free the corresponding GPU memory. So chunks[0].to("cpu") will essentially copy chunks[0]'s data to the CPU, but the original tensor that all the chunks are views of still remains on the GPU, right?
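
A quick check of that reasoning, assuming a CUDA device is available (the sizes, and the empty_cache() call at the end, are only illustrative):

```python
import torch

x = torch.randn(8, 1024, device="cuda")
chunks = torch.chunk(x, 4, dim=0)
before = torch.cuda.memory_allocated()

cpu_chunk = chunks[0].to("cpu")   # host copy of chunk 0
after = torch.cuda.memory_allocated()

# x's full storage (which backs every chunk) is still resident on the GPU,
# so the number of allocated bytes does not drop after the copy.
print(before == after)            # True

# The memory only becomes reusable once every reference to that storage
# (the original tensor and all GPU-side chunk views) is gone.
del x, chunks
torch.cuda.empty_cache()          # optional: return cached blocks to the driver
print(torch.cuda.memory_allocated())
```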