Deleting a deep copy from CPU memory

I’m trying to load a large tensor from a file and save its slices as individual `.pt` files. Since views share the same storage as the original tensor, simply iterating over the tensor and saving each slice doesn’t work (that serializes the whole underlying tensor on every iteration), so I have to make a deep copy of the slice on each iteration. Eventually these copies fill up my RAM and the machine crashes.
I tried deleting the deep copies at the end of each iteration with `del`, but that didn’t help. So how do you delete a deep copy?
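The storage sharing mentioned above can be checked directly. A minimal sketch with a toy tensor (not my actual data; assumes PyTorch 2.x for `untyped_storage()`):

```python
import torch

t = torch.randn(4, 1000)
view = t[0]                    # a view: no data is copied
copy = t[0].detach().clone()   # a deep copy with its own storage

# The view aliases the original tensor's memory...
assert view.untyped_storage().data_ptr() == t.untyped_storage().data_ptr()
# ...while the clone owns a separate, slice-sized buffer.
assert copy.untyped_storage().data_ptr() != t.untyped_storage().data_ptr()
assert copy.untyped_storage().nbytes() == copy.numel() * copy.element_size()
```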

            tensor = torch.load(os.path.join(inp_dir, vid_tensor))
            logging.info(f"\n\nSuccessfully loaded label: {label}")
            logging.info(f"Label shape is: {tensor.shape}\n\n")

            for idx, chunk in enumerate(tensor):
                chunk_name = os.path.join(label_output_dir, f"{label}_{idx}.pt")
                if not os.path.exists(chunk_name):
                    slice = chunk.detach().clone()
                    torch.save(slice, chunk_name)
                    logging.info(f"Successfully saved {chunk_name}")
                    logging.info(f"chunk has shape {slice.shape}")
                    del slice  # doesn't work: RAM usage still grows
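To illustrate what “saves the whole tensor” means in practice, here is a small sketch (toy tensor and temporary files, not my actual workload) comparing the on-disk size of a saved view versus a saved clone:

```python
import os
import tempfile

import torch

t = torch.randn(4, 100000)

with tempfile.TemporaryDirectory() as d:
    view_path = os.path.join(d, "view.pt")
    clone_path = os.path.join(d, "clone.pt")
    torch.save(t[0], view_path)                    # view: serializes the full storage
    torch.save(t[0].detach().clone(), clone_path)  # clone: only the one slice
    view_size = os.path.getsize(view_path)
    clone_size = os.path.getsize(clone_path)

# The view's file holds all 4 * 100000 floats, the clone's only 100000.
assert view_size > 3 * clone_size
```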

I tried to repro this with a minimal example:

    import torch

    tensor = torch.randn(32768, 32768)
    for i in range(tensor.size(1)):
        slice = tensor[:, i].detach().clone()
        torch.save(slice, 'asdf')

But I only see constant memory usage of ~4 GiB here, which is what we would expect for 2**30 elements at 4 bytes per element.
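For what it’s worth, `del` does release a clone as long as nothing else references it; once the last reference is gone, the tensor object (and its storage) is freed by Python’s reference counting. A minimal check with `weakref` (small example tensor, not the original workload):

```python
import gc
import weakref

import torch

t = torch.randn(8, 8)
copy = t[0].detach().clone()
ref = weakref.ref(copy)

del copy      # drops the only reference to the clone
gc.collect()  # explicit, though refcounting alone frees it here

assert ref() is None  # the clone has been garbage-collected
```

So if RAM still grows in your loop, something else is likely holding on to the slices (or the loaded tensors), rather than `del` failing.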