How to use DataLoader pin_memory when data is a mix of CPU and GPU

jj0mst · April 30, 2025, 10:05am

Hello,

I am writing a multi-worker data loading pipeline. My inputs are high-resolution images that I decode directly on the GPU, while my targets are ordinary labels that I load into CPU memory and move to the GPU only when they are accessed by the main (trainer) process.

My problem is that I cannot use pin_memory with my DataLoader, since it also tries to pin the GPU data instead of skipping it, and throws this error:
RuntimeError: cannot pin 'torch.cuda.ByteTensor' only dense CPU tensors can be pinned

Does anyone know of a workaround for this? I would like to use pinned memory to speed up the CPU-to-GPU data transfer, but I have not yet found a simple solution.

ptrblck · April 30, 2025, 3:01pm

You could explicitly create the target tensor using pinned memory or move it as described in this tutorial.
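
A minimal sketch of the first suggestion, assuming the DataLoader is created with pin_memory=False so it never touches the GPU images, and that only the CPU target is pinned by hand in the main process. The helper name move_target is hypothetical, not a PyTorch API. Note that pin_memory() itself copies the tensor into page-locked memory, so any gain would come mainly from the asynchronous copy overlapping with other work.

```python
import torch


def move_target(target: torch.Tensor, device: torch.device) -> torch.Tensor:
    """Hypothetical helper: pin a CPU target if needed, then copy it to `device`."""
    # Pinning only makes sense for a CUDA destination; skip it otherwise.
    if device.type == "cuda" and not target.is_pinned():
        target = target.pin_memory()
    # non_blocking=True can only make the copy asynchronous when the source is pinned.
    return target.to(device, non_blocking=True)


# Falls back to the CPU so the sketch also runs on a machine without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
labels = torch.tensor([3, 1, 4])  # stand-in for a batch of CPU labels
moved = move_target(labels, device)
```

In a training loop this would be called once per batch on the target only, while the images that were already decoded on the GPU are used as they are.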