Is it more efficient to load all tensors to the GPU first or to do it batch-wise?

Thank you for the reply, @SnowWalkerJ. :slightly_smiling_face:

My thoughts were the same, but I’m running into an issue: DataLoader returns non-CUDA tensors, which made me think that’s not how it was meant to be used. Have you faced this issue?
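
For reference, here is a minimal sketch of the batch-wise pattern as I understand it, assuming the dataset holds CPU tensors. The toy data, shapes, and variable names are made up for illustration and are not from my actual code. The DataLoader yields batches on the same device the dataset tensors live on (CPU here), and each batch is moved to the GPU inside the loop:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Fall back to CPU so the sketch also runs on a machine without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical toy data, created on the CPU.
features = torch.randn(64, 10)
labels = torch.randint(0, 2, (64,))
dataset = TensorDataset(features, labels)

# Pinned (page-locked) host memory only matters when copying to a GPU.
loader = DataLoader(
    dataset,
    batch_size=16,
    shuffle=True,
    pin_memory=torch.cuda.is_available(),
)

batch_devices = []
for x, y in loader:
    source_device = x.device  # batches come out on the CPU
    # Move each batch to the target device just before using it.
    x = x.to(device, non_blocking=True)
    y = y.to(device, non_blocking=True)
    batch_devices.append((source_device.type, x.device.type))
```

If the whole dataset fits in GPU memory, the alternative would be to call `.to(device)` on `features` and `labels` once before building the `TensorDataset`, in which case I would expect the batches to already be on the GPU (with the default `num_workers=0`).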