Is there any difference between transferring tensors to the GPU in __getitem__ versus in the training loop?

Which practice is considered good?

  1. Calling .cuda() on the features and labels inside __getitem__(self, idx) before returning them, or

  2. In the training loop:
    features, labels = batch
    features, labels = features.cuda(), labels.cuda()

I would recommend the second approach, as it ensures the complete batch is first created via multiprocessing and then transferred to the device.
The first approach might work, but you could easily run into multiprocessing errors if each worker tries to create a new CUDA context.
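The recommended pattern can be sketched as follows (a minimal example with made-up data; `ToyDataset` and its sizes are assumptions for illustration, and the device falls back to CPU when CUDA is unavailable):

```python
import torch
from torch.utils.data import Dataset, DataLoader

class ToyDataset(Dataset):
    def __init__(self, n=8):
        # Dummy data for illustration only.
        self.features = torch.randn(n, 4)
        self.labels = torch.randint(0, 2, (n,))

    def __len__(self):
        return len(self.features)

    def __getitem__(self, idx):
        # Return plain CPU tensors; DataLoader workers can then
        # assemble batches in parallel without touching CUDA.
        return self.features[idx], self.labels[idx]

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
loader = DataLoader(ToyDataset(), batch_size=4, num_workers=0)

for features, labels in loader:
    # The device transfer happens here, in the main process.
    features = features.to(device, non_blocking=True)
    labels = labels.to(device, non_blocking=True)
    # ... forward pass, loss, backward, optimizer step ...
```

Using `.to(device)` instead of `.cuda()` also keeps the code runnable on CPU-only machines.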


I’m getting an error that the object doesn’t have a .cuda() method when using the second approach. From __getitem__ I’m returning a tuple of (image, label). I have successfully applied transformations and changed the dtype to float32, but it errors out saying .cuda() cannot be used on that object.

But it works totally fine with the first method.

The tuple class is a plain Python class and thus doesn’t have the cuda() method.
You would have to unwrap the tensors and call cuda() on each of them.