Should DataLoader workers move examples directly to the GPU, or should that be handled by the main process?
As I understand it, when `num_workers > 0` the DataLoader uses worker processes to call `__getitem__` and prepare the next batch while the main process runs a training step on the current batch (please correct me if that's wrong). In my `__getitem__` I load the data (images, in my case), do some preprocessing, and put the result into a PyTorch tensor.

Should that tensor be moved onto the GPU inside `__getitem__`? I suspect this could cause trouble, since the number of these tensors slowly grows as new data is being prepared. But if the allocation is handled correctly behind the scenes, I could also see it being fine, with the tensors simply filling the space reserved for the next batch. So: should I be moving things to the GPU in `__getitem__`, and therefore in the worker process? Or should I wait and have the main process move the whole batch at once? Or am I misunderstanding something about the whole procedure? Thank you!
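To make the setup concrete, here is a stripped-down sketch of what I'm describing. The dataset class, sizes, and the random arrays standing in for decoded images are all placeholders, not my real code; the second option shown in the loop (moving the batch in the main process) is the alternative I'm asking about:

```python
import torch
from torch.utils.data import Dataset, DataLoader

class MyImageDataset(Dataset):
    """Placeholder dataset: random tensors stand in for loaded, preprocessed images."""
    def __init__(self, n_items=8):
        self.n_items = n_items

    def __len__(self):
        return self.n_items

    def __getitem__(self, idx):
        # In my real code this loads an image from disk and preprocesses it.
        img = torch.rand(3, 32, 32)  # stays a CPU tensor here
        label = idx % 2
        # The question: should this instead be `img.to("cuda")` right here,
        # inside the worker process?
        return img, label

# num_workers=0 here so the sketch runs anywhere; in real training it would be > 0
# so that workers prepare the next batch during the training step.
loader = DataLoader(MyImageDataset(), batch_size=4, num_workers=0)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
for imgs, labels in loader:
    # The alternative: move the whole collated batch in the main process.
    imgs = imgs.to(device)
    labels = labels.to(device)
    # ... training step on (imgs, labels) would go here ...
```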