What's the proper way to push all data to the GPU and then take small batches during training?

I’m working with the MNIST dataset, which is small enough to fit entirely in my GPU’s memory.
Naturally, I’d like to push everything to the GPU before starting to train my model to make things faster. However, even though I want everything on the GPU, I’d still like to train on small batches to ensure good generalization. What’s the nicest way of doing this? As of now, this is what I do:

apply_transform = transforms.Compose([transforms.ToTensor(),
                                      transforms.Normalize((0.1307,), (0.3081,))])
train_dataset = datasets.MNIST(data_dir, train=True, download=True, transform=apply_transform)
self.train_loader = DataLoader(data.DatasetSplit(train_dataset, data_idxs),
                               batch_size=60000, shuffle=True)
# load the entire dataset onto the GPU, then rewrap it in a loader with the desired batch size
images, labels = next(iter(self.train_loader))
images, labels = images.to(args.device), labels.to(args.device)
self.train_loader = DataLoader(torch.utils.data.TensorDataset(images, labels),
                               batch_size=args.local_bs, shuffle=True)

So briefly: first I wrap the dataset in a loader whose batch size is the size of the entire dataset. I then fetch the tensors from this loader and push them to the GPU. Finally, I build a TensorDataset from those tensors and wrap it in another loader with my desired batch size.
Is there a nicer way of doing this? Preferably without meddling with data loaders at all, since profiling my code shows the data loaders are the bottleneck (possibly because I’m using them incorrectly).

For MNIST, the dataset already stores everything in Tensors, so you can grab the ds.data and ds.targets Tensors from the torchvision MNIST dataset directly and stick them into your TensorDataset.

Best regards
