I've noticed that in some situations a smaller batch size decreases error, but I want to minimize the I/O overhead of an epoch with many iterations.
It seems that the `DataLoader(..., pin_memory=True)` option is designed to speed up batch loading. What if I know that most of my train/test datasets will fit on the device? Is there a way to cache the entire dataset on the GPU, so that I can tune the batch size as needed without suffering increased I/O overhead for small batches?
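
Concretely, is something like the following sketch workable? (The tensors and shapes here are placeholders for my real data; I just want to confirm the general approach of moving everything to the GPU once and batching from there.)

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

device = torch.device("cuda")

# Placeholder tensors standing in for my real train set.
X = torch.randn(50_000, 3, 32, 32)
y = torch.randint(0, 10, (50_000,))

# Move the whole dataset to the GPU once, up front.
X_gpu, y_gpu = X.to(device), y.to(device)
dataset = TensorDataset(X_gpu, y_gpu)

# num_workers must stay 0 and pin_memory False, since the tensors
# already live on the GPU (pinning only applies to CPU memory, and
# worker subprocesses can't hand out CUDA tensors).
loader = DataLoader(dataset, batch_size=64, shuffle=True,
                    num_workers=0, pin_memory=False)

for xb, yb in loader:
    # xb, yb are already on the GPU; no host-to-device copy per batch.
    ...
```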