CUDA initialization error when using DataLoader with CUDA tensors

The easiest way to preload all data on the GPU is to copy it there with Tensor.cuda() and keep a Python list of all the samples you want to process. Then, instead of iterating over a dataset, you iterate over that list of CUDA tensors, which already live on the GPU. Multiprocessing data loaders exist for two reasons: 1) datasets are typically much larger than a single GPU can hold in memory, and 2) a single CPU process usually cannot preprocess enough examples per second to saturate GPU throughput.
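As a rough sketch of the preloading idea (not a definitive recipe), the snippet below copies a small dataset to the device once and then batches straight from a plain Python list, with no DataLoader workers involved. The toy data, the names `cpu_samples`, `gpu_samples`, and `batches`, and the batch size are all made up for illustration, and it uses `.to(device)` rather than `.cuda()` only so it also runs on a machine without a GPU.

```python
import torch

# Fall back to CPU so the sketch still runs without a GPU (assumption for portability).
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical toy dataset: 8 (input, label) pairs created on the CPU.
cpu_samples = [(torch.randn(3), torch.tensor(i % 2)) for i in range(8)]

# Copy every sample to the device once, up front, in the main process.
gpu_samples = [(x.to(device), y.to(device)) for x, y in cpu_samples]

# Build batches by slicing the list; the tensors are already on the device,
# so no worker processes (and no CUDA use inside workers) are needed.
batch_size = 4
batches = []
for start in range(0, len(gpu_samples), batch_size):
    chunk = gpu_samples[start:start + batch_size]
    inputs = torch.stack([x for x, _ in chunk])
    labels = torch.stack([y for _, y in chunk])
    batches.append((inputs, labels))

for inputs, labels in batches:
    print(inputs.shape, labels.shape, inputs.device)
```

This only makes sense when the whole dataset fits in GPU memory; otherwise a regular DataLoader that yields CPU tensors, followed by a copy to the GPU inside the training loop, is the usual route.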
