If you load your samples in the Dataset on the CPU and push them to the GPU during training, you can speed up the host-to-device transfer by enabling pin_memory. This lets your DataLoader allocate the samples in page-locked (pinned) memory, which speeds up the transfer.
You can find more information on the NVIDIA blog.
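
Here is a minimal sketch of what that could look like. The toy tensors, shapes, and batch size are made up for illustration, and the `non_blocking=True` argument is an addition of mine that is often paired with pinned memory; it is not required for `pin_memory` to work.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy CPU-resident dataset (hypothetical shapes, for illustration only).
features = torch.randn(256, 16)
labels = torch.randint(0, 2, (256,))
dataset = TensorDataset(features, labels)

use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")

# Pinned memory only helps for copies to a GPU, so enable it only when one is present.
loader = DataLoader(dataset, batch_size=32, shuffle=True, pin_memory=use_cuda)

n_seen = 0
for x, y in loader:
    # With pinned source memory, non_blocking=True allows the copy to run asynchronously.
    x = x.to(device, non_blocking=True)
    y = y.to(device, non_blocking=True)
    n_seen += x.shape[0]
```

On a machine without a GPU this just iterates on the CPU, so any speedup should be measured on your actual setup.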